Revision History for the PLDF3 Dataset
The FY 2016 IMLS report had 9,252 observations. Of those, 18 had STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure) and were dropped from PLDF3. PLDF3, then, has 9,234 observations added from the FY 2016 data. The total number of observations (one library's data for one year) in PLDF3 is 266,746.
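The drop rule and the FY 2016 arithmetic can be sketched as follows. This is only an illustration, not the code used to build PLDF3: the record layout, the FSCSKEY values, and the "00" code standing in for any status other than ‘03’ or ‘23’ are assumptions made for the example.

```python
# STATSTRU codes excluded from PLDF3, as described in this history.
EXCLUDED_STATSTRU = {"03", "23"}  # 03 = Closed, 23 = Temporary Closure


def keep_open(records):
    """Return only the records whose STATSTRU is not an excluded code."""
    return [r for r in records if r["STATSTRU"] not in EXCLUDED_STATSTRU]


# Hypothetical miniature file; keys and the "00" code are placeholders.
sample = [
    {"FSCSKEY": "XX0001", "STATSTRU": "00"},
    {"FSCSKEY": "XX0002", "STATSTRU": "03"},
    {"FSCSKEY": "XX0003", "STATSTRU": "23"},
    {"FSCSKEY": "XX0004", "STATSTRU": "00"},
]
kept = keep_open(sample)

# The FY 2016 figures stated above.
fy2016_added = 9252 - 18               # observations kept after the drop
fy2016_total = 257512 + fy2016_added   # prior total plus FY 2016 additions
```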
Libraries from two “outlying areas” are included: American Samoa and Guam. Puerto Rico reported no data in FY2016. It had reported steadily from FY 2008-FY 2014 but has stopped.
10,174 libraries have ever reported in this series.
There was one library with an FSCSKEY anomaly, which required a change in its newkey:
- Prairie County Library in Hazen, Arkansas reported data in 1988 and 1989, then did not report again until 2016. The first two years' data had an FSCSKEY of AR0034 while the FY 2016 data were in AR0083. The three years' data were given the newkey of AR0083. The code changes are detailed in the Arkansas Schedule of Changes.
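The idea behind a newkey can be sketched as a lookup that maps a superseded FSCSKEY to the key under which all of a library's years should fall together. This is a minimal illustration, assuming a simple dictionary of overrides; the names NEWKEY_OVERRIDES and assign_newkey are hypothetical, and the actual changes are the ones recorded in the state Schedules of Changes.

```python
# Hypothetical override table: superseded FSCSKEY -> newkey.
# The single entry reflects the Prairie County Library case described above.
NEWKEY_OVERRIDES = {"AR0034": "AR0083"}


def assign_newkey(fscskey):
    """Return the override if one exists, otherwise the FSCSKEY itself."""
    return NEWKEY_OVERRIDES.get(fscskey, fscskey)


# The three years in which the library reported, with the FSCSKEY used each year.
records = [
    {"year": 1988, "fscskey": "AR0034"},
    {"year": 1989, "fscskey": "AR0034"},
    {"year": 2016, "fscskey": "AR0083"},
]
for rec in records:
    rec["newkey"] = assign_newkey(rec["fscskey"])
```

With the override in place, all three observations carry the same newkey and can be followed as one library over time.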
There were 9,251 observations in the FY 2015 data from IMLS. 18 had STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure). They were dropped. PLDF3, then, has 9,231 observations added from the FY 2015 data. 248,281 observations (through FY 2014) + 9,231 = 257,512 observations.
Libraries from two “outlying areas” are included: American Samoa and Guam. Puerto Rico reported no data in FY2015.
10,157 libraries have ever reported in this series.
There were two libraries with FSCSKEY anomalies and changes in newkey:
- Grantsville (Utah) City Library in the FY 2015 data has an FSCSKEY of UT8003. In the FY 2014 data it is the Grantsville Public Library, which has an FSCSKEY of UT0075. The address is the same, so I assigned both the newkey UT8003. The code changes are in the Utah Schedule of Changes. I suspect the creation of this library is related to the fact that the TOOLE COUNTY BOOKMOBILE LIBRARY (UT0053) stopped reporting with the FY 2012 data. It reported each year from FY 1987 through FY 2012, listing Grantsville as the CITY where it was located.
- West Hartford, Vermont's library reported data from FY 1989 through FY 2011 using FSCSKEY VT0199. It reappeared in FY 2015 in FSCSKEY VT0222. The Schedule of Changes for Vermont has the code that created a newkey of VT0222 for the earlier years.
There are 9,305 observations in the FY2014 dataset. Ten of those observations had STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure) and were dropped. The Virgin Islands and the Marianas did not report, so those two were also dropped, leaving 9,293 libraries to be added to PLDF3. The number of observations in PLDF3 (one library's data for one year) is now 248,281.
10,144 libraries have ever reported.
58 Puerto Rican libraries reported in 2014, up from the 34 that originally reported in 2008. Total number of observations from Puerto Rico is now 358.
The FY 2013 data file from IMLS has 9,309 observations. 19 of those libraries have STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure). After changes with the libraries from outlying areas, 9,288 libraries were added to PLDF3 this year. The count of observations included in PLDF3 is now 238,988.
The raw data file for FY 2012 from IMLS has 9,305 observations (one observation per library). Of those, 11 have STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure). Other changes with libraries not reporting from outlying areas leave 9,292 libraries in the FY 2012 data in PLDF3. There are now 229,700 observations in PLDF3. There are now 10,112 libraries which have ever reported data in this series.
This year there are 252 libraries from Puerto Rico, an impressive increase in their reporting.
This series gets better and better. Thanks to IMLS for continuing it and building on its strengths.
The raw file for FY 2011 from IMLS has 9,315 observations, one for each library. Of those, 24 have STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure). They are dropped from PLDF3, leaving 9,291. The Virgin Islands and the Northern Marianas did not report and are also dropped, leaving 9,289. There are 56 libraries from "outlying areas": one from Guam and 55 from Puerto Rico, although a number of these Puerto Rican libraries have fragmentary data in this file. 9,289 + 211,119 = 220,408 observations total for the file. A total of 10,092 public library systems have data in this series.
The documentation has a note that there were 55 Puerto Rican libraries of which 35 reported (p. 3). The raw file has 61 libraries in Puerto Rico but 3 have a STATSTRU of ‘03’ or ‘23’. Therefore, the 55 libraries in the file minus the 35 the documentation says reported leaves 20 libraries not reporting but with data in the file.
That is, there are data elements included and many of them, I suspect, are supplied not by the libraries themselves but by officials of Puerto Rico from Commonwealth-wide data. These data include such things as latitude, longitude of these libraries, and so on. But in these data, we see that almost all of these libraries with the fewest reported data elements are very small, with few staff, fewer of these staff with ALA-accredited degrees, low total circulations, and most other resources are meager. It is a bit of a puzzle about what to do. I have decided to include all. For now.
I have often thought about what the obligation of the compiler of data is. I think one principle is what I have termed the Hippocratic Oath of the data compiler: first, do no harm. In the case of our colleagues in Puerto Rican libraries, we see small libraries from which some data can be gleaned, and given that this is a longitudinal file (with all the baggage that fact entails), I think the lesser of many evils is to include these libraries and await further information in the years ahead.
One minor anomaly came to light, though it is something I missed in last year's data. From 1988 to 1995 the Koyukuk Community Library in Koyukuk, Alaska reported data under the FSCSKEY AK0042. The library did not report from 1996 to 1998 but then began again; its data were published from 1999 through 2004 and in 2006 with the FSCSKEY AK0119. It did not report for 2007 and 2008, then reappeared in 2009 and 2010, back under the old FSCSKEY of AK0042. The newkey, therefore, is now AK0042 and the library's data now fall together there. The Alaska Schedule of Changes has the code changes.
The raw file from IMLS for FY 2010 has 9,308 observations. Of those, 9 have STATSTRU = ‘03’ (Closed) or ‘23’ (Temporary Closure). There are two observations made up of completely imputed data for the Marianas and for Palau. Thus, the number of observations in PLDF3 is 201,822 (the count through FY 2009) + 9,297 (9,308 - 11) or 211,119. Unlike in the past, no anomalies in earlier data were noticed in the process of adding the FY 2010 data, so this is the count. There were just a few changes necessary in newkeys, and those were all cases where a library had reported in the past and hadn't reported for a few years. When the libraries reported in FY 2010, they were assigned new FSCSKEYs. As in the past, the newkeys were made to match the current FSCSKEYs.
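The running totals from FY 2010 onward can be cross-checked with a short script. This is a sketch for verification only: the figures are the per-year additions and cumulative totals stated in this history, and the names ADDED, STATED_TOTAL, and running_totals are hypothetical.

```python
# Observations added to PLDF3 per fiscal year, as stated in this history.
ADDED = {
    2010: 9297, 2011: 9289, 2012: 9292, 2013: 9288,
    2014: 9293, 2015: 9231, 2016: 9234,
}
# Cumulative totals stated in this history after each year's data were added.
STATED_TOTAL = {
    2010: 211119, 2011: 220408, 2012: 229700, 2013: 238988,
    2014: 248281, 2015: 257512, 2016: 266746,
}
BASE_THROUGH_FY2009 = 201822


def running_totals(base, added):
    """Accumulate yearly additions on top of the base count, oldest year first."""
    totals = {}
    total = base
    for year in sorted(added):
        total += added[year]
        totals[year] = total
    return totals


computed = running_totals(BASE_THROUGH_FY2009, ADDED)
# Years where the computed total differs from the stated one (empty if none).
mismatches = {
    year: (computed[year], STATED_TOTAL[year])
    for year in computed
    if computed[year] != STATED_TOTAL[year]
}
```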
Just to take a step back, the data, their documentation, and the care with which they are compiled have continued to improve. This is a most impressive series of data and one which reflects the hard and dedicated work of many people over many years.
There are now 201,822 observations in PLDF3. The FY2009 data had 9,275 open libraries reporting data. Data from one library were dropped. It had observations in 1987, 1988, and 1989. That means the revised total for the data through FY2008 is 192,547. 192,547 + 9,275 = 201,822.
The deleted observations are discussed in the Schedule of Changes to Idaho newkeys.
The count of observations in PLDF3 through FY2008 is 183,293 + 9,257 = 192,550 with 110 variables.
The base IMLS Public Library data have 9,284 public library systems reporting. Of those, 25 are library systems which have a "Structure Change Code" or STATSTRU = "3" or "Closed." This is the first year this value has been reported for this variable and the 25 libraries that are included in the base dataset in this category are excluded from PLDF3. 9,284 - 25 = 9,259.
Of these 25, 12 are in South Dakota.
The Northern Marianas and the Virgin Islands, again, did not report data and are, again, also excluded from PLDF3. 9,259 - 2 = 9,257.
For the first time in this data series, the public libraries of Puerto Rico have reported. There are data from 35 new systems to be added to this series. Bravo!
This data series continues to show improvements on many levels. It is the best public library data series we have and it has taken dedicated work of many people over many years. Here's hoping the analytical community can do its part to help practitioners.
The count of observations for PLDF3 through FY2007 is 174,079 + 9,214 = 183,293 with 108 variables.
As discussed at 12/08/07, NEWMAN TOWNSHIP LIBRARY reported twice. Rob Winner, Illinois' PLSC Data Coordinator, confirmed that the data in FY2006 were reported twice, that the data in IL0377 (the number used since 1988) were partial-year data, and that the new number, IL0694, had the full year's data. In FY2007, the data for NEWMAN TOWNSHIP are reported in IL0694 and IL0377 does not exist. For this reason, the FY2006 data are corrected in PLDF3 and the partial data for IL0377 are dropped. The data for NEWMAN TOWNSHIP will now have the newkey at IL0694. The FY2006 data, therefore, now have 9,207 observations. As noted immediately below, the three outlying areas did not report, so PLDF3 for FY 2006 has the reported 9,211 - 1 (Newman Township at IL0377) - 3 (Guam, Northern Marianas, and the Virgin Islands) = 9,207. Thus, PLDF3 through FY2006 has 174,079 observations.
The FY2006 data have 9,211 observations. As in the recent past, three outlying areas (Guam, the Northern Marianas, and the Virgin Islands) did not report. These observations have been dropped in PLDF3, leaving this year with 9,208 observations. PLDF3 now has 174,080 observations (9,208 + 164,872) with 101 variables.
For the most part, the data show steady improvements in the sense that data from this year are more aware of data from the last year, so that anomalies caused by incorrect changes in FSCSKEYs are lower than ever. The most common such problems occurred in two states where libraries that had reported for a number of years missed FY 2005 and, when they reported again, were given a new FSCSKEY. These cases are handled by changing the newkey in all other occurrences of these libraries to the latest FSCSKEY.
IL0377 has been used since 1988 for NEWMAN TOWNSHIP LIBRARY, 108 WEST YATES STREET, in NEWMAN, ILLINOIS, with variations in the libname and address over the years. This year there is IL0694, which is at the same address and phone number. It reports data that differ slightly from IL0377. The name is also slightly different. It looks to me like the person filling out the form put data in each. I think IL0694 has later data but not in all cases.
In the process of testing the ASCII file of the data (pldf3ascii.txt) against the SAS dataset, I found two duplicates. The first was for ME0242 for 1991, for which there is a complete entry and a second that appears to be a stub entry, that is, it is incomplete. The second was for ME0256 for 1991, where both duplicates appear to be complete. The stub entry for ME0242 and one of the entries for ME0256 are deleted. The count of observations in PLDF3 is now 164,872.
My colleague, Don McMorris, found the solution to a problem that I had noticed but not understood. The 1987 data on employees were completely out of range of all other such data. Embarrassingly, he found it in the documentation for the 1987 data: the documentation makes clear that the raw data for that year did not have an explicit decimal point. I missed that when I first read in the 1987 data, so the numbers of employees in four categories were ten times the true values. That is fixed now. There is more information on this correction in the discussion of the PLDF3 variables.
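The kind of fix involved can be sketched as below. This is an illustration under an assumption: since the misread values were ten times the true ones, the sketch treats the raw field as having one implied decimal place. The function name parse_implied_decimal and the sample strings are hypothetical, and the actual field layout is the one given in the 1987 documentation.

```python
from decimal import Decimal


def parse_implied_decimal(raw, places=1):
    """Read a digit string whose last `places` digits are decimals.

    Returns None for a blank field. Assumes the field holds only digits
    and surrounding whitespace.
    """
    text = raw.strip()
    if not text:
        return None
    return Decimal(int(text)) / (Decimal(10) ** places)


# A raw field of "125" stands for 12.5 staff, not 125.
staff = parse_implied_decimal("125")
half_time = parse_implied_decimal("5")
blank = parse_implied_decimal("   ")
```

Decimal is used here rather than float so that fractional staff counts such as 0.5 are carried exactly.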
The FY 2005 dataset has 9,201 observations. However, three outlying areas (Guam, the Northern Marianas, and the Virgin Islands) did not report. These observations have been dropped in PLDF3, leaving this year with 9,198. There are 164,874 observations in PLDF3 now, with 94 variables ever reported in the dataset; however, not all variables are reported for each year.
In reading Douglas Galbi's Audiovisual Materials in U.S Public Libraries' Collections, I noted that he reported he had trouble reading in the ASCII file of the PLDF3 data. He pointed out that the documentation of that file differed from the output program. He was right and I appreciate his reporting that problem.
At first I thought it was merely a documentation error but in checking, I found another kind of error that I had missed. In five cases, four of them dealing with staffing (MASTER, LIBRARIA, OTHPAID, and TOTSTAFF), the output program had not output numbers with explicit decimal points but, rather, had rounded the numbers. The fifth (PSUNDUP) is only reported from 1987-1989 in these data. This error is not reflected in the spreadsheets, only the ASCII file. The documentation and the ASCII file are updated today, September 7, 2007. This update is an interim one because we have been led to expect the FY 2005 data will be published by NCES in October.
I believe that, if anyone used this ASCII file to analyze staffing at US public libraries, the effect of this error would be greatest in the very smallest libraries. That is, if a library had .25 total staff, the ASCII file recorded 0. If it had 100.25, the file reported 100 so this rounding would appear to affect smaller libraries more than the larger ones.
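The uneven effect of the rounding can be shown with a small calculation of relative error. This is a sketch assuming the output program rounded to the nearest whole number; the function name relative_rounding_error is hypothetical.

```python
def relative_rounding_error(true_fte):
    """Share of the true value lost or gained when it is rounded to an integer.

    Assumes true_fte is greater than zero.
    """
    rounded = round(true_fte)
    return abs(rounded - true_fte) / true_fte


# The two cases mentioned above.
small = relative_rounding_error(0.25)    # 0.25 is recorded as 0: the whole value is lost
large = relative_rounding_error(100.25)  # 100.25 is recorded as 100: about a quarter of one percent
```

The absolute error is the same 0.25 in both cases, but as a share of the library's staff it is total for the small library and negligible for the large one.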
Those interested in analysis of library data will find Mr. Galbi's other work of interest. See: Library and Library Use for a list of these articles on his Web page.
The Virgin Islands did not report in FY 2004 so it has been dropped from PLDF3.
The Northern Marianas did not report in 2001, 2002, or 2004, and those observations were also dropped from PLDF3.
Guam did not report in 2002 or 2004, so these observations were dropped from PLDF3.
The number of observations in PLDF3 is now 155,676.
Dropping these observations has affected the counts of observations for each year. There are now 9,207 observations in FY 2004 instead of 9,210.
The FY 2004 data had 9,210 observations. That number plus the revised number of observations of 146,472 means that PLDF3 currently has 155,682 observations. Breakdowns by state and year of these observations are available in a spreadsheet.
Price City Library, Price, Utah (newkey = UT0017) had duplicate entries for 1991. They differed only in the name. One was Price City Library and the other was 'DAGGETT CO. BOOKMO'; the latter was deleted.
The number of observations for PLDF3 through FY 2003 is now 146,472.
In adding the FY2004 NCES public library data to PLDF3, three duplicate entries were discovered:
- In 1988 there were two entries in PLDF3 for (newkey) NH0237 (Acworth Silsby) that were identical so one was dropped. In the original 1988 data file, these data appeared in two different FSCSKEYs: NH0001 and NH0002. This duplication was missed before and one was dropped.
- In 1990, there were two entries in PLDF3 for NH0230 (Milan Dummer). This duplication appears to have occurred because a stub entry was created in NH0229 that had basic information on address and so forth but no data. The full entry for that year was in NH0230. The NH0229 was dropped. The data for 1989 were also in NH0229. There are no data for this library in 1988.
- A complicated duplicate was discovered for data from Indianola Public Library in Nebraska (NE0230). Its resolution is discussed in the Nebraska Schedule of Changes.
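Duplicates of this kind can be found by grouping observations on newkey and year and keeping the most complete record in each group. The following is only a sketch of that idea, not the procedure actually used: the function names are hypothetical, the totstaff values are invented for illustration, and keeping the first copy of two identical records is an arbitrary choice.

```python
from collections import defaultdict


def completeness(record):
    """Count fields that carry a value (not None and not an empty string)."""
    return sum(1 for value in record.values() if value not in (None, ""))


def drop_duplicates(records):
    """Keep one record per (newkey, year): the one with the most filled fields."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["newkey"], record["year"])].append(record)
    # max() returns the first record among ties, so identical copies keep the first.
    return [max(group, key=completeness) for group in groups.values()]


# Invented values for illustration only; keys follow the cases listed above.
sample = [
    {"newkey": "NH0230", "year": 1990, "fscskey": "NH0229", "totstaff": None},
    {"newkey": "NH0230", "year": 1990, "fscskey": "NH0230", "totstaff": 1.5},
    {"newkey": "NH0237", "year": 1988, "fscskey": "NH0001", "totstaff": 0.5},
    {"newkey": "NH0237", "year": 1988, "fscskey": "NH0002", "totstaff": 0.5},
]
deduped = drop_duplicates(sample)
```

A completeness count handles stub entries well, but cases where both copies are complete and differ, like the Nebraska one, still need a manual decision.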
There are now 146,473 observations in PLDF3 for FY2003.
The FY 2003 dataset had 9,214 observations. That, plus the 137,262 from 1987-2002 gives us the current 146,476.
Palau has entries with no data in 2001 and 2002. These have been dropped. The count of observations is now 137,262.
One library was dropped from the 2002 NCES data, IL8023. This change is discussed with changes from Illinois. The number of observations in the PLDF3 dataset is 137,264. This includes the 9,140 observations added in 2002 with the 128,124 from 1987-2001.
Minnesota did not report data in 2001 but NCES published records with information on the libraries. After the imputations were removed, these became dummy records and the 140 from 2001 were discarded from PLDF3. The number of observations is now at 128,124. No Minnesota libraries will have a span of 'A' because it seems that the data for this year are lost for Minnesota public libraries.
Indiana's zipcodes in 2001 were incorrect in the original publication of these data. NCES has updated the dataset and the various datasets here reflect the change.
There were duplicate records for the Nebraska towns of Alma (FSCSKEYs NE0006 and NE0258*), Fremont (NE0090 and NE0259*), and Genoa (NE0094 and NE0260*) in 1988. In addition, Creighton (NE0062 and NE9014*) was duplicated in 2000. The asterisks indicate the observations that are deleted in PLDF3. More details on these four cases are found in the Nebraska Schedule of Changes.
There were two records for Walhalla, North Dakota in 1990, one in ND0086 and the other in ND0095. The record in ND0086 was mostly blank and appears to have been a stub record that was not filled in. ND0095 had data in it. ND0086 for 1990 was deleted in PLDF3. The number of observations is now 128,264.
In Kansas in 1991, there are 17 dummy records (KS0323 through KS0339). They are deleted in PLDF3 bringing the number of observations down to 128,269.
There are two sets of data for Elmira, New York for 1988. One has an FSCSKEY of NY0100 and the other NY0101. In comparing the two, it appears that the data in NY0100 are incomplete and appear to have been a mistake. For instance, in 1987 and 1989, there are four branch libraries while in the NY0100 data, there are 0. In fact, there are many 0s in this set of data. The years 1987, 1989-1990 have an FSCSKEY of NY0101 and the data behave reasonably year to year if the 1988 NY0101 data are used. The NY0100 data are deleted in PLDF3. In 1991, the FSCSKEY for this library was changed to NY0765 and its newkey for all years is now NY0765.
The count of observations is now 128,286.
The data for Wedsworth Memorial Library in Cascade, Montana for 1991 were duplicated. The second copy has the LIBNAME of Liberty County Library and the ADDRESS of that library, but all other items are identical to Wedsworth.
The data for Brownsville Free Public Library, Pennsylvania, for 1990 were duplicated and this duplication is continued through PLDF2. It is FSCSKEY PA0001.
July 23, 2018