Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, September 6, 2018

What Percentage of the World's Records Have Been Digitized?


It has been a few years since I wrote about the percentage of the world's records that have been digitized. But I have become far more aware of what is involved in analyzing the remaining records that still need to be digitized because of our work at the Maryland State Archives. The number of records now online is truly phenomenal. But the number left to be digitized is astronomical.

There are huge record repositories here in the Washington, D.C. area, such as the Library of Congress and the National Archives. In addition, close to where I am presently living, there is the Maryland State Archives and, in Richmond, the Library of Virginia. If you are standing in a huge library or archive, the number of records can seem overwhelming. Likewise, when you go online and see websites claiming to have billions of records, it might seem like those numbers alone indicate the extent to which all of the records that exist are on their way to being digitized. I guess my analogy would be that if you were standing on the edge of the Arctic Ocean, you might think the whole ocean was covered with ice, but if you stepped back to a satellite view, you would see that ice covers only a small percentage of the world's oceans. Talking about the digitization of historical records is like that. When you are searching online, the number of records seems endless until you see what is left to digitize in just a few large libraries or archives.

Determining the total number of genealogically significant records that exist in the world is likely impossible. But it is easy to see that many billions of records are still not digitized by looking at the percentages of records digitized from just the Library of Congress and the National Archives. The Library of Congress states the following on its website.
The Library of Congress is the largest library in the world, with more than 167 million items on approximately 838 miles of bookshelves. The collections include more than 39 million books and other printed materials, 3.6 million recordings, 14.8 million photographs, 5.5 million maps, 8.1 million pieces of sheet music and 72 million manuscripts.
But if you look carefully at the Digital Collections, you will see that only a vanishingly small percentage of the total number of items in the Library has been digitized. As far as the National Archives collections are concerned, even the National Archives does not know the number of individual records it holds, because it measures most of its holdings by volume. See Statistical Summary of Records Holdings of the National Archives of the United States. Here is what I mean, quoting from the Statistical Summary page.
The quantity of records in the custody of each unit.

  • The quantity of paper-formatted textual records is expressed in cubic feet only.
  • The quantity of microfilmed textual records is expressed both in cubic feet and in items (number of microfilm rolls, according to size and polarity; number of microfiche cards).
  • The quantity of nontextual records is expressed both in cubic feet and in items appropriate to each medium.
  • The quantity of artifacts is expressed both in cubic feet and in items.
The digitization efforts of the National Archives are summarized on this webpage, Strategy for Digitizing Archival Materials for Public Access, 2015-2024. It appears from that webpage that the National Archives has about 2 billion records digitized. This number is less than the number of records on any one of the large online genealogy database websites. Presently, FamilySearch does not have an active digitization project at the National Archives.

These two examples show that any idea that "everything has been digitized" is woefully wrong.



