Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, February 14, 2019

Who are the holdouts to digitization?



Digitization of valuable genealogical information has been engulfing us like a gigantic wave. But it always helps to step back and think about the vast number of records still locked up in paper documents. The number of documents waiting to be digitized probably exceeds by many times the number of useful digital images that are already online. After spending time digitizing records at the Maryland State Archives and visiting both the National Archives and Library of Congress, I have an enhanced appreciation for the number of documents that remain unavailable online in digital format.

What remains to be digitized? The list could go on indefinitely. But there are some notable holdouts. I am sure that the same situation exists in most, if not all, of the countries of the world, but my focus in this post is on the United States.

First of all, we have our own U.S. National Archives. I could ask what would seem to be a valid question about the number of records presently in the U.S. National Archives, but persistent research online has shown me that apparently no one has any idea of the actual number. Estimates begin at about 12 billion, but the actual number is probably unimaginably larger. With very, very few exceptions, the number of documents is entirely unknown. The documents are not cataloged individually but are measured by the number of cubic feet of documents or the number of linear feet on the shelves. Here is a screenshot of an example from the following record group:

Reports of Indian Schools, 1941 - 1949
Creator(s): Department of the Interior. Bureau of Indian Affairs. Navajo Service. 1947-1949  (Most Recent)
From: Record Group 75: Records of the Bureau of Indian Affairs, 1793 - 1999

Here is the screenshot:

This is only one small set of records. Most of the other records are measured in cubic feet of documents. If you spend even a few minutes in the National Archives Catalog you will begin to see the challenge.

Here is the statement about digitization currently on the National Archives website:
With NARA’s strategic plan, 2018-2022, NARA has committed to digitize 500 million pages of records and make them available online to the public through the National Archives Catalog by October 1, 2024. This goal will be accomplished, in part, by integrating digitization into the responsibilities of archival units nationwide and through entering into new public-private digitization partnerships.
Hmm. Let's suppose that the number of documents held by the National Archives is really about 10 billion (a really low estimate). What percentage of the documents does 500 million represent? Additionally, how many documents does the U.S. National Archives receive each year? My guess is that it is somewhere above 500 million. So really, they are getting further and further behind. By the way, 500 million is only 5 percent of that 10 billion figure, and a smaller share still of whatever the real total turns out to be.
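
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The 500 million goal comes from the NARA statement quoted above; the 10 billion and 12 billion totals are only the rough estimates mentioned in this post, not official counts, and the function name is just an illustrative choice.

```python
# Rough check of what share of the holdings the 500 million page goal covers.
# The totals below are estimates quoted in this post, not official figures.

PAGES_TO_DIGITIZE = 500_000_000  # NARA's stated digitization goal

ESTIMATED_HOLDINGS = {
    "10 billion (the low estimate used above)": 10_000_000_000,
    "12 billion (where online estimates begin)": 12_000_000_000,
}


def percent_of(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


for label, total in ESTIMATED_HOLDINGS.items():
    share = percent_of(PAGES_TO_DIGITIZE, total)
    print(f"{label}: {share:.1f} percent")
```

If the real total is larger than these estimates, the share shrinks accordingly.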

The National Archives actually does almost no digitization of existing records. The strategy of the National Archives is set forth in a rather long discussion on a web page entitled, "Strategy for Digitizing Archival Materials." If you have ever tried to find a document in the National Archives, READ THIS PAGE. As an attorney, I am used to reading all sorts of government stuff. What this page essentially says is that the U.S. National Archives is doing next to nothing to digitize its holdings. But what about the Partner Programs? There is a page entitled "Digitization Partnerships." If you are at all interested in what is going on at the National Archives, look at this page.

What is not on the page is that the actual digitization of records is mostly at a standstill. For example, what I have learned by talking to FamilySearch.org volunteers (Missionaries) who served at the National Archives digitizing documents is that because of the current administration's "budget cuts" all digitization has essentially stopped. In addition, because of bureaucratic red tape and outmoded preservation procedures, the process of digitization inside the National Archives is nearly impossible for the volunteers. For example, Ancestry.com has a list of its "New and Updated" collections on its website.



The most recent records added are from October 2018. But look at the list closely and you will see that none of the recent records seem to originate from the U.S. National Archives. If you look at a few categories that likely came from the Archives, you will see that the collections are really quite old and not identified as coming from the National Archives.

This is just one example of the massive number of records that are not only unavailable in digitized copies online, but are not even in the process of being digitized. Despite everything that I do say about what is available online, we all need to remember that genealogists still need to visit these archives and libraries and do the rest of the research that cannot be done online because no digital copies exist.

2 comments:

  1. Hi James, hope you are well. Do you have any suggestions on ways to persuade an institution to digitize a record set? I know of a publication held on microfilm only by one institution that could likely benefit a lot of researchers (a German language newspaper in an area settled by a large number of Germans). The films need to be digitized and indexed.

    1. Thanks, I am well. Please send me the name of the newspaper and I will see if it is already digitized but then I will need to know who holds the copies, how many copies or years are there on paper and a few other questions. Is the institution holding the copies willing to have the newspaper digitized? All of these and a few more questions need to be resolved. Thanks for writing. genealogyarizona@gmail.com