Friday, September 6, 2013

What Source Records Are Left to Digitize?

Notices of additional records joining the flood already online appear almost constantly. So, for genealogists, the question should be: what is left? How many records remain locked up in paper or other media and unavailable on the Internet? Some estimates put the number of records digitized at a very small percentage of all the records in the world, but is this realistic? I decided to investigate whether there were any realistic appraisals of the total number of records in the world and, further, whether there was some indication of the percentage of those records that have already been digitized.

The question that arises is whether a search on the Web itself can be "reasonably exhaustive" or whether a failure to include paper records in a search automatically dooms it to insufficiency.

How does anyone determine how many records there are in the entire world that might be useful for genealogical research? After years of research and some reflection on that question, I would have to say, unequivocally, that there are many more records left to digitize than have already made their way online. I can show this, not by any actual numbers, but by the types of records still unavailable in any quantity.

Take, for example, the U.S. National Archives. It would seem that the digitizing effort would have made some progress in that archive alone. But it takes only a few minutes of searching for documents on the National Archives website to realize that only a very small percentage of them are presently available online. Other countries may be either more or less successful in converting to online access, but presently, searches for documents provide slim online pickings.

Another category of records with little online presence is the vast number of paper records archived in university library special collections. I can frequently find a lifetime of records accumulated by an individual that are entirely unavailable without a visit to the library in question. For an example, see the list of items from my former professor, Wick R. Miller, held by the University of Utah Library, Special Collections. Professor Miller was not as prolific as some, but few of the documents in this huge list are available directly online, and there is no indication of the level of digitization.

Another huge category of records yet to be digitized is those in small, local repositories. This includes millions of cemetery records, including Permits for Burial, and many other related records such as mortuary records.

In some cases, state archives have been successful in converting some of their records to digitized copies. A good example is the State of Washington. The Washington State Digital Archives currently holds 49,473,734 records online but has 136,350,901 searchable. My guess is that this is the largest percentage of digitized records of any state in the U.S., and my further guess is that second place has a much smaller percentage. Even if we were to assume that all 50 U.S. states had similar collections (which they certainly do not), this demonstrates that the number of digitized records, although not insignificant, is far, far from complete.

Another example is the collection of microfilms held by ... Although millions upon millions of records go online almost every day, the most recent reports put the percentage actually digitized at much less than half. It is not clear from public statements how many of the digitized records have made their way online. Some of the issues that prevent records from making their way online include the refusal of some of the originators of the records to give permission for public access.

Another huge number of records are "classified" by one government or another around the world. These documents are certainly not available. 

Of course, we need to mention all the records of developing countries or those countries that haven't even made it to the level of developing. Very few of their records have made it online as yet.

I think these are enough examples to show that we have a long way to go before a reasonably exhaustive search that relies on online sources and ignores paper records even falls into the category of possible. So all of you out there who think you are done with your research when you haven't explored local and state repositories need to get to work. Sorry, folks.

1 comment:

  1. You are correct that a lot of records are not online. I live in Washington and have been helping digitize records for almost a decade now. Even though a lot is online, there are counties that have yet to send their old records to the archives, so even if we digitize all the records in the archive, there will still be some that you will need to go to the county to search.