Thursday, May 26, 2016

Surviving an Avalanche of Genealogical Data

I belong to a rather exclusive club of those who have survived being carried down a mountainside in a real avalanche. So I know first-hand how it feels to be swept along at high speed, completely out of control. I am also in a position to know, first-hand, about the avalanche of genealogical data that is sweeping all of us down my present metaphorical mountain.

I survived my real avalanche experience due, in small part, to some advance knowledge and preparation. But I am not yet sure whether I am going to survive the information avalanche, or whether anyone else will either.

Most people do what all sensible mountain climbers do when faced with avalanche danger: they stay out of the mountains. Most genealogists do exactly the same thing: they ignore the reality of the amount of available information and spend their time picking at the sides of the flow or standing in the valley and watching it from a distance.

OK, enough of that analogy and its metaphors. Whether we like it or not, or even realize it, we are presented daily with a huge amount of newly digitized, genealogically pertinent data online. But that newly digitized data is only a small part of the huge collections waiting to be added to the pile. I was talking to one of my friends yesterday (yes, I do have one or two friends left after writing online for years) about the U.S. National Archives and all its branch archives around the country. Like any other government agency, the U.S. National Archives and Records Administration keeps track of and reports on its holdings. Unfortunately, it only reports from time to time, and the current report is now six years old, dating back to 2010. But even this antiquated report can give us an idea of what the agency has and what has been digitized. I realized that one way to get an overall idea of the data avalanche was to take a snapshot of one of the largest information institutions in the world: the United States Government.

The statistics from the U.S. National Archives are linked on a webpage entitled "Statistical Summary of Holdings." I pick on this one archive for two reasons: one, the U.S. National Archives is huge, and two, it does a miserable job of making all its stuff available. What do the archive statistics show?

Before looking at the data, and to understand what the statistics are saying, you need to refer to a page of abbreviations (really acronyms) called "Abbreviations Used in the Statistical Summary of Holdings." This raises an interesting point. Many people visualize the National Archives as the building in Washington, D.C. with the Declaration of Independence. This list, however, paints an entirely different picture.

Units of the Office of Records Services--Washington, DC

NWCT1 Textual Archives Services Division--Archives I
NWCT2 Textual Archives Services Division--Archives II
NWCS-C Cartographic and Architectural Records
NWCS-M Motion Picture, Sound, and Video Records
NWCS-S Still Picture Records
NWL Center for Legislative Archives
NWC Access Programs
NWME Electronic and Special Media Records Services

Units of the Office of Regional Records Services

NR Office of Regional Records Services
NRABA National Archives and Records Administration - Northeast Region (Boston)
NRAN National Archives and Records Administration - Northeast Region (New York City)
NRBPA National Archives and Records Administration - Mid Atlantic Region (Center City Philadelphia)
NRCAA National Archives and Records Administration - Southeast Region
NRDA National Archives and Records Administration - Great Lakes Region (Chicago)
NREKA National Archives and Records Administration - Central Plains Region (Kansas City)
NRFFA National Archives and Records Administration - Southwest Region
NRGDA National Archives and Records Administration - Rocky Mountain Region
NRHLA National Archives and Records Administration - Pacific Region (Laguna Niguel)
NRHSA National Archives and Records Administration - Pacific Region (San Francisco)
NRIA National Archives and Records Administration - Pacific Alaska Region (Anchorage)
NRISA National Archives and Records Administration - Pacific Alaska Region (Seattle)

Other Units of the National Archives and Records Administration

NL Office of Presidential Libraries
NLDDE Dwight D. Eisenhower Library
NLFDR Franklin D. Roosevelt Library
NLGB George Bush Presidential Library
NLGRF Gerald R. Ford Library
NLHH Herbert Hoover Library
NLHST Harry S. Truman Library
NLJC Jimmy Carter Library
NLJFK John F. Kennedy Library
NLLBJ Lyndon B. Johnson Library
NLMS Presidential Materials Staff
NLRNS Richard Nixon Library
NLRNS Richard Nixon Library - College Park
NLRR Ronald Reagan Library
NLWJC William J. Clinton Presidential

Affiliated Archives

USMA U.S. Military Academy, West Point
USNA U.S. Naval Academy, Annapolis

For those of you genealogists out there who are experienced in "working in the National Archives," how many of you have visited all of these facilities and done research in each one? Do you realize that each of these branches has its own unique records?

Now to the numbers. Oh, before we get to the numbers we need to recognize that the records are held in "Record Groups." Here is a list of the Record Groups.

Record Groups 001 - 100
Record Groups 101 - 200
Record Groups 201 - 300
Record Groups 301 - 400
Record Groups 401 - 500
Record Groups 501 - 581

The links for each record group will take you to the webpage for the statistics for that group. And there is more.

Alphabetic Index of Donated Materials Groups
Statistical Summary of Donated Materials Collections

One other thing I need to mention: all of the statistics from the National Archives are in cubic feet of records, not the number of individual records. How many records are there in one cubic foot of records? Here is the answer from the National Archives.
The quantity of records in the custody of each unit.
  • The quantity of paper-formatted textual records is expressed in cubic feet only.
  • The quantity of microfilmed textual records is expressed both in cubic feet and in items (number of microfilm rolls, according to size and polarity; number of microfiche cards).
  • The quantity of nontextual records is expressed both in cubic feet and in items appropriate to each medium.
  • The quantity of artifacts is expressed both in cubic feet and in items.
Now do you really expect me to add up all of the numbers? Well, I am not going to. But I can give you an example of one tiny part of the list in one tiny section of the National Archives.

0015 Records of the Department of Veterans Affairs
TOTAL: 82,094.647 cu. ft. 246,570 items
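
To get a rough feel for what a figure like this might mean in pages, here is a minimal Python sketch. The pages-per-cubic-foot value is purely an assumption of mine for illustration; the National Archives summary reports paper records in cubic feet only and gives no such conversion, so treat the result as a guess at scale, not a count.

```python
# Rough scale estimate for Record Group 15, using the total quoted above.
RG15_CUBIC_FEET = 82_094.647  # from the Statistical Summary of Holdings

# ASSUMPTION: hypothetical packing density, not a National Archives figure.
ASSUMED_PAGES_PER_CUBIC_FOOT = 2_000


def estimate_pages(cubic_feet: float, pages_per_cubic_foot: int) -> int:
    """Return a rough page count for a given volume of paper records."""
    return round(cubic_feet * pages_per_cubic_foot)


if __name__ == "__main__":
    pages = estimate_pages(RG15_CUBIC_FEET, ASSUMED_PAGES_PER_CUBIC_FOOT)
    print(f"Roughly {pages:,} pages under the assumed density")
```

Change the assumed density and the answer changes with it, which is the point: even one record group out of hundreds is likely far more than anyone will ever read page by page.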

By the way, these records are summarized as follows:
A record is evaluated. The creator of a record proposes to the National Archives how long it should be kept. Some records are destroyed (for example, a receipt for the purchase of pencils), while others are kept permanently in the National Archives (such as executive orders). Records schedules are set up to determine how long all Federal records are to be kept by the Government. Only 1–3% of all records are kept permanently, but the total number of documents in the National Archives number in the billions, and the number keeps growing.
How many of these records are available online? A list is available, current as of January 11, 2016, on the page entitled "Microfilm Publications and Original Records Digitized by Our Digitization Partners." The list is very long, but the digitized records are still a vanishingly small percentage of the total.

Now, think about it. This is just one archive. There are hundreds of archives across the United States and thousands around the world. There are tens of thousands of libraries, universities and colleges with record collections. There are thousands of private collections of documents. There are billions upon billions of records online and more going online each day.

When you are in an avalanche, can you really say you have looked at every snowflake? Please do not tell me you have searched everywhere for your ancestors. You have no idea how many records there are in the world, much less the United States. 

Can we survive? Well, the answer is yes, as long as we don't get hung up on riding along at high speed. Perhaps I can help, and maybe those who think they know it all can take a step back, admit that they are just barely able to focus on one small part of the total number of records, and get on with doing more research. Perhaps the next time you hire someone to do "research for you in the National Archives" (or anywhere else, for that matter), you will realize that they are probably only talking about the main Washington, D.C. facility and would need more money and more time to search everywhere else.
