Some people eat, sleep and chew gum, I do genealogy and write...

Wednesday, July 8, 2009

The challenge of large genealogical databases

The spectrum of the status of genealogical research is very broad. I encounter everyone from people who have a few names on a scrap of paper to those who have files containing tens of thousands of names. If a person has no information about his or her family, the next steps are pretty obvious. Generally, some progress in finding family members can occur almost immediately with a little effort. But what do you do with a person who inherits a huge database file of names?

Regularly, the following scenario occurs at the Mesa Regional Family History Center. Someone comes into the Center and asks for help with a file they just received from a relative. Usually, the file is in either a GEDCOM or Personal Ancestral File format. Upon opening the file I find that there are tens of thousands of names. For example, a recent file had 38,000 plus names. The experience is sort of like being dumped into the middle of the ocean in a small boat.

In making these comments I am in no way disparaging the efforts of researchers who have carefully accumulated huge files. However, if I see one more file tracing ancestry back to Charlemagne I think I will scream.

Here are a few suggestions if you find yourself or are helping someone in those circumstances:

1. Look at the file to see if there is any citation to sources. Without source citations, the information cannot be easily verified. It is possible that the file is merely a conglomeration of files downloaded directly from the Pedigree Resource File, the Ancestral File or some similar database. I usually take some time to explain to the person that without source information the file is not really what it appears to be and is probably unreliable. I must comment, however, that the explanation seldom has any effect on the person with the file.

2. Ask the person with the file if they know about or recognize any of the people. You have to start somewhere and the first issue is whether or not the file pertains to the person who has it on their flash drive or whatever. I have found that sometimes the person with the file has no idea who the people are. If this is the case, disregard the huge file and start with the person just as if they had no information at all. They need to work with their family until they can relate to someone in the huge file.

3. Look at the file to see if it even makes sense. Too often, large files are full of duplicates and unrelated individuals. In an inherited file, the owner is unlikely to know where or how the relationships were established. This is where they must start. There is no way to make any progress on a huge file without knowing the core family members and how they are related. Sometimes I ask the person to name from memory his or her eight great-grandparents. Sometimes that helps them to see the problem of relationships.

4. Try to visualize the scope of the file. Who are these people? This should be one of the first questions. Sometimes, the person with the file does not even appear as a member of the family. That is a good place to start. Try to connect the individual with the family in the file.
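For anyone comfortable with a little programming, the first and third suggestions can be roughed out in a short script. What follows is only a sketch, and it rests on some assumptions: the file is a plain-text GEDCOM (a Personal Ancestral File database would have to be exported to GEDCOM first), each line follows the usual "level, optional ID, tag, value" layout, and the function name summarize_gedcom and the sample records are made up for illustration. A repeated name is only a hint of a possible duplicate, not proof of one.

```python
from collections import Counter


def summarize_gedcom(lines):
    """Count individuals, individuals citing at least one source, and repeated names."""
    individuals = 0
    sourced = 0
    names = Counter()
    in_indi = False      # are we inside an individual (INDI) record?
    has_source = False   # has the current individual cited any source (SOUR)?
    seen_name = False    # only the first NAME line of each individual is counted

    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level = int(parts[0])

        if level == 0:
            # A new top-level record begins, so close out the previous individual.
            if in_indi and has_source:
                sourced += 1
            # Individual records look like "0 @I1@ INDI".
            in_indi = len(parts) == 3 and parts[2].strip() == "INDI"
            has_source = False
            seen_name = False
            if in_indi:
                individuals += 1
            continue

        if not in_indi:
            continue

        tag = parts[1]
        value = parts[2].strip() if len(parts) == 3 else ""
        if tag == "SOUR":
            has_source = True
        elif tag == "NAME" and level == 1 and not seen_name:
            # "John /Smith/" becomes "john smith" for a rough comparison.
            names[" ".join(value.replace("/", " ").split()).lower()] += 1
            seen_name = True

    # Close out the last record if the file ended inside an individual.
    if in_indi and has_source:
        sourced += 1

    repeated = {name: count for name, count in names.items() if count > 1}
    return {
        "individuals": individuals,
        "with_sources": sourced,
        "repeated_names": repeated,
    }


# Made-up sample records, for illustration only.
SAMPLE = """0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1900
2 SOUR @S1@
0 @I2@ INDI
1 NAME John /Smith/
0 @I3@ INDI
1 NAME Mary /Jones/
0 @S1@ SOUR
1 TITL Parish register
0 TRLR"""

summary = summarize_gedcom(SAMPLE.splitlines())
print(summary)
```

If the count of individuals with sources is a tiny fraction of the total, that is a strong sign the file is a conglomeration of unverified downloads, which is the situation described in the first suggestion.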

Once you have had a chance to get oriented to the validity and relationships in the file, suggest that the person set the file aside, become acquainted with his or her own four-generation family, and start documenting the sources for events. Unfortunately, I have found that possessing a huge file will often discourage most of the budding researchers. They either believe that everything has been done or they are overwhelmed with numbers.

To be continued.


  1. Thank you for posting these points.

  2. Getting beginners with large handed-down files to stop and consider and explore their sources is the hardest part. They're all excited to see this great collection of names and do not understand the need for verification and analysis.

    This seems to be the nexus where the winnowing process takes place, weeding out those who are not going to be willing to do the work. It is unfortunate, because these will be the ones who will also not stick around to reap the rich rewards!