

Some people eat, sleep and chew gum, I do genealogy and write...

Friday, October 5, 2012

Dealing with Monster Files

There are two extremes in genealogy: those who start out trying to identify their own parents and those who "inherit" a pedigree with thousands upon thousands of "relatives" listed. Both ends of the spectrum have their unique challenges, but I find very little written about how to approach the monster file. I fully understand that many genealogists wish they had the problem of the huge file, but I can assure you, it is a challenge and very intimidating to the beginning genealogist.

The first thing to overcome is the tendency to believe the family tradition that "grandma" did all the family's genealogy. Numbers don't mean a thing unless there is a substantial level of source citations and background. I recently mentioned looking at four extremely large files with four different people. In each case, I found substantial inconsistencies and likely wrong information within a few minutes of examining the file. Another thing to remember is that the person who compiled all that "genealogy" was not related to all of your own lines. It would be extremely unusual to have active genealogists on every one of your family lines; not impossible, just very, very unlikely.

The next hurdle to overcome is intimidation. What could you possibly contribute to all those thousands of names? It is true, large genealogy files are intimidating. But sometimes the numbers turn out to be an illusion. Lines that go "back to Adam" may have a huge number of names and no substance. In addition, as happened in my case, I inherited a huge file that contained more than 18,000 names, but large numbers of the people listed were not actually my relatives. The original researcher had simply listed anyone with the same surname in the jurisdiction she was searching, relative or not.

The first step in coping with a huge file is to take the information one step at a time. Do not try to jump back and start with the first "interesting" person you find. Methodically go through each individual until you are both familiar with the information (dates, names, places) and comfortable that the information is accurately recorded. Always, always look for inconsistencies, such as childhood marriages and deaths before births. If the file is on a disk, rather than paper, always import the information into a genealogical database program, ideally one with tools to flag inconsistencies.
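For readers comfortable with a little programming, the kind of consistency check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular genealogy program works: the `Person` record, the `find_inconsistencies` function, and the minimum marriage age of 14 are all assumptions made for the example, and the right threshold depends on the time and place you are researching.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed threshold for a "childhood marriage"; adjust for period and place.
MIN_MARRIAGE_AGE = 14


@dataclass
class Person:
    """Hypothetical record holding only the years needed for the checks."""
    name: str
    birth_year: Optional[int] = None
    marriage_year: Optional[int] = None
    death_year: Optional[int] = None


def find_inconsistencies(person: Person) -> List[str]:
    """Return a list of date problems; missing years are simply skipped."""
    problems = []
    b, m, d = person.birth_year, person.marriage_year, person.death_year
    if b is not None and d is not None and d < b:
        problems.append("death before birth")
    if b is not None and m is not None and m - b < MIN_MARRIAGE_AGE:
        problems.append("marriage in childhood")
    if m is not None and d is not None and m > d:
        problems.append("marriage after death")
    return problems


# Made-up entries, purely for illustration.
people = [
    Person("Example Ancestor", 1850, 1858, 1840),
    Person("Plausible Ancestor", 1800, 1825, 1870),
]
for p in people:
    for problem in find_inconsistencies(p):
        print(f"{p.name}: {problem}")
```

A flagged entry is not necessarily wrong; it is simply a place to start looking at the sources.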

Doubt everything. Just because information is recorded or even printed in a book does not mean it is accurate or even reasonable. I have found an unfortunate tendency in large pedigrees to try to prove a relationship to a famous person. In my own lines these supposed relatives included Daniel Boone and J. P. Morgan of banking fame. It took very little research to disprove both assertions. In one case, a great-great-great-grandmother was claimed to be a princess in Great Britain, which was also not true. However, doubt can be resolved with additional research. Just think, the story might be true!

As you become acquainted with each of the families in a huge database, the whole project will take on an entirely different perspective. As you look for loopholes and inconsistencies, you will begin to see what is and what is not in the file. If you find that few, if any, of the names and events listed have concrete sources, your first challenge will be obvious: adding sources for all of the family members. As you do so, you will find even more things to research.


  1. I was one of those people who were fortunate to inherit a modestly large genealogical database (about 7,000 entries), but unfortunately none of the entries include sources for the facts they assert. So, much of my work has involved sourcing the information contained in the database - validating some claims, and invalidating others - and occasionally extending a branch out here or there when I uncover relevant information. Your recommendation to "doubt everything" is the only approach to take from the standpoint of serious research.

    One struggle I wrestle with now is how to segment off the information that I have documented from that which I have not in order to publicly share much of the sourced research. I have no interest in sharing unsourced claims, but I have found my genealogy software (one of the major players) is hamstrung when it comes to intelligently exporting a subset of a file in order to allow me to create a new file of only substantiated research. In fact, the butchering of source information in various export attempts was so appalling that I have resorted to simply creating an exact copy of the entire database, and am now sifting through that copy name-by-name and deleting those individuals for whom I have no sources. It is a painful exercise on several levels: the sheer monotony of the manual labor involved; the disappointment of pruning away entire branches of the file, as well as numerous individuals, to whom I am likely related; and the heartache of seeing the incredible mountain of work done by others that I cannot treat as credible due to the lack of any sourcing. Granted, I have kept the original database intact to use as a reference, a source for many leads in the past and no doubt many more in the future.

    My biggest mistake was integrating my research into that original file for several years before I finally felt compelled to segregate it back out. If I had started my own, separate file from day one, I wouldn't have this challenge now.

    If you or your readers have any thoughts on how to approach this challenge, I would be interested to read them.

  2. I have a number of large files going back to no one particularly important, but still no way to really document most of them, and I am beginning to see other people's family trees on Ancestry with other ideas, but still no particularly good sources. One that I found this morning had my ancestor going back to the Mayflower. That could be true, I suppose, but I sure haven't found it, and the information that I have documented does not jibe with what she has. Annoyed.