Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, March 17, 2015

The Earliest Real Records - Extending Family Lines Fact or Fiction?

When we talk about genealogy are we talking about something real or imaginary? This may seem to be a strange question, but when I get into some of the pedigrees (family trees etc.) that I have seen lately, I am beginning to ask the question more and more frequently. How accurate can any of these entries be if there are no sources?

OK, now there is a substantial backlash to this question. The first level of response deals with the relationship between "citing a source" and the reliability of both the source and the researcher's ability to evaluate that reliability, extract the correct information and draw the right inferences from it. A simple illustration would be a U.S. Federal Census record. An experienced researcher would realize that information from the U.S. Federal Census was not particularly reliable for facts such as the spelling of any of the names, ages, birth information, countries or states of origin of the individuals, year of immigration etc. The record made by the Census Enumerators is, at best, second-hand and orally transmitted and is decidedly unreliable. All you really have to do to illustrate this is to view several different U.S. Federal Census years for the same family and compare the records.

My basic rule in this area is simple: any information obtained from any record is only as reliable as the record itself. We can get into a huge discussion about primary, secondary and so forth as applied to sources or records, but what it boils down to is that historical records, including all genealogical records, are inherently unreliable and incomplete. This does not mean that the information is not "correct," it merely points to the fact that the transmission of historical information is an imperfect process.

This brings me to the area of error checking. One of the earliest concerns in the development of electronic computers was the issue of checking the reliability or accuracy of the data packets. See Wikipedia: Error detection and correction. Now, what is the error detection and correction process in genealogy? THERE ISN'T ONE. Individuals could, and sometimes do, implement their own procedures to ensure greater reliability, but right now, any such error detection and correction procedures are left entirely up to the individual researchers, who are not in the habit of doubting their own abilities and judgment. Experienced researchers are on the right track when they cite multiple sources.

How did the computer engineers solve the error detection and correction problem? Redundancy. They realized that if they sent the same information two or more times and then checked to see whether the copies agreed, they would have a reliable way to check the accuracy. The basic method depends on a suitable hash function or checksum algorithm. A hash function adds a fixed-length tag to a message, which enables receivers to verify the delivered message by recomputing the tag and comparing it with the one provided. See Wikipedia: Error detection and correction. I realize this is a very simplistic way of explaining a complex procedure, but it suffices for my illustration.
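For readers who want to see the tag idea in action, here is a minimal sketch in Python. It assumes SHA-256 from the standard hashlib module as the hash function, and the message text and the function names make_tag and verify are made up for this illustration. Real communication systems use a variety of checksum and error-correcting schemes, so treat this as one simple instance and not as the way any particular system works.

```python
import hashlib

def make_tag(message: bytes) -> str:
    """Compute a fixed-length tag (SHA-256, as hex text) for a message."""
    return hashlib.sha256(message).hexdigest()

def verify(message: bytes, tag: str) -> bool:
    """Recompute the tag on the receiving end and compare it with the one provided."""
    return make_tag(message) == tag

# Hypothetical message; the sender computes the tag and sends both.
sent = b"John Smith, born 1842, Ohio"
tag = make_tag(sent)

received_ok = b"John Smith, born 1842, Ohio"
received_bad = b"John Smith, born 1824, Ohio"  # two digits swapped along the way

print(verify(received_ok, tag))   # True: the message arrived intact
print(verify(received_bad, tag))  # False: the error is detected
```

Notice that the tag only tells the receiver that something changed. It does not say what changed or what the correct value was; correction takes additional redundancy.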

What is the genealogical checksum? We approximate this process by providing multiple sources. The more sources we can cite, the greater the probability that some of the information will be consistent across all the sources. From a genealogical standpoint, consistency is not the hobgoblin of small minds but the fundamental way we have to determine whether or not a particular historical fact is "correct." But there are a number of fatal flaws. Records are inherently unreliable. Simply citing a pile of them does not establish anything more than that there is a pile of records.
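The comparison across sources described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the record names, the birth years and the function name agreement are invented, and a simple majority count is only one rough way to measure consistency.

```python
from collections import Counter

def agreement(values):
    """Return the most commonly reported value and the share of sources reporting it."""
    reported = [v for v in values if v is not None]  # skip records that are silent
    if not reported:
        return None, 0.0
    value, count = Counter(reported).most_common(1)[0]
    return value, count / len(reported)

# Hypothetical birth years for one person, as given by four different records.
birth_years = {
    "1850 census": 1842,
    "1860 census": 1843,
    "1870 census": 1842,
    "death certificate": 1842,
}

value, share = agreement(birth_years.values())
print(value, share)  # 1842 0.75
```

A high share shows only that the records agree with one another. If several of them were copied from the same mistaken original, they will agree and still be wrong, which is the flaw noted above.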

Now, let's think about early records. If the main issue in correcting the inherent inaccuracy of historical records is redundancy, i.e. more than one record with the same or similar information, then the older the records the less reliable they are simply by virtue of the fact that the records begin to be less available. Fewer records = less accuracy. So early family lines are, by the very nature of historical records, far less accurate than later lines. If you choose the wrong "John Smith" in the 1900s you just may find other records that prove you have made the wrong choice, but if you choose the wrong "John Smith" in the 1300s, you may have only one choice and no other records to prove you either wrong or right. But the chances are you are wrong.

Can an experienced, competent, brilliant researcher beat the game and extend a pedigree indefinitely? No. The competency, experience or brilliance of the researcher cannot cope with the lack of an error detection and correction methodology. You can never know if that detailed Medieval record you found is correct or not unless you can find some other record that confirms the first and so forth.
