
Monday, June 22, 2020

Comments on the Same Name = Same Person Problem



Using the Findmypast.com website, I can search for how often any person's name appears in all of the website's millions of records. I can also filter the names by area and time. For example, if I were looking for an ancestor named "John Bryant," I could find the relative frequency of the surname "Bryant" in a particular place in England around 1820. Here is a summary of the steps of the search.
  • Initially, without any filters, the search engine finds 1,716,613 instances of the name "Bryant" in all of the website's records.
  • If I add a filter for "England," the number drops to 603,078 results. 
  • When I add a more specific place such as the county of Kent, the number drops to 33,431 results.
  • Now, if I add in the date of 1820 plus or minus 2 years as a birth year, I find the number has dropped again to 6,482.
At this point, you can see that without more specific information, the number of possible duplicate names remains large even in a short time period and a small geographic area. Now, what if I happen to know a little more? The issue here, of course, is a "chicken and egg" problem: if I know all the information, why am I looking, and if I don't know all of the information, how do I look?
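To see how much each filter narrows the search, here is a minimal Python sketch that works through the arithmetic. The counts are the ones reported in the steps above; the `funnel` helper and its labels are my own illustration and have nothing to do with how Findmypast itself works.

```python
# Result counts reported in the search steps above (surname "Bryant").
steps = [
    ("no filters", 1_716_613),
    ("England", 603_078),
    ("Kent", 33_431),
    ("born 1820 +/- 2 years", 6_482),
]


def funnel(steps):
    """Return (label, count, share of the previous step's count) per step.

    The share is None for the first step, which has nothing before it.
    """
    rows = []
    previous = None
    for label, count in steps:
        share = None if previous is None else count / previous
        rows.append((label, count, share))
        previous = count
    return rows


for label, count, share in funnel(steps):
    kept = "" if share is None else f" ({share:.1%} of the previous step)"
    print(f"{label:>22}: {count:>9,}{kept}")
```

Even after three filters, thousands of records remain, which is the point of the example.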

But forging ahead, I add a given name of "John," a fairly common English name. This results in a total of 986 results. Think about this result. If I am looking for an ancestor named "John Bryant" who was born about 1820 and lived in Kent, England, I have nearly a thousand records, potentially for many different people with that same name.

Now think some more. How many entries in online family trees do you see with this sort of information:

John Bryant, b. about 1820, Kent, England?

In this case, an entry like this has only about 1 chance in a thousand of being the correct person.

Many times the names that we see in online family trees are merely placeholders. The people they name may or may not have existed, but by adding the name we are making a guess that has a fairly high probability of being wrong. This probability increases dramatically if you don't use good sense and start adding people from other counties in England. For example, if I remove "Kent" as the county in my search, the number of results jumps to 16,824. Of course, guessing that someone with this same name could be found in a county far from where the family originated raises the chances you are wrong to almost a certainty.

What happens if I add a more specific geographic location such as Rolvenden, Kent, England? I get 43 results and the chances that I find the right person have just increased substantially. It is also possible that I can find unknown relatives in this small area.
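The effect of narrowing the place can be put into rough numbers. The sketch below assumes, purely for illustration, that exactly one of the results is the right person and that every result is equally likely to be that person. Real records are not that tidy, since one person can appear in several records, so treat the output as a ballpark. The function name and the scope labels are my own, and the labels assume the name and birth-year filters stayed the same in all three searches.

```python
from fractions import Fraction

# Result counts quoted in this post for "John Bryant", born 1820 +/- 2 years.
counts = {
    "England (no county)": 16_824,
    "Kent": 986,
    "Rolvenden, Kent": 43,
}


def chance_of_random_match(candidates):
    """Chance of blindly picking the right result.

    Assumes exactly one correct result among equally likely candidates.
    """
    if candidates < 1:
        raise ValueError("need at least one candidate")
    return Fraction(1, candidates)


for scope, n in counts.items():
    p = chance_of_random_match(n)
    print(f"{scope:>20}: 1 in {n:,} ({float(p):.3%})")
```

Under those assumptions, going from Kent as a whole to Rolvenden improves the odds of a blind guess roughly 23 times over, which is why the specific place matters so much.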

The real key to these examples is knowing the specific geographic location of an event in an ancestor's life. The more focused you are on the geographic location of events, the more accurate you will be in finding people you are actually related to. 
