Some people eat, sleep and chew gum, I do genealogy and write...

Saturday, February 9, 2019

Will Computer Programs Replace Genealogists?

With a combination of record hints, DNA tests and optical character recognition, could a computer program construct a verified family tree? For many people living today, I believe this question has long been answered. Some time ago, MyHeritage.com introduced "Instant Discoveries" and I have seen the program time after time produce a valid and sourced, albeit limited, family tree for users. Given the research skills of the majority of people constructing family trees on the large genealogy websites, it is only a matter of time before those websites implement programming that automatically verifies and corrects entries.

Obviously, the main obstacle to the complete replacement of genealogists lies in the implementation of optical character recognition software that can read handwriting, i.e. handwriting recognition software. With the addition of artificial intelligence, the need for human intervention would be minimized.

FamilySearch.org and MyHeritage.com have both implemented error correction technology that now advises users of inconsistencies in their existing family tree data. The step that is missing is to make such programming intervene in the entry process and correct the existing entries. An intermediate step would be to mark questionable entries; this has already been implemented to a small degree on FamilySearch.org.
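To make the idea of marking questionable entries concrete, here is a minimal Python sketch of a consistency check. The `Person` fields, the rule thresholds (a 120-year lifespan, a parent at least 12 years old) and the function name are all assumptions made up for illustration; this is not how FamilySearch.org or MyHeritage.com actually implement their checkers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical, simplified tree entry; real trees store far more detail.
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    parent: Optional["Person"] = None


def find_inconsistencies(person: Person) -> List[str]:
    """Return warnings for one entry. The entry itself is never changed."""
    warnings = []

    # Rule 1 (assumed): a death cannot precede a birth, and a lifespan
    # over 120 years is suspicious.
    if person.birth_year is not None and person.death_year is not None:
        if person.death_year < person.birth_year:
            warnings.append(f"{person.name}: death before birth")
        elif person.death_year - person.birth_year > 120:
            warnings.append(f"{person.name}: lifespan over 120 years")

    # Rule 2 (assumed): a parent should be at least 12 when the child is born.
    parent = person.parent
    if parent is not None and parent.birth_year is not None and person.birth_year is not None:
        if person.birth_year - parent.birth_year < 12:
            warnings.append(f"{person.name}: parent {parent.name} was under 12 at the birth")

    return warnings


mother = Person("Mary", birth_year=1850, death_year=1900)
child = Person("John", birth_year=1855, death_year=1850, parent=mother)
for warning in find_inconsistencies(child):
    print(warning)
```

A check like this only flags entries. Automatically correcting them, the missing step described above, would need evidence about which of the conflicting values is wrong.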

Many genealogists already rely on computer programs that analyze DNA tests, which are now being used as a matter of course to resolve criminal actions in the courts. So let's suppose this scenario. You take a DNA test from a large online genealogy-based website with sophisticated programming. Your DNA test is then matched to millions of other users of the same website. The program behind the website then constructs a potential family tree relationship between the users. Meanwhile, the program has been analyzing billions of written records. The information from those records is then integrated into the structure created by the DNA testing. Hypothetical relationships are then analyzed by comparing the DNA testing results to the information contained in the written records. As is already possible, the program compiles a list of "sources" and determines the relative likelihood of relationships based on the accumulated sources and the DNA testing results. Previously submitted family trees are further analyzed to see if they reinforce or detract from the constructed family tree data.
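The scoring step in this scenario could be sketched roughly as follows. Everything here is a hypothetical illustration: the expected shared-DNA figures are the approximate theoretical averages (about 50% for parent and child, 25% for a grandparent, 12.5% for first cousins), while the function names and the weighting scheme are invented for the example and are not taken from any website.

```python
# Approximate theoretical average fraction of shared autosomal DNA.
EXPECTED_SHARED = {
    "parent/child": 0.50,
    "grandparent": 0.25,
    "first cousin": 0.125,
}


def dna_fit(shared_fraction, relationship):
    """Score from 0 to 1: how close the measured sharing is to the expected value."""
    expected = EXPECTED_SHARED[relationship]
    return max(0.0, 1.0 - abs(shared_fraction - expected) / expected)


def record_support(supporting, conflicting):
    """Fraction of the located records that agree with the hypothesis."""
    total = supporting + conflicting
    return supporting / total if total else 0.0


def rank_hypotheses(shared_fraction, record_counts, dna_weight=0.6):
    """Combine DNA fit and record support into one score per hypothesis.

    record_counts maps a relationship name to a tuple of
    (supporting records, conflicting records). The 0.6 weight is an
    arbitrary assumption.
    """
    scores = {}
    for relationship, (supporting, conflicting) in record_counts.items():
        scores[relationship] = (
            dna_weight * dna_fit(shared_fraction, relationship)
            + (1 - dna_weight) * record_support(supporting, conflicting)
        )
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Two testers share 24% of their DNA; three records support a grandparent
# link, while one record supports and two conflict with a first-cousin link.
ranking = rank_hypotheses(0.24, {"grandparent": (3, 0), "first cousin": (1, 2)})
print(ranking[0][0])  # grandparent
```

A real system would have to deal with overlapping ranges of shared DNA and with the reliability of each record, which this sketch ignores.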

As this hypothetical system becomes more and more sophisticated, it will become at least as accurate as any present genealogical analysis. Missing parts of this constructed family tree will be noted due to missing source information, i.e. documents that support conclusions. How will this be any different than what we have today? It will be a lot more accurate.

What are the implications of this scenario? What were the implications of the Internet when it was first introduced? Will I still have a job in the future? Considering my age, I doubt that this will happen before I end up in the care center or pass on to my reward. But for many people living today, this is not a merely theoretical possibility. Maybe whether or not you can get younger people involved in genealogical research is the wrong thing to worry about. Maybe we should worry about the implications of creating a worldwide family tree showing that we really are all related. However, when you think about the possibility of a worldwide family tree, the reality is that records do not exist to connect every person to every other person on the face of the earth. Genetically, though, we are certainly all related.

4 comments:

  1. FamilySearch Family Tree is the largest current collaborative tree. Is it 100% accurate? My guess is that only about 1% of the profiles are pretty good - including yours and mine :) WikiTree, Geni.com, OneGreatFamily, WeRelate and perhaps others have fewer entries in their collaborative tree. All of them can be used for clues.

    MyHeritage said they were building the "Theory of Family Relativity" collaborative tree so they can help DNA matches find common ancestors. Ancestry has a Big tree that they used for their relationship study, the We're Related app, and the DNA Circles. Perhaps they will use it for more DNA effects.

    To date, almost everything in these collaborative trees has been created by individuals using family lore, family records, government records, published works, unpublished works, archive records, etc. Someone concluded parents, vital dates and places, and more for each person in the tree.

    There are over 7 billion living persons at this time, and there may be government records for 20% of them, most of which are unavailable for a number of years to anyone but the person. FSFT has almost 1 billion persons in the tree, but it is rife with duplicate profiles.

    Remember Jay Verkler's vision in 2012 - that in 2060, a person could access a computer screen and see their entire family tree? Some people perhaps - in the USA, in western Europe, but everyone? I am doubtful. But I'll keep working on it!

    Replies
    1. You are absolutely correct. Present family trees are flawed, but right now, if MyHeritage, for example, were to start visibly marking the entries in everyone's family tree with the Consistency Checker results, the errors would be abundantly obvious. Basically, what I am saying is that many of the activities we think need personal research will become automated. Think of all the people on social networks with smartphones. What if all that data were incorporated into family trees?

    2. Re: Jay Verkler's vision, this will probably never be the case. There is no fundamental difference between genealogical investigation and criminal investigation -- they're both hard, and may well never find resolution -- but to assume that answers to everything can be found directly in records of any sort is naive. Until genealogy accommodates "collaborative thinking", where we build gradually on each other's investigative work, as opposed to expecting independent conclusions to all fit together like a jigsaw, genealogy will forever be considered a hobby with no academic rigor or merit.

    3. I agree with the need for collaborative thinking. Right now, that is the biggest obstacle to resolving the online "family tree" issue.
