Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, December 2, 2018

Comments on The Future of Online Family Trees


My friend Tony Proctor wrote a thought-provoking post on his Parallax View blog entitled, "The Future of Online Family Trees." I suggest that you read the entire blog post. Although it takes a somewhat dystopian view of the future of online family trees, I think there are several issues raised that should be carefully considered by those who develop online genealogical family tree websites and those who use their services. However, my views do depart from those in the post in certain respects, especially with regard to the possible very positive future of online family trees.

As Tony points out, there have been many people who have written about the ills of online family trees, including my own efforts, but even considering past comments, this is a topic that has not really been thoroughly discussed or considered. My views on the subject diverge from those propounded by Tony, but it is only through exploring a diversity of views and then deciding a course of action that we can possibly alter the inevitability of a dismal future.

I would begin my own comments by noting that Tony's conclusions about online family trees fail to reflect the reality of the paper-based origin of those trees. The extant condition of online family trees is merely an extension of the pre-existing condition of the multitude of individually created genealogies. In fact, most of the original information on these family trees originated from the incorporation of the information from millions of paper family group sheets and pedigree charts. The failings of the online trees are merely the reflection of and a continuation of the condition of genealogical research since its inception. The main differences between the paper version and the digital versions now online are the visibility of the digital versions vs. the obscurity and privacy of the paper versions. All of the problems and challenges of the individual genealogies that Tony vocalizes have been present for more than a hundred years. If we looked at the issues he raises with the assumption that his observations were being made about paper genealogies, almost all the issues would still be evident.

Genealogical activity, in general, has a checkered past and truthfully, online family trees have not done much at all to improve the general situation. More genealogists should become familiar with the history of genealogy as a pursuit. I have found only one good book on the subject, at least in English, and it is not as well read or well known as it could be. See:

Weil, François. Family Trees: A History of Genealogy in America. Cambridge, Massachusetts: Harvard University Press, 2013. http://dx.doi.org/10.4159/harvard.9780674076341.

If you do some superficial research and tell someone they are related to royalty, they will believe it without any substantial proof. That fact is at the heart of most of the ills of genealogy.

So what do electronic, digital, online family trees add to the issues of varying degrees of accuracy, historicity, and completeness? Not much really as it pertains to accuracy, historicity, and completeness. The problem of multiple copies of inaccurate information is not unique to digital copies. Long before the internet, for example, my relatives were copying verbatim errors made over 100 years ago in a published genealogy about the predecessors of the Tanner family. Those errors did migrate to the digital venue but now that there exists such a venue, there is a way to communicate to a greater number of people and in many cases, at least for those who are willing to look online, correct the errors of the past. Before online family trees existed, there was no way for me or anyone else to even know about the contents of those paper-based documents. If my cousin had an incorrect entry on his own family group record, how would I even know that fact and further, how could I communicate with him and help him correct the information? In most cases, I did not even know the cousin existed.

Tony does not seem to think that, in a collaborative environment, the interaction between individuals can result in a greater degree of accuracy and completeness. Here is where our opinions diverge. After years of working on the FamilySearch.org Family Tree, for example, I have seen tremendous strides in the correction of vast amounts of data and the addition of blog post information and narratives and serious evaluations. All of these include write-ups of research and analysis of the problems. Granted, there is a long way to go, but what I am seeing is that the collaborative nature of the Family Tree has provided both a venue and a forum for the possible solution of the very problems so well pointed out and discussed in the blog post.

The comments made in the post seem to include wiki-based family trees in the general category of incomplete and inaccurate collaborative family trees. In the past, I have written a lot about the nature of wikis and whether or not they can produce accurate and complete information. To some extent, because of the success of Wikipedia and other wikis, I think this is a dead issue. There is still some considerable resistance to the use of wikis as a source in academic circles, but time has shown that the information contained in wikis does become more accurate and complete over time. So why should wiki-based online family trees be categorized as a "dead end" to incorporating accurate genealogical information? Granted, there needs to be a certain amount of review and moderation, but that comes with time and the whole genus of online family trees is only in its infancy.

In my opinion, Tony is correct in his assessment of the impact of DNA testing on genealogy. It is certain that DNA tests can assist in reinforcing accurate research, but it is also true that absent accurate research, DNA testing is far more limited than it is portrayed in the advertisements and claims made by those who benefit monetarily from those same tests.

Right now, one of the greatest challenges is to enhance the ability of good data to be transferred between family trees and restrict the transmission of garbage. In effect, except for their visibility, I cannot really see any way to improve the myriad of individual family trees online. In fact, I uniformly ignore them. But were there efficient ways to capture well-developed information, it would help to supplement extensive personal research.

One point made by the blog post that I most heartily agree with is the need for a ranking function for all of the family trees. I have advocated for this for many years. It could be stars or numbers or whatever, but all of the entries in all of the online family trees need some sort of ranking that indicates the degree of accuracy. Some websites do rank their opinion of the usefulness of data sources, but what is needed is a ranking system for individual entries. We have online ranking systems for almost everything from soup to nuts, so why not extend that to genealogical entries? For example, if we had something like the following:

Ranked this entry for
Completeness [up to five stars with one star being low]
Reliability [up to five stars with one star being low]
Accuracy [up to five stars with one star being low]
etc.

This would go a long way towards making family trees more reliable and at least give us a consensus of the reliability and completeness of the entries. We could also add a "review" such as "Tell us why you marked this entry with one star" etc. If we can use this system to evaluate purchases online, why not genealogical "purchases" from family trees also?
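For the programmers among us, here is a minimal sketch of how such a ranking could be recorded and averaged into a consensus. It is only an illustration: the names `Rating`, `CRITERIA`, and `consensus` are made up for this example and do not come from any existing genealogy website or program.

```python
from dataclasses import dataclass

# Hypothetical criteria, taken from the star list above.
CRITERIA = ("completeness", "reliability", "accuracy")


@dataclass(frozen=True)
class Rating:
    """One user's star ranking of a single family tree entry (1 = low, 5 = high)."""
    completeness: int
    reliability: int
    accuracy: int
    review: str = ""  # optional "tell us why" text

    def __post_init__(self):
        # Reject anything outside the one-to-five star range.
        for name in CRITERIA:
            stars = getattr(self, name)
            if not 1 <= stars <= 5:
                raise ValueError(f"{name} must be 1 to 5 stars, got {stars}")


def consensus(ratings):
    """Average each criterion across all ratings, rounded to one decimal place."""
    if not ratings:
        return {}
    return {
        name: round(sum(getattr(r, name) for r in ratings) / len(ratings), 1)
        for name in CRITERIA
    }


ratings = [
    Rating(completeness=5, reliability=4, accuracy=4),
    Rating(completeness=3, reliability=2, accuracy=4, review="No sources cited."),
]
print(consensus(ratings))
```

A simple average is the most basic choice here. A real website would probably want to weigh how many people ranked an entry and whether they cited sources.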

I am sure that the ranking system would discourage some people and hurt some feelings, and some commercial genealogy websites would probably not want their customers to find out that what they had online was junk. This might be a reason why it will never be implemented, but we can all use our own star system of evaluating what is and what is not a good online family tree. Maybe we could extend our star system to blog posts. In that case, I would be able to give Tony five stars and a good review and if I got enough one-star posts myself, I could finally justify quitting and stop compulsively writing almost every day of my life.

The national and international news has had a lot of references lately to "false news." Isn't the issue of copying false or inaccurate family tree information the same issue? Solve false news and you might be able to solve false family trees.

Tony writes as a part of The Family History Information Standards Organisation (FHISO). I have long been in favor of the goals of this organization. Family History, in general, certainly needs some standards and as applied to online and desktop family history websites and programs, the need is even greater. However, given the recent past, it is unlikely that such standards will be adopted anytime soon. Part of the original discussion involved the use of GEDCOM as a standard method for transferring data between genealogy programs, but with the advent of the internet and online trees, that problem has escalated to unimaginable proportions. I currently find little, if any, interest in the issues and problems.

Thanks to Tony for bringing up the issue once again.

5 comments:

  1. James, excellent article. I am a regular follower of your blog. I also read Tony Proctor's article previously. I was especially pleased to see you promoting the idea of ranked assessments for individual genealogical claims. I attended a lecture where Judy G. Russell referred to these in one form as logical qualifiers. I write genealogy applications that make extensive use of these, which I refer to as reliability assessments and where they represent the likelihood that a particular claim is true based upon the referenced sources, and the properties of that source such as its type and its category. I assign reliability assessments automatically to EVERY claim and present them, in family trees, alongside each. Links to sources are also presented. My reliability assessments are defined as: "unsupported", "estimated", "unreliable", "uncertain", "proposed", "reported", "supported", "probable", "certain", "questionable", "proven" and "impossible" (users can change the actual display names). Visitors can use this information to better determine for themselves which claims are well supported by evidence. Using reliability assessments allows genealogists to also include negative evidence ("impossible") and multiple single-occurrence events, such as birth, death, etc. without having to draw conclusions. Conclusions can be left up to the visitor. Reliability assessments force genealogies to become evidence based. Claims without sources specifically state that they are "unsupported". In order for reliability assessments to be the most accurate, users do need to configure source types and/or categories. GEDCOM does not include this support out of the box, however I allow users to configure these via a configuration file so that they do not need to modify their GEDCOM file. 
I have been able to offer dozens of other new capabilities as well by allowing users to extend their GEDCOM file via configuration, including privacy flags, evidence models, GPS categories, defining DNA relatives, text replacement, advanced image handling, integrated blogging and embedded referencing to name just a few. When genealogists use applications (like mine) that promote sound genealogical evidence-based paradigms, and when more developers begin to add these features into their applications, I don't think there is any reason to think that the future of online family trees is in jeopardy.

    Tim Forsythe
    http://gigatrees.com

    1. What a great comment. Thanks for letting me know about your work. I will check out your website. If you come to RootsTech please look me up. I will most likely be either at the media hub or at The Family History Guide booth. Keep in touch.

  2. With due respect, your statement, Dna is "far more limited than portrayed", is an indication you do not, and have not, done any segment matching, have never found an NPE, know little about the science of dna, and do not appreciate its advent in law enforcement. JMHO

    1. Dear Anonymous, DNA testing requires all of the procedures you outline and a well documented family tree to resolve genealogical problems such as end of line issues. Very few of those companies selling DNA kits instruct their purchasers about these extra procedures that are necessary. My comment is directed at the companies that sell DNA kits. There are some companies that are making significant progress in this regard, such as MyHeritage.

  3. Enjoyed both articles about the future of online trees. I would love to see a rating system. Also want to give a shout out to Gigatrees. It really is quite a good program.
