Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, July 12, 2022

Can we preserve the integrity of the FamilySearch.org Family Tree?

 

The FamilySearch.org Family Tree is a marvelous resource. It has the potential to become a standard universal wiki-based family tree. However, there are serious issues with any large wiki-based project that must be addressed to maintain informational integrity. There is a precarious balance between open participation and accuracy. Most large, online cooperative websites, such as Wikipedia, are source-centric. In addition, Wikipedia actively notifies users when information is unsupported by sources or is incomplete. Here is a quote from the article, Wikipedia: Quality Control:

Quality control is essential to Wikipedia. To maintain articles of acceptable quality, it is necessary to improve the quality of existing material, and remove material of irreparably poor quality.

The article further points out the following:

But mistakes sometimes occur. These, and the damage done by the bad apples mentioned above, need continuous attention. The three ways that Wikipedia maintains its quality control is as follows: (a) A great deal of Wikipedia's volunteers' effort is applied to quality control. Wikipedia has an elaborate disciplinary system for handling vandals and other troublemakers, and a dedicated force of system administrators to enforce the Wikipedia community's decisions and policies – admins even have the power to block a bad apple permanently. (b) Once material is added to Wikipedia, an army of volunteers organized under various departments check and recheck it to make sure it conforms to the high standards set forth in Wikipedia's policies and guidelines (which were established specifically with the creation of quality articles in mind). There are departments for everything from typos to factual errors. For a list, see Wikipedia:Maintenance. (c) And Wikipedia even has robots, automated users that monitor for errors and correct them automatically. For example, these days most vandalism is fixed by Wikipedia's robots, or our content editors, who are watching your every move. Be careful. 

Now let's think about the FamilySearch Family Tree. Which of the integrity preserving activities outlined by Wikipedia have been implemented for the Family Tree? 

Basically, the Family Tree is an unsupervised, open field for any type of change and any level of accuracy imaginable. The only moderating factor is that some users can "follow" entries in the Family Tree and make corrections. No one and no organization monitors the followers. Good, substantiated information about birth, marriage, and death events is frequently replaced by unsupported and speculative information. For an example, see "My Revolving Door Ancestor: Francis Cooke."

There seems to be a dichotomy in the Family Tree between those who recognize the need to maintain the integrity of the Family Tree and those who are afraid the "average user" will be excluded by any such efforts. For example, it has been proposed many times that some restrictions be imposed on entering unsourced information. This suggestion always draws the response that any restriction on entering data in the Family Tree would discourage people from entering information that they know about their immediate family members. For example, the question arises about entering information derived from oral histories. Some efforts at standardizing information have already been implemented, such as standardizing dates and places, but let me give some hypothetical examples of how a system might work that would differentiate between initial entries and changes.

Let's suppose I were a completely new and unknowledgeable person who wished to begin entering information about my deceased parents and siblings. Let's further suppose that I have only a limited amount of information about my family. The Family Tree should, of course, allow me to enter what I know about my own family. However, why not put that information into a specialized category? These initial entries, where there are no sources, could be shown to be lacking in sources, the same way date and place entries are shown to be non-standard. This would encourage the entry of sources without discouraging entries altogether. Additionally, personal knowledge should be considered to be a source.

Now let's suppose another situation. Let's suppose I have been doing genealogical research for years and have contributed thousands of source-supported entries. The existing system in the Family Tree will list all the pertinent sources supporting a Vitals section entry if someone tries to change any of the entries. An attempt to change existing sourced information indicates how many people are following the changed individual and how many have contributed information to the individual. Why are these entries not treated differently than those made by novice users? Why can anyone, without supplying a source or a reason, make changes to existing entries when sources are listed? For example, if you refer to the entries for Francis Cooke LZ2F-MM7, referenced above, and view the change log, you will see that it contains hundreds, if not thousands, of changes, including adding parents without citing a source to support a child/parent relationship with the person added. Recently, for another example, someone added Richard Cooke (b. before 1530 - d. 3 October 1579), the son of Anthony Cooke (b. 1504 - d. 1576), as the father of Francis Cooke LZ2F-MM7. There was no source cited showing a connection between Richard and Francis. It was also apparent from the entries already in the Family Tree that Francis was born about 1583, after both Richard Cooke and his father are listed as dying. Why does the Family Tree allow such a change to be made and thereby force someone following Francis Cooke to correct the mistake?
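To make the date problem concrete, here is a minimal sketch of the kind of plausibility check that could run before a parent is attached. It is only an illustration under my own assumptions: the `Person` record, its field names, and the `father_link_problems` function are hypothetical and are not part of any FamilySearch system. Richard Cooke's "before 1530" birth is approximated as 1530.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical record; these field names are assumptions, not FamilySearch's.
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def father_link_problems(child: Person, father: Person) -> List[str]:
    """Return the reasons a proposed father/child link looks implausible."""
    problems: List[str] = []
    if child.birth_year is None:
        return problems  # nothing to compare against
    # A father can die some months before a child is born, so allow one year.
    if father.death_year is not None and child.birth_year > father.death_year + 1:
        problems.append(
            f"{child.name} was born about {child.birth_year}, "
            f"after {father.name} died in {father.death_year}"
        )
    # The minimum age of 12 is an arbitrary illustrative cutoff.
    if father.birth_year is not None and child.birth_year - father.birth_year < 12:
        problems.append(f"{father.name} would have been under 12 at the birth")
    return problems


francis = Person("Francis Cooke", birth_year=1583)
richard = Person("Richard Cooke", birth_year=1530, death_year=1579)  # "before 1530"
for problem in father_link_problems(francis, richard):
    print(problem)
```

A check like this would not need to block the change outright. Simply showing the warning and asking for a source before saving would already catch the Richard Cooke example.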

From my experience with the Family Tree, I see thousands of changes being made to people without a modicum of an attempt to provide a source for a parent/child relationship or for any other changes made. This includes changes such as the one for Francis Cooke, where simply looking at the dates would have shown that the change could not be correct. Adding and accurately correcting information should not be considered a change, but in the case of people such as Francis Cooke LZ2F-MM7, no changes should be allowed at all. Now, FamilySearch can and does make some entries Read Only. However, other than protecting a narrow, arbitrary class of people from changes, Read Only entries are parsimoniously distributed. Why not monitor the number of changes made to one individual, such as Francis Cooke, and make him or her Read Only if a certain number of changes are made per week or month? In the last week there have been 24 changes, including corrections, made to Francis Cooke LZ2F-MM7. Why is this acceptable?
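The monitoring idea could be as simple as counting change-log entries inside a time window. The following is a rough sketch only: the function name `should_lock`, the seven-day window, and the threshold of 20 are my own assumptions for illustration, and a real system would also have to separate genuine contributions (such as adding a source) from unsupported changes, which this sketch does not attempt.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_lock(
    change_times: Iterable[datetime],
    now: datetime,
    window: timedelta = timedelta(days=7),  # assumed review window
    max_changes: int = 20,                  # assumed threshold, purely illustrative
) -> bool:
    """Return True when the changes inside the window exceed the threshold."""
    cutoff = now - window
    recent = sum(1 for t in change_times if cutoff <= t <= now)
    return recent > max_changes


now = datetime(2022, 7, 12)
# 24 changes spread over the last week, as in the Francis Cooke example.
changes = [now - timedelta(hours=6 * i) for i in range(24)]
print(should_lock(changes, now))
```

With these assumed settings, 24 changes in a week would mark the person as a candidate for Read Only status, or at least for review by someone at FamilySearch.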

The main effect of allowing wholesale changes to the Family Tree, without discriminating about whether or not they are warranted, is to discourage a significant number of people from using or being involved in the Family Tree at all. It also forces people who could be productively adding substantiated information to spend an inordinate amount of time correcting unnecessary changes. How about having a possible "Verified" status that would prevent unsourced information from being added to those individuals?
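Here is one way the two ideas above, flagging unsourced first entries and a "Verified" status, might fit together. This is a hypothetical sketch: the `Edit` record, the `review_edit` function, and the three outcome labels are assumptions of mine, not anything FamilySearch has implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Edit:
    # Hypothetical record of a proposed entry or change.
    description: str
    # Citations supporting the edit; personal knowledge counts as a source.
    sources: List[str] = field(default_factory=list)


def review_edit(edit: Edit, profile_verified: bool) -> str:
    """Classify an edit as 'accepted', 'flagged', or 'rejected'."""
    if edit.sources:
        return "accepted"
    # Unsourced edits are blocked on Verified profiles; elsewhere they are
    # allowed but marked as lacking sources, like a non-standard date or place.
    return "rejected" if profile_verified else "flagged"


new_user_entry = Edit("Add my mother's birth date", sources=["Personal knowledge"])
unsourced_parent = Edit("Add Richard Cooke as father")
print(review_edit(new_user_entry, profile_verified=False))
print(review_edit(unsourced_parent, profile_verified=False))
print(review_edit(unsourced_parent, profile_verified=True))
```

The point of the design is that a new user is never turned away, while a heavily sourced person is protected from changes that come with no support at all.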

I need to mention that adding a source is not a change. Correcting entries that need correcting is also not a change. How do we know the difference between a change and a contribution? Presently, there are almost no internal systems in the Family Tree that prevent or even slow down the constant and monumental numbers of unsupported changes. If nothing is done to prevent the escalation of these changes, the Family Tree will become almost completely unreliable and far less useful than it could be.

This is only the beginning of my discussion about this subject. Here is a list of some of the previous blog posts about this same subject. 

7 comments:

  1. I think my big issue is that it's a really cumbersome site to use at times, not least due to the huge number of duplicate entries for individuals. My great grandfather is one of several chaps with the same name born in the same place in a short span of time. He married 3 times and is referenced in multiple census, baptism and obviously marriage records, either for himself or his children. The end result is two dozen duplicates, often attached to the families of the other chaps bearing his name. I spent several evenings trying to sort through everything and add sources but found it so cumbersome I couldn't face it again with the others. When everything is already known and sourced it shouldn't be such a huge task and require multiple tabs open everywhere. I'm not an IT expert but I get by. Perhaps I'm using the wrong techniques to edit things, but I find it a very difficult exercise. I've just revisited the site and found dozens more duplicates again after someone mixed up the various family groups.

  2. In short? No.

    Whilst the primary driver for FSFT is name collectors racking up massive numbers of ordinances, it will never be something with any integrity to preserve overall. Since those name collectors are backed up by significant sections of the religious hierarchy which runs the site, they will always be allowed to have free rein, to the detriment of the site.

    I had a message from someone recently who had added a child to a family that I had added to the site. As it happened I added that family as part of collateral efforts to expand coverage and so far as I am aware they are not direct relations. The addition was also correct from a relationship point of view. In other words the son added did appear to be related to the parents.

    However the quality of the addition was otherwise appallingly bad. No sources were attached despite a correct auto-suggested source backing up the christening. Instead of the christening being added as a christening it was conflated as being the same as the birth. A place of birth was assumed with no direct evidence. A marriage was picked out, with some backing, but still of potentially dubious provenance. The marriage did share some witness names with potentially related people, but it was tens of miles away from other events with no suggestion of why the bride and groom might both have gone to that parish and then returned back to the groom's home parish. There was also no effort at all made to do a cursory search which would have revealed multiple census records to actually back up the assumed place of birth. The wife also had Unknown explicitly entered for place and year of birth.

    The person then had the lack of awareness to say, "I have been doing family history for over 30 years so I should know how to do family history." My reply was that yes, they should know how to do it but clearly did not actually know. I also laid out why the entry was of very poor quality and what should have been done when creating it. Afterwards I went and added the source information that should have been there from the start, plus four children of the marriage from census and christening records. I also removed the maiden name of the wife as unsourced and therefore unproven, and got rid of the Unknown year and place of birth, especially since one of the census sources I attached provided year and place of birth.

    Now this person isn't a name collector so far as I am aware, but is emblematic of the problem with a huge number of users of the site: extremely sloppy additions with no proper supporting evidence.

    Poor quality additions like this are one of the lesser problems: at least the familial relationship was correct, if unsourced! If that's one of the lesser problems it shows the mountain to be climbed to actually get proper integrity in place for the system. There is no incentive for the religious hierarchy to actually put such integrity in place, and they demonstrated their own, true colours with their recent deliberate vandalism of the places database when they removed US territories from the database.

    If those in charge of FSFT are willing to vandalise and degrade one of the key support databases for FSFT in order to pander to stupid and/or lazy users of the system then why would they be willing to take steps to actually improve its quality and integrity?

  3. Alan Forbes Fleming, July 12, 2022 at 8:09 PM

    Statement that is easily viewed and accessed by followers. Sources and Citations will need to be included and meet an established standard. Documents need to be tied to the Sources more rigorously. Oral stories and genealogies need to be verified by another witness. MyHeritage and RootsMagic both have a Consistency Checker to identify likely errors and inconsistencies for the contributor to review. These are great features. They highlight possible errors or needed explanations, e.g. why a girl was married at 14 or 15 years old. FS should include this prominent feature as well. Of course, contributor collaboration and groups should be encouraged and enabled. But the key point, and this is my biggest bug, is people connecting someone to the Tree because the Name/Surname is the same, especially if the Name is common for the area and time. There is more than one John Smith, age xx, living in Smallville during 1881. In fact, I have had two John Leishmans born the same year in the same town of Stirling. They were cousins. They both married a Susan, only 2 years apart. Figuring out which one married which Susan caused some frustration. Then came writing a GPS or Reason Statement to convince other contributors which was the right match for our family tree. Teaching and requiring a higher standard of GPS is needed to maintain the integrity of the One Global Tree. I love the website 'The Family History Guide' (TFHG), which is the official training partner for FS, and feel that a FS Accreditation Course, which includes GPS, should be required for all Family History Centre Volunteers who run the FS FHC. Those volunteers should gain a 'Certificate', with Courses run locally or virtually by an Accredited TFHG Trainer in each geographical region (to allow for understanding of local or specific country sources). Accredited TFHG volunteers at the FS FHC would help to ensure patrons are taught the requirements of a GPS and that their Tree complies.
    It would help to raise the standard of the FHC, and all patrons would view it as a place for reliable help. It would raise patronage.
    For LDS members it will also ensure fewer errors in ordinance work. Don't want to seal g-grandad to the wrong wife!! He won't be happy.

  4. Alan Forbes Fleming, July 12, 2022 at 8:09 PM

    This comment has been removed by a blog administrator.

  5. You make some excellent points, as always.

    As a very heavy user of the FamilySearch Family Tree, I think it needs to eventually move to a three-category system of profile management:

    First Category (Locked Profiles): Reserved for the very small number of famous/infamous/controversial figures where a total lock on the profile is the only way to deter the vandals.

    Second Category (Managed Profiles): Reserved for the small percentage of profiles where there are ongoing problems, as evidenced by the chaos in the change logs. Volunteers would have to form groups in order to manage profiles, with at least three per profile. Any changes would need to be approved by at least two managers. This would require work to set up, with potential volunteers having to demonstrate to FS or other volunteers that they have the necessary skills (and access to records) to manage a certain type of profile (e.g., Early New England settlers, or Australian Convicts).

    Third Category (Open Edit Profiles): The vast majority of profiles would remain in this category. There would have to be clear justification for a profile to be moved to category one or two. Frustration over a bad merge or having to detach some sources which have been incorrectly attributed to an individual is definitely not enough. I certainly agree with you that the open edit system will result in an increase in accuracy of the Tree over time. This is the real strength of the Family Tree.

    I work almost exclusively on English and Australian families, and have never encountered a profile that would require active management, but I can certainly understand the frustration of those who have to constantly deal with this problem.

    Thanks, James, for the work you are doing in drawing attention to this important issue.

  6. I wholeheartedly agree. Thank you for your efforts and dedication.

  7. I have worked on my part of the family tree for over fifty years. As changes have taken place with the tree I have worked hard to update the information. The 2012 changes were profound. Recently I have seen people making changes to the tree or adding people who are not really a part of that family. They explain that they are doing so as part of an assignment from an instructor. They are in no way related to the family for whom the changes are being made and based their changes only on information from an indiscriminate source. After discussion they are willing to correct the change that they made. There is no control that would require a review of the validity of the changes being made. I think that you are clearly providing ways that these problems could be remedied. Thank you for this excellent post.
