Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, August 19, 2012

Is Merging Better than Combining? Comments on FamilySearch Family Tree

A basic concept in the New.FamilySearch.org (NFS) program is the preservation of every iteration of the data about every individual. The original data for the program came from a variety of sources with duplicate information. To try to contain the duplicates, the NFS program has a method of combining duplicate information into a sort of super-individual package. Even after the duplicates are combined, you can see all of the variations in the content, and there is a way to "un-combine" individuals and restore them to their previous uncombined state.

The result was that many individuals recorded in NFS had hundreds of sometimes contradictory components. Unfortunately, there was no practical way to mark the "correct" information. Blatantly incorrect information was displayed on an equal footing with well-documented correct information. The reason this occurred was likely a fear that unless all of the information was preserved, possibly correct information might be lost. In addition, nothing could be changed. The only way to "correct" bad information was to add more.

The practical result of having all of the information visible and impossible to correct was that some ancestral lines were entirely unusable. There was no way, practical or otherwise, to select the correct information from the huge information cloud and follow the most accurate pedigree line. As time went on, the information in the file grew even more muddled and complex. Families with huge numbers of duplicate individuals essentially could not use the program at all. At the same time, new users of the program, whose families had little or no information in the program, had a very positive experience. Because they had no duplicates, the program worked for them and did what it was designed to do.

At some point, FamilySearch rejected the concept of preserving all of the data, contradictory or otherwise, and began developing the FamilySearch.org Family Tree program. Family Tree uses the concept of merging information rather than combining it. In a merge, conflicting information is essentially hidden from the view of the user. The result is that the only information visible in the program is the consensus "correct" information. To compare the two: if two separately recorded individuals who are really the same person are combined, as in NFS, all of the information from both records is available and visible for the now-combined individual. If the same information is merged, as in Family Tree, the rejected information disappears from view and the only information visible is that selected as correct. In reality, the merged information does not go away; it is merely hidden and can be restored at any time.
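For readers who think in terms of data structures, the difference between the two approaches can be sketched in a few lines of Python. This is only my own illustration of the idea described above, not how either FamilySearch program is actually built; the names `PersonRecord`, `MergedPerson`, `combine`, `merge` and `restore`, and the sample facts, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PersonRecord:
    """One submitted record for an individual (hypothetical structure)."""
    record_id: str
    facts: dict  # fact name -> value, e.g. {"birth": "1820"}


def combine(records):
    """Combine-style view: every distinct value from every record stays visible."""
    visible = {}
    for record in records:
        for name, value in record.facts.items():
            values = visible.setdefault(name, [])
            if value not in values:
                values.append(value)
    return visible


@dataclass
class MergedPerson:
    """Merge-style view: one visible value per fact; rejected values are hidden."""
    visible: dict
    hidden: dict = field(default_factory=dict)

    def restore(self, name, value):
        """Swap a hidden value back into view; the old visible value is hidden."""
        self.hidden[name].remove(value)
        self.hidden[name].append(self.visible[name])
        self.visible[name] = value


def merge(surviving, duplicate, replace=()):
    """Merge `duplicate` into `surviving`.

    Facts named in `replace` take the duplicate's value; any other
    conflicting value is kept out of view but not deleted.
    """
    visible = dict(surviving.facts)
    hidden = {}
    for name, value in duplicate.facts.items():
        if name not in visible:
            visible[name] = value  # new information is simply added
        elif value == visible[name]:
            continue  # both records agree, nothing to decide
        elif name in replace:
            hidden.setdefault(name, []).append(visible[name])
            visible[name] = value
        else:
            hidden.setdefault(name, []).append(value)
    return MergedPerson(visible, hidden)


a = PersonRecord("A", {"name": "John Smith", "birth": "1820"})
b = PersonRecord("B", {"name": "John Smith", "birth": "1822", "death": "1890"})

combined = combine([a, b])  # both birth years remain on display
merged = merge(a, b)        # only "1820" is shown; "1822" is hidden
```

In the combined view, the user still has to guess which birth year is right. In the merged view only one shows, and calling `merged.restore("birth", "1822")` would bring the hidden value back if it later proved to be the correct one.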

Arguments can be made for both methods of preserving the information. But with merging, as is supposed to eventually be the case with Family Tree, there is an inherent method for determining the most correct information and substantiating it through supporting sources. That means that, hopefully, bad information will not continue to be perpetuated into the future and a more accurate pedigree can be established for each individual.

2 comments:

  1. James, please see a thread in the getsatisfaction.com forum: http://tinyurl.com/8w3mqrj

  2. Just spent an hour going through the Person Merge process with a FamilySearch guy. I recommended that they have an "Add" button to add new information, a "Replace" button to replace consensus info, and a "Reject" button to keep the consensus info. I also recommended keeping the "Reason to Merge" form and the "Confirm Merge" button at the end of the process. The screens were pretty logical, with the "different" information on Person 2 in a box.

    Also complained about sources not being tied to Facts - in the ideal case the Fact is linked to the source. If not that, I suggested some sort of indicator that a source is provided in the source list.

    Interesting hour...it helped that I had done some work in NFS and FSFT and using RM5 already!
