Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, October 22, 2013

Dealing with the Deluge of Duplicates

During the past week, I have spent some very complicated hours dealing with duplicate individuals in various programs online. I have mentioned before that some of my ancestors have literally hundreds of copies of their basic data in dozens of programs on the Web. Some of the online programs, such as MyHeritage.com and Ancestry.com, consider these "duplicates" to be an opportunity to share and collaborate with the other contributors who have similar information about the same individual ancestors in their own family trees. But in many cases, the information is similar enough to make a connection but differs from my own data in significant ways.

One reaction to this flood of duplicates is to simply ignore it. I must admit that this is what I have done in most cases. Right now, I have a huge pool of Smart Matches pending on MyHeritage.com. I have another pool of matches on Ancestry.com. I have notifications on Geni.com that I have over 700 matches, and the list of programs with multiple copies of my ancestors goes on and on. It would be interesting and perhaps productive to follow up on all these leads, but there is not enough time to do this.

From another direction, I helped patrons resolve issues with merging duplicates in FamilySearch.org's Family Tree program. Some of these were extremely complicated to resolve and took quite a bit of explanation to the patrons, some of whom, I am sure, did not understand at all what I was trying to resolve.

Then there is the problem of duplicates in my own databases on my own programs on my own computer. For example, I have 10 individuals named "Sarah" with no further identification. Are any of these duplicates? I cannot tell because of the lack of specific information. I have eighteen individual ancestors identified only with the name "Mary." One of the Mary entries dates back to 1592. Yes, I am guilty of having the same kind of entries in my genealogy that I use all the time as bad examples.
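For readers who can export their data and script against it, here is a minimal sketch of why name-only entries like these cannot be sorted out automatically. It is not how any particular genealogy program works. The `Person` record, the `group_by_name` and `classify_pair` helpers, the two-year tolerance, and all of the sample records are assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical minimal record; real programs store far more detail."""
    record_id: str
    given_name: str
    surname: Optional[str] = None
    birth_year: Optional[int] = None


def group_by_name(people):
    """Return only the groups of records that share a normalized name."""
    groups = defaultdict(list)
    for p in people:
        key = (p.given_name.strip().lower(), (p.surname or "").strip().lower())
        groups[key].append(p)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


def classify_pair(a, b, tolerance=2):
    """Label two same-name records 'likely', 'unlikely', or 'undecidable'.

    The tolerance (in years) is an arbitrary assumption, not a standard.
    """
    fields = (a.surname, b.surname, a.birth_year, b.birth_year)
    if any(value is None for value in fields):
        # Without a surname and a date there is nothing to compare.
        return "undecidable"
    if abs(a.birth_year - b.birth_year) <= tolerance:
        return "likely"
    return "unlikely"


# Invented sample data, loosely modeled on the entries described above.
people = [
    Person("I1", "Sarah"),
    Person("I2", "Sarah"),
    Person("I3", "Mary", birth_year=1592),
    Person("I4", "Mary"),
    Person("I5", "John", "Smith", 1778),
    Person("I6", "John", "Smith", 1779),
]

for name, records in group_by_name(people).items():
    first, second = records[0], records[1]
    print(name, classify_pair(first, second))
```

In this sketch the "Sarah" and "Mary" pairs come back as undecidable, which matches my experience: until more sources are attached to each individual, neither a program nor a person can say whether they are duplicates.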

The genealogical database programs all seem to have a provision to merge duplicates. But in almost every program, merging the duplicates does not eliminate all of them. Some duplicates seem to resist merger, even when the merge is specified and done manually.

I realize that this is not a new problem. As I went through the Family Group Records generated by my own family members in the past, I found dozens, if not hundreds, of duplicate records. It makes me envy those who live in a genealogical wilderness where there has been no research and they have an open field to discover and cultivate.

My current solution, aside from ignoring all the duplicates, is to focus on my own database and build a strategic base of operations with multiple sources for each individual. Although this is a slow and somewhat tedious process, it promises to resolve the duplicate issues in the end. If there ever is an end, which I doubt. As to the issue of conflicting data, that is the primary reason for the effort to establish a huge base of sources. If my research is sound and supportable, then the multiplicity of conflicting information will have no effect on my own information and I can proceed in extending the documentation as far back as I can go before I pass on to my ultimate reward. I have concluded that this is all I can do. Of course, I would like to connect with those in my family who have a similar goal, but except for my daughter, I have yet to find very many, if any, who share my goal.

2 comments:

  1. I had to smile at "It makes me envy those who live in a genealogical wilderness where there has been no research and they have an open field to discover and cultivate" - a case of, "Be careful what you wish for!"

    I find merging in FS FT to be tricky. The issue is not merging individuals, as such, but "merging" their family details. A large percentage of the duplicates I've tried to resolve so far have been individuals automatically generated from baptisms in parish registers. I find, for example, baptism of child C to parents M and F, followed a couple of years later by baptism of child D to parents M2 and F2, where M2 is a similar name to M and F2 is a similar name to F.

    Let's say my previous research demonstrates these are the same couple. I can use FS FT to suggest duplicates and merge the details for F2 into F. As part of the same process, child D is moved over to have a father of F. The question is - what do you do with the mother, M2 during the merge of father F and F2?

    I have, for some time, always moved mother M2 over into father F's details as part of the process of merging F and F2. I did this because I feared that leaving M2 where she was, would delete her relationships and result in an M2 with absolutely no data other than her name.

    In just trying that option of leaving M2 where she is, I have discovered that, despite the merge screen appearing to "orphan" her from her spouse and children, she does actually go over to become "married" to F, with her child D. I have no idea whether this has always happened or not.

    BUT - and I've only tried it the once - if I then try to find duplicates for M, it will NOT suggest M2, and I have to merge the two by record-key.

    Whereas if I explicitly send M2 to be a spouse for F, then when I try to find duplicates for M, it DOES suggest M2 as a duplicate. (And this is my usual way of working.)

    I can't quite visualise what's going on. Which worries me because if the recipe goes wrong, I won't have any idea what to do - apart from everything in turn.

    Replies
    1. I know exactly what you mean. I had the same experience two times this past week with similar merging issues. In both cases the patrons were so confused at the end that I could not explain what I had done.
