All four of the big online genealogy databases (Ancestry.com, FamilySearch.org, MyHeritage.com and Findmypast.com), or five if we include Geneanet.com, provide algorithmic, programmed record hints. However, most genealogists, especially those with advanced technical savvy, rely more upon manual searches than they do upon the automated ones. Of course, in order to use the record hints, the genealogist must have a substantial family tree in each of the programs. In my experience, even genealogists with considerable research experience balk at the idea of maintaining four or five separate databases, and that is before counting any separate, off-line database the researcher may keep. Since the record hinting software uses the information contained in a family tree, those without family trees in each program do not benefit from the advanced technology.
What is interesting to me is that even if researchers have their family trees in one or more of the programs, they still focus on manual searching, even when those searches are not overly productive. My question is this: what makes researchers think they can outperform the automated record hints with manual searches on the same program? I commonly observe researchers manually searching one of the large database programs while they have a long list of record hints waiting to be processed.
Before getting further into this issue, it seems to me to be a good idea to examine the limitations of both manual and automated searches. I will start with the limitations they have in common, some of which are pretty obvious.
Limitations on both manual and automated searches:
- All database searches are limited to the content of the records in the database. This may seem elementary, but how many times do you check to make sure that the website has the type of record you are looking for before you make your search? Of course, if the records are not in the database, no automated process will make them appear. This could also be a controlling factor in choosing whether to put a family tree on a website.
- The accuracy of the researcher's initial assumptions about the identity of family members. If you input the wrong parents, the program will not know this and will look for what it is told to look for.
- The assumptions made by the algorithms programmed into the automated searches and the manual searches.
- The detail contained in the record sources. If the records contain little data about the target individuals, the searches will produce more false positives.
- The accuracy of the indexing done to the original records.
Limitations on manual searches:
- A manual search is usually limited to the information known and input by the researcher.
- The searcher's ability to formulate search terms. This is one of the primary issues.
- The time spent by the searcher.
- The number of false positives returned by the search. Many searchers give up when the search engine begins producing long lists of obviously inappropriate responses.
Limitations on automated searches:
- The program must rely on the accuracy of the information the user enters into the target family tree.
- The depth of the information considered by the searches. This is one of the main factors limiting the first iterations of automated searches. Innovations in the past few years have increased the search depth and thereby increased the accuracy of the searches.
- The number of records included in the automated record hint program.
- The input from the user in the form of accepting or validating the record hints provided. In some cases, the user declines a valid record hint, which then inhibits the ability of the program to provide future hints.
- Inability to distinguish between very close, but wrong, matches.
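To make a couple of these limitations more concrete, here is a toy sketch in Python of how a hinting program might compare a person in a family tree against an indexed record. I have no knowledge of how any of the websites actually implement their record hints, so everything here is an assumption made for illustration: the `Person` fields, the `match_score` and `is_hint` functions, and the `min_agree` threshold are all hypothetical. The point is only to show why records with little detail tend to produce false positives, and why comparing more fields (more search depth) helps weed them out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Person:
    """A hypothetical, simplified person entry (tree person or indexed record)."""
    name: str
    birth_year: Optional[int] = None
    birthplace: Optional[str] = None
    father: Optional[str] = None


# The fields this toy matcher is willing to compare.
FIELDS = ("name", "birth_year", "birthplace", "father")


def match_score(tree_person: Person, record: Person) -> Tuple[int, int]:
    """Return (agreeing fields, comparable fields).

    A field is comparable only when both the tree and the record
    actually contain a value for it.
    """
    agree = 0
    comparable = 0
    for field in FIELDS:
        tree_value = getattr(tree_person, field)
        record_value = getattr(record, field)
        if tree_value is None or record_value is None:
            continue  # nothing to compare, so this field proves nothing
        comparable += 1
        if tree_value == record_value:
            agree += 1
    return agree, comparable


def is_hint(tree_person: Person, record: Person, min_agree: int = 1) -> bool:
    """Offer a hint when every comparable field agrees and at least
    `min_agree` fields were available to compare."""
    agree, comparable = match_score(tree_person, record)
    return comparable > 0 and agree == comparable and agree >= min_agree


# Made-up example data, not taken from any real record.
tree_person = Person("John Smith", 1850, "Ohio", "William Smith")

# A sparse record: only a name was indexed.
sparse_record = Person("John Smith")

# A detailed record for a different John Smith born the same year.
detailed_wrong_record = Person("John Smith", 1850, "Kentucky", "Thomas Smith")

print(is_hint(tree_person, sparse_record))               # name alone is enough here
print(is_hint(tree_person, detailed_wrong_record))       # extra detail exposes the mismatch
print(is_hint(tree_person, sparse_record, min_agree=3))  # a stricter threshold drops the sparse hint
```

In this sketch the sparse record is offered as a hint because nothing in it contradicts the tree, even though a name alone identifies almost no one. The detailed record for the wrong John Smith is rejected because the birthplace and father disagree. The same reasoning applies to a manual search: the fewer details you or the record can supply, the longer the list of inappropriate results.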
I have written down some of the limitations, but what about the advantages? From my own perspective, I find the record hints to be highly useful in accelerating my ability to fill in routine-type research. I regret all the hours I spent searching through U.S. Census records, for example. Now, I get Census input from almost all of the programs. I think it is an amazing development, and I am grateful for all the work that has gone into developing such a useful technology. I think that any researcher who doesn't take advantage of the technology is basically wasting their time.