
Friday, February 1, 2013

The Invisible Content

DearMyrtle, in a valuable blog post entitled Feedback: FamilySearch's Potential, brought up an interesting point about FamilySearch.org that applies to many other websites: indexes do not contain all of the genealogical information. An index is an extracted document. That means someone has to look at the original and transcribe what they see. If you have ever used indexes to look for a family in the U.S. Census that seemed to disappear in one or more census years, you probably ran into either the enumerator's error or the indexer's error. But the problem pointed out by DearMyrtle goes beyond that issue. It addresses the way those who provide the digitized documents give the impression that the only option available is to search the indexes. They fail to mention that there may be original documents that are not included in indexes and cannot be searched by a search engine. It should be obvious that the content of digitized documents cannot be searched unless they are indexed.

In the case of huge online genealogical databases such as those maintained by Ancestry.com, FamilySearch.org and many others, there is no mention in the search mechanisms on the websites, as DearMyrtle points out with FamilySearch.org, that the information you are seeking may have been indexed incorrectly, or may not be in the index at all, and yet still be perfectly available in the original sources. In the case of FamilySearch.org, only a small percentage of the records that have already been digitized are indexed. In the case of Ancestry.com, nearly all the records have been indexed. In fact, a high percentage of the records on Ancestry.com consist solely of indexes without links to images of the original records.

Now there is another issue to add to the one raised by DearMyrtle. The database search engines may not be able to find the records in the database even if you give the correct parameters. So, in addition to not pointing users to information that is not in the online indexes, the databases do not tell users that their own search engines may fail to produce the information even when it is in their files. In other cases, the search engines may produce dozens, hundreds or even thousands of responses, none of which refer to your ancestor. I refer to this as producing false positives.

I could give many examples from my own research in the U.S. Census records alone, where the images are all available online. In the case of records where only the index has made it to the online database and there is no way of consulting the original record's image, you and I may never know whether our information is simply invisible online but actually still in the record. Here is a simple example: search in a large database, such as Ancestry.com, for someone you already know has multiple records and look at the results.

When you do this, you are going to see two different issues come into play. The first is the transcription of the indexes. The second is the way the search engine used by the database either works or doesn't work.

I used my great-grandfather, Henry Martin Tanner, as a test case. I currently have eight sources attached to his entry on my Ancestry.com family tree, so I know that he is in their records. When I do a search for Henry Martin Tanner, born in 1852 in California, I get over 1,400 results. Out of the first twenty entries that come up with this relatively simple search, only one is actually my great-grandfather. The message this gives to someone who does not know the system is that the records are not there. However, there is nothing in Ancestry.com to tell the researcher that false positives and a lack of response to the search are "normal," and that the answer is to keep searching with different combinations of terms or to go look at the original records that are available in digital copies. For example, search the U.S. Census page images for the town where the ancestor lived, page by page. My rule is to always assume the record is available and to keep looking, both with combinations of searches and in the original records in digital form.
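
One way to make "different combinations of terms" a little more systematic is to list the combinations ahead of time and work through them one by one. What follows is only a rough sketch in Python under my own assumptions. It is not a feature of Ancestry.com or FamilySearch.org, and it does not contact any website. The field names (given_name, surname, birth_year, birth_place) and the function search_variants are made up for illustration. It simply lists every way of combining the facts you already know, starting with the most specific search and loosening from there.

```python
from itertools import combinations


def search_variants(fields, min_fields=2):
    """Return every combination of the known facts, most specific first.

    `fields` maps a hypothetical search-form field name to the value you
    would type into it. Nothing here talks to a real database.
    """
    keys = list(fields)
    variants = []
    # Start with all of the facts, then drop one at a time.
    for size in range(len(keys), min_fields - 1, -1):
        for subset in combinations(keys, size):
            variants.append({key: fields[key] for key in subset})
    return variants


# The facts used in the test search described above.
known = {
    "given_name": "Henry Martin",
    "surname": "Tanner",
    "birth_year": 1852,
    "birth_place": "California",
}

variants = search_variants(known)
for number, variant in enumerate(variants, start=1):
    print(number, variant)
```

Each printed line is one search to try by hand. If none of them turns up the person, that is the cue to go to the page images of the original records rather than to conclude the record does not exist.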

I see DearMyrtle's point as raising a fundamental failure of the online databases to provide a pathway for less experienced users to find their family records. Not everything is available at the "push of a button."

1 comment:

  1. You & DearMyrtle are making wonderful points that lead to frustrating searches.
