Saturday, May 30, 2015

Comments on Becoming an Excellent Genealogist -- Chapter Sixteen

This is an ongoing series of chapter-by-chapter comments on the book:

Meyerink, Kory L., Tristan Tolman, and Linda K. Gulbrandsen. Becoming an Excellent Genealogist: Essays on Professional Research Skills. [Salt Lake City, Utah]: ICAPGen, 2012.

I am now commenting on Chapter 16: "Getting the Most from Electronic Indexes" by Suzanne Russo Adams, MA, AG.

I fully realize that I have stretched this commentary over an extended period of time and that it has been a while since my last post on the subject. However, I intend to finish the entire project. Chapter 16 is one of the few chapters that I entirely agree with. It outlines some of the concerns that all genealogists face when using online websites, particularly the indexes created by those websites. This is not a commentary on the accuracy of the FamilySearch Indexing program or any other program involving the complex task of indexing genealogical records, but it is a cautionary commentary on the real limitations of those indexes.

I think the following paragraph from the article gives an overall view of the issues involved.
A history of the record sets you might find on the web, such as county or state vital record indexes and indexes to the census, might require a little genealogical sleuthing about where the records came from and how they were created, in order to better understand and utilize them.
I would suggest that more than a little genealogical investigation into the origin and content of the index is absolutely necessary. One example from FamilySearch is sufficient to illustrate this problem. If you examine the Historical Record Collections on FamilySearch carefully, you will find that in some cases there is an extreme discrepancy between the number of records reported and the number of records digitized. Although it is not spelled out directly, the record count given for each collection apparently reflects only the number of indexed records and not the total number of records in the collection. For example, a recently added collection is entitled "Czech Republic Church Books, 1552-1963." This collection is listed with 86,069 records. However, further investigation shows that it in fact contains 4,668,489 images. Obviously, the records actually indexed are only a small percentage of the total number of records in the collection.
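To put a rough number on that discrepancy, here is a minimal sketch of the arithmetic using the two figures quoted above. The function name `indexed_ratio` is my own, purely for illustration. Keep in mind that a "record" and an "image" are not the same unit, since a single image of a church book page may contain many entries, so the result is only a loose indication of coverage and the true indexed share of entries may well be smaller.

```python
def indexed_ratio(indexed_records: int, total_images: int) -> float:
    """Return indexed records as a percentage of the image count.

    This compares two different units (records vs. images), so it is
    only a rough indicator, not a true measure of index completeness.
    """
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    return 100.0 * indexed_records / total_images


# Figures quoted above for "Czech Republic Church Books, 1552-1963"
czech_ratio = indexed_ratio(86_069, 4_668_489)
print(f"{czech_ratio:.2f}%")  # prints "1.84%"
```

In other words, there are fewer than two indexed records for every hundred images in the collection.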

Using this example, any search of this collection on FamilySearch will, in effect, cover only a very small portion of the total number of records available. Until the records are completely indexed, the only safe way to search this collection is to treat the records as if they were still on microfilm, which is essentially what the digitized images are. In other words, the researcher must examine the records individually, image by image. Fortunately, because the microfilm has been digitized, the images are readily available to browse online for free. Those who are unaware of the situation and assume that the record count shown means the whole collection is searchable will likely miss information. Unfortunately, FamilySearch does not tell the researcher in any obvious fashion which collections are only partially indexed.

This and many other problems and challenges of using indexes are carefully outlined in the chapter under consideration. Another quote clearly shows that the author understands the limitations of the indexing process.
Due to continuing technological advances, it is becoming increasingly easier to connect a scan of an original source document to an index. When viewing indexes online, it is important to learn what information is contained on the original document and the "rules" for collecting that information. For example, enumerators of the U.S. federal censuses were instructed to record the state and/or country of birth for an individual. Thus, researchers should not expect to find the city of birth of an individual listed on an original census record or on an electronic federal census index. Moreover, researchers must understand which fields of information were indexed on a specific document in order to effectively use the index.
The article continues with a summary of the various areas where the index may be limited due to the selection of the fields indexed and the information contained in those fields.

The article also contains a short comment on the limitations of OCR transcription. Although optical character recognition (OCR) has been under development for a considerable period of time, it still has some obvious limitations. Some of those limitations are pointed out by the author, including its dependence on the quality of the original image. One significant limitation of OCR for genealogists is that handwritten documents cannot yet be electronically transcribed except in very limited situations.

This chapter is an excellent example of the value of the book as a whole.

