Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, May 12, 2016

What would it take to speed up the genealogical workflow?

If you think about it, genealogists are in the business of producing pedigree charts and family group records. So you can analyze the process of producing their "product," those forms filled with names, dates, places and supporting documentation, as discrete steps in a manufacturing process. The major elements of that process are as follows.
  1. Exploring raw resources: identifying, locating and perusing historical documents.
  2. Processing the raw material, documents of all kinds, to extract the valuable information.
  3. Evaluating and organizing the developed information into a coherent product by entering it into a standardized format.
  4. Distributing the final product, pedigrees and family group records, to the intended consumers, i.e. family members and relatives.
Over the years, the genealogy industry has developed a complex system for mining the raw material in the form of documents and records. Several large suppliers have emerged in the industry by gathering, without filtering, huge warehouses of unprocessed documents and records and then packaging the raw material in a convenient digital format, leaving the genealogists to select their particular raw material from these vast warehouses. As the industry has developed, genealogists have partially adapted to new, streamlined production methods. One major development is that the raw historical documents are being delivered directly to the genealogists and, in some cases, those historical documents are being partially processed by the suppliers and called record hints. This pre-processing activity has resulted in a bulk surplus of such documents, usually containing a huge amount of unwanted and unneeded material. The packaging and large quantities of the products make it difficult for genealogists to determine the value of the material supplied.

Computerization of the genealogical process has made the system more efficient, but at the cost of flooding the market with mass-produced, low-quality, consumer-produced knock-offs of the custom-designed originals, in the form of online family trees.

Not surprisingly, the genealogist needs to examine his or her own personal workflow to maximize the benefit from the increased availability of historical records while minimizing the detrimental effects of being buried in mountains of data.

Where are the bottlenecks in this supply system? We are being practically buried in record hints from the big four genealogy companies, but right now, we have no practical way to move the data between the companies. From the standpoint of an individual genealogist, there is no practical way to take advantage of all that pre-processing; everything still has to be done one name or field at a time. In addition, the online noise level has risen to the point where it is actually easier to focus on a traditional microfilm than to try to ferret out your ancestral line from all of the alternatives offered online. For example, one of my third great-grandfathers was born in England, moved to Australia and died there. Notwithstanding the fact that I have his death information in Australia in my family trees, I am continually getting record hints for people with the same name who died in England. These false positives are "noise." I have to filter out this noise to see if there is any valuable information being offered.
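To make the idea of filtering out that noise concrete, here is a minimal sketch in Python. It assumes a person in a tree and a record hint can each be reduced to a name and a country of death. The `Person` and `RecordHint` types and the `filter_hints` function are hypothetical illustrations and do not correspond to any genealogy company's actual software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """A person in my tree (hypothetical, simplified record)."""
    name: str
    death_country: Optional[str] = None  # None means not yet known


@dataclass
class RecordHint:
    """A record hint offered by a supplier (hypothetical, simplified)."""
    name: str
    death_country: Optional[str] = None


def is_noise(person: Person, hint: RecordHint) -> bool:
    """Return True when the hint contradicts a death place already recorded."""
    if person.death_country is None or hint.death_country is None:
        return False  # nothing to contradict, so keep the hint for review
    return person.death_country.casefold() != hint.death_country.casefold()


def filter_hints(person: Person, hints: list[RecordHint]) -> list[RecordHint]:
    """Keep only the hints that do not conflict with the known death country."""
    return [h for h in hints if not is_noise(person, h)]
```

With a person recorded as having died in Australia, a same-name hint for a death in England would be dropped, while a hint with no death place would still be kept for a human to review.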

This is just one example of the bottlenecks that limit genealogical productivity. Another major bottleneck is attaching media files, i.e. photos, documents, etc., to an individual. In every one of the big four, this must be done one item at a time. No matter how many documents you have about the same person, each one has to be handled separately. This also means that a document naming more than one individual has to be attached separately to each person.

Neither of these issues comes close to the noise level from all the online family trees. There really has to be a way to rate the confidence level of individuals in these trees. I can do that when I go to Amazon and see competing products, so why can't we give one star to those poorly substantiated individuals showing up as our supposed ancestors?
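As a rough illustration of what such a rating could look like, the sketch below assigns one to five stars based on how many of an individual's recorded facts cite at least one source. The scoring rule and the `star_rating` function are assumptions made up for this example, not a scheme used by any existing family tree program.

```python
def star_rating(facts: dict[str, int]) -> int:
    """Map the share of sourced facts to a rating from 1 to 5 stars.

    `facts` maps a fact name (for example "birth") to the number of
    sources cited for it. This scoring rule is a hypothetical example.
    """
    if not facts:
        return 1  # nothing recorded, so nothing is substantiated
    sourced = sum(1 for count in facts.values() if count > 0)
    # Integer arithmetic: 0 sourced facts gives 1 star, all sourced gives 5.
    return 1 + (4 * sourced) // len(facts)
```

An individual with no sourced facts would get one star, and one whose every fact is sourced would get five. A real rating would also need to weigh the quality of each source, which this sketch ignores.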

Perhaps you have some of your own ideas about the bottlenecks in the present genealogy system.


  1. This is one of the major reasons I stick just with Family Tree. I don't have enough time or brain power to have my pedigree in several places. I know I may be missing out on some unique sources, but then maybe my children or grandchildren will have something to add after I reap all the hints, etc. from FT. I'll let them focus on "jumping the puddle". And by then, there will probably be so many more wonderful tech enhancements that great things will happen. Maybe even the Lord will come down and just extend his finger and fix all the errors. And angels will come down with records no one else has for the parts of the pedigree that seem impenetrable. (That's been prophesied.)

  2. Who would rate the confidence level of these individuals for their submitted trees? I feel that until the developers really focus on source-centric software, apps or anything genealogical online, we are not going to get very far no matter what rating is given. I will only feel confident when I do my own due diligence on my tree members. I deal with all the noise by focusing on my own research, doing my own analysis on the source before me, and attaching hints to my To-Do list (so I don't follow BSOs down rabbit holes so much). Yes, I know what you mean; it does get tiresome reviewing hints. There are some developers, such as MyHeritage, that put ratings or confidence levels on their source hints, which can help, but the challenge will still be there to question whether or not to look at a hint anyway even if the rating is low.