Tuesday, December 1, 2015

Buying Into the Revolution -- Part Five -- The Challenges and the Future

When I was much younger than I am today, the street in front of our house was a dirt road. Today, I drive on a freeway that has an 80 mph speed limit. The difference is much more than the cement construction of the freeway compared to the dirt construction of the road; it is the immense difference in infrastructure between the two. On the dirt road, we might have half a dozen cars pass in a day. On the freeway, that many cars and more might pass every second. Access to the dirt road was uncontrolled and took place at any convenient location. Access to a freeway is highly regulated and controlled.

The analogy here is to the genealogical community. Many of us are still living on the genealogical equivalent of my dirt road. The difference is that at the time I lived on a dirt road there were no freeways in Arizona. By the way, the first freeway in America is known as the Arroyo Seco Parkway, aka the Pasadena Freeway, between Pasadena and Los Angeles. It opened in 1940 and was extended in 1953. It is now considered a narrow, outdated roadway. Having driven the Pasadena Freeway a number of times, I would have to agree. See Wikipedia: Arroyo Seco Parkway. Back to the analogy: not only are many of us genealogists living on dirt roads, we are doing so without the benefit of indoor plumbing, running water and other modern amenities.

Part of the problem is that the infrastructure available to much of the general online community is sadly lacking in the genealogical segment of that community. Here are the challenges as I see them. This is a summary of some of the things I have already covered in this series, but I am drawing the series to a close and you can think of this list as a closing summary.

1. The fragmentation of the genealogical community and the lack of a consistent way to transfer data.

I like to look at this issue in terms of my freeway example. Freeways are designed to move large numbers of vehicles from one population center to another. Unfortunately, we have no such equivalent in the genealogical community. We have the slow, dirt-road level of information transfer, but we have no common exit points. If I have carefully researched an ancestor and have details, sources, multi-media items, books and other records of his or her life, I am forced by every one of the existing programs to move my information from my own program to their program, one data field at a time. I could create and upload a GEDCOM file, but then I would risk losing the bulk of my information in the transfer process.

2. Source records are also fragmented and scattered across the world. Access to these records is time-consuming and difficult.

Although some parts of the genealogical community are now connected by data freeways, there are still an overwhelming number of documents that are not digitized, and many of those documents are also subject to substantial restrictions on any access at all. In my analogy, the freeway system is the internet. There is no question that many valuable genealogical documents have been digitized and are available on the internet. However, simply because a document has been put on the internet does not mean that there is an adequate way to access or even find that document. Presently, to do even an adequate job of researching online, I must individually search dozens, even hundreds, of different websites. Although Google is a vast resource, it cannot search in the closed catalogs where most of the genealogical records are kept.

3. Our present method of recording genealogical data is merely a paper substitute and does not take advantage of the connections provided by the Internet.

It may be true that the nature of genealogical data determines the format of the databases used to store and disseminate that information, but I suggest that the presently available genealogical database programs have failed to provide connectivity and access to Internet resources in a way that mirrors the degree of access available in other areas of interest.

4. The proliferation of individualized family trees has created a vast, impenetrable jungle.

Not only do we have a significant segment of the genealogical community that refuses to participate at all in the online world, but even those who have taken advantage of "putting their family tree online," for the most part, have created the equivalent of walled cities where exchange of information occurs only through very archaic pathways. If I happen to share a program with a person with a common ancestor and it is shown that we have a match, that match implies that both of us have duplicated all of the work necessary to discover that same ancestor. There is also a strong likelihood that all of the effort we expended to make the connection had already been researched and recorded by someone else. This situation exists within a single program and does not take into account the fact that there may be dozens, perhaps hundreds, of other individuals doing exactly the same research to reach exactly the same ancestor on different programs, some of which I will never look at.

5. There is no free lunch.

This past week, I was traveling in Western Utah and was following a truck pulling a trailer with seven or eight ATVs. Each of those machines probably cost close to $10,000. If you add up the cost of the truck, the trailer and the ATVs, the total would probably pay for my access to all of the online programs that I could possibly use for the rest of my life. The point here is that genealogy has a built-in cost of time, effort and some monetary expenses. Compared to other common activities, however, except for the time expenditure, genealogy is incredibly inexpensive. We have a tendency to focus on a very few, very large companies worth millions of dollars that are involved in providing online genealogical services. By and large, genealogists mostly benefit from incidental expenditures. For example, the United States National Archives was not specifically created for genealogists, but genealogists benefit from the records it preserves. Compared to a popular sport, the amount of money invested in specific genealogical activities is infinitesimal. One example is that the production budget for the most popular movie ever made, Avatar, is approximately $425 million and the income from that one movie is well over $1.7 billion. Other popular movies have similar incomes in the hundreds of millions of dollars. I suspect that the top twenty movies of all time have made more money than all the genealogical programs in the world combined. If genealogy is going to have the benefit of a freeway-class information structure, there will have to be a dramatic increase in the amount of money invested by the companies as well as the consumers of genealogical services.

The future

From time to time, Mike made his comments about my ideas of the future of genealogy. Coming to the end of another year, it is time to reassess those predictions. Some of the things that will occur are merely the product of programs that have already been implemented. The amount of digitized records that will become available will increase dramatically. However, I failed to see a concomitant increase in the ability of the programs to utilize that increase. There have been significant breakthroughs in the ability of a limited number of online programs to find pertinent records automatically. These automatic search technologies will continue to improve; however, any impact coming from this improvement will be limited to those who take advantage of the programs. As an example, has had a significantly useful record matching technology for some time. However, very few of those with family trees on take advantage of the availability of the source suggestions.

By its nature, genealogy is a solitary pursuit. There are several programs including the Family Tree that are engendering a greater degree of collaboration than has heretofore been available. FamilySearch's efforts to expand the Family Tree will have a significant impact on the entire online genealogical community. Despite its present limitations, it has the capacity to become the "gold standard" for genealogical data. It is also in a position to measurably assist in overcoming many of the issues I have outlined above. It is not perfect, but the current level of usage will increase and the obvious, major problems with the data will slowly disappear. It will reach the point where anyone who wishes to see the status of research on any ancestral line will, of necessity, have to consult the Family Tree for current information.

It may take a considerable period of time (or not, depending on advances in technology) but the backlog of microfilms and un-digitized documents will begin to decrease dramatically. The next big hurdle is overcoming the fragmentation of all of the online digital collections. In this regard progress is being made by organizations such as the Digital Public Library of America (DPLA) and others to consolidate and centralize access to the now isolated databases. This trend will continue and the amount of centralized information will allow for far more efficient online research. The DPLA has a long way to go before it reaches the critical mass of becoming a core source for genealogy, but it is moving rapidly in that direction.

I am not making any predictions about the future of copyright law, legal restrictions or privacy issues. These will always be severe limiting factors affecting the free access of genealogical data.

I don't ever expect to wake up and find that all the family trees online have suddenly become source-centric, but I do see a movement toward greater integration of the data. One giant obstacle to the globalization of genealogical programs is the fact that the developers do not see markets in non-European sectors. A truly workable solution to the way genealogical data is handled is only going to be developed when someone or some entity sees the need clearly enough to invest in a truly global solution for storing, accessing and transferring genealogical data.

