Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, November 16, 2010

Understanding the controversy surrounding GEDCOM

GEDCOM is an acronym for Genealogical Data Communication. GEDCOM is a text-based standard developed by The Church of Jesus Christ of Latter-day Saints (LDS) to facilitate the exchange of data between different proprietary genealogical software programs. Its development followed an approach similar to the one used by the U.S. Library of Congress for its MARC (Machine-Readable Cataloging) standard for bibliographic information. The approach was to define and implement a standard and then publish it for others to follow if it was in their interest to do so. See Bill Harten, Genealogical Data Communications Specs 24 Jan 1996.

GEDCOM was first proposed at the National Genealogical Society (NGS) Conference held in Salt Lake City in July 1985. For an interesting commentary on the impact of GEDCOM, you may wish to read an article by Dick Eastman, "GEDCOM Explained," published back on 4 August 2008. As Dick Eastman graphically explains, the main reason that GEDCOM exists is to avoid the need to re-key genealogical files when moving data from one program to another. Those of us who began entering genealogical information into computers back in the early 1980s probably re-entered our data a number of times as computers, software and storage media changed. The early command-line programs in CP/M or DOS were difficult to use and very limited. For me, the revolution in software for genealogy came with the introduction of Personal Ancestral File for the Macintosh computer back in about 1987. I kept updating the program until 1994, when the final Macintosh version, 2.31, was released. Unfortunately, from 1994 on, even though the program was not updated, I continued to update my computer systems and eventually moved back to PC-based computers, mainly so that I could run newer genealogy software. In 2000 both Personal Ancestral File 4.0 and the newer Windows version 5.0 were released.

Fortunately, these moves took place after GEDCOM standard 4.0 was released in 1989. Moving my data from computer to computer and from program to program was possible only because the GEDCOM standard allowed the programs to essentially "talk" to each other through the data files.
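For readers curious what that "talk" looks like in a file, here is a minimal Python sketch. It is not taken from any of the programs mentioned. It assumes only the basic GEDCOM line layout: a level number, an optional cross-reference ID between @ signs, a tag, and an optional value. The sample record and the names GedcomLine and parse_line are invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    """One line of a GEDCOM file (hypothetical helper type)."""
    level: int
    xref: Optional[str]
    tag: str
    value: str


# level, optional @XREF@, tag, optional value
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?: (.*))?$")


def parse_line(text: str) -> GedcomLine:
    """Split a single GEDCOM line into its parts."""
    match = LINE_RE.match(text.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {text!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value or "")


# Invented sample record, not real data.
SAMPLE = """0 @I1@ INDI
1 NAME John /Doe/
1 BIRT
2 DATE 12 MAR 1850
2 PLAC Springfield
"""

records = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]
for record in records:
    # Indent by level to show the hierarchy the level numbers encode.
    print("  " * record.level + f"{record.tag} {record.value}".rstrip())
```

Because every program reads and writes the same simple lines, the file itself becomes the common language. A real importer would also have to handle character sets, continuation lines and each vendor's custom tags, which this sketch ignores.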

All of this came to an end after 2002. That year, the last update of Personal Ancestral File was released, and so within a year or two I was looking for an alternative software solution. There were other things going on at the same time in the genealogical community. A few years earlier, Elizabeth Shown Mills had published her influential book, Evidence!: Citation & Analysis for the Family Historian. [Mills, Elizabeth Shown. Evidence!: Citation & Analysis for the Family Historian. Baltimore: Genealogical Pub. Co, 1997.] Along with many others in the genealogical community, I became more and more aware of the need for proper and complete source documentation. As the early 2000s slipped by, I searched further and further afield for the "ultimate" genealogical database program. As I tried program after program, each time I transferred my core data by GEDCOM.

One thing began happening as the decade progressed. My personal requirements for a genealogy program became more and more refined, and the programs got better and better. Sadly, neither Personal Ancestral File nor the GEDCOM standard kept pace with the changes in the genealogy world. Just recently, Ancestry.com released its first version of Family Tree Maker for the Macintosh. In moving data into that program from previous programs, it became more than obvious that GEDCOM was no longer working as a standard method for sharing data. Although most of the data in my existing programs came across without problems, the source citations were an utter failure. Each source came across as a single line of text; all of the categories and differentiation of the sources were lost.
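To show why a one-line source is such a loss, here is a small Python sketch. It is purely an illustration, not a description of how Family Tree Maker or any other program actually stores or imports sources. The Citation class, its field names and the flatten function are all hypothetical; the sample values come from the Mills citation above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Citation:
    """A source with separate, labeled fields (hypothetical layout)."""
    author: str
    title: str
    place: str
    publisher: str
    year: str


def flatten(citation: Citation) -> str:
    """Collapse every field into one line of text, dropping the labels."""
    return ", ".join(asdict(citation).values())


source = Citation(
    author="Mills, Elizabeth Shown",
    title="Evidence!: Citation & Analysis for the Family Historian",
    place="Baltimore",
    publisher="Genealogical Pub. Co",
    year="1997",
)

flat = flatten(source)
print(flat)

# A naive attempt to recover the fields by splitting on ", " fails:
# the author's name contains a comma, so the pieces no longer line up
# with the five original fields.
pieces = flat.split(", ")
print(len(pieces), "pieces recovered from 5 fields")
```

Once the labels are gone, a program can no longer tell which part of the line is the author, the title or the publisher, so every source would have to be sorted back into its fields by hand.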

It is apparent that if the genealogical software community is left to its own devices, the various programs will become more and more insular, especially since there is no overriding reason for the various programs to exchange data files. Ironically, the thousands, perhaps millions, of Personal Ancestral File users have no problems in this regard. Almost every popularly sold genealogy program on the market will adequately take all of the information from a PAF file and import it into its own data structure. But if you want to move from one of the more commercial programs to another today, you either have to rely on the programs' own initiative or suffer the loss of some data when you transfer a GEDCOM file.

As my very recent experience moving some information from Reunion for the Macintosh to Family Tree Maker shows, we are being forced back into earlier times, when we had to re-key data whenever the programs changed. For that reason, I applaud any initiative that moves towards a solution to the data isolation problem. If you would like more information about the status of GEDCOM, please visit the Build a Better GEDCOM Wiki.

2 comments:

  1. Thanks for the shout out, James. My reply became so lengthy, I decided to post on the subject.

    http://blog.dearmyrtle.com/2010/11/do-we-need-genealogy-sieve.html

  2. Thanks for bringing light to this incredibly important, core issue for genealogists!

    Thoughtful attention to this issue is exactly how we're going to solve this problem and move forward.
