Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, June 8, 2021

Why there is a GEDCOM Standard and why we need a new Version 7.0

Yes, you guessed it, more history

Note: You may want to go back and read my first post about the release of FamilySearch GEDCOM Version 7.0 (hereinafter GEDCOM Version 7.0) entitled “Introducing FamilySearch GEDCOM Version 7.0 a long-awaited upgrade.” It may give you a better understanding of this post.

The first rudimentary desktop or personal computers were “invented” (more correctly, assembled) beginning in 1974 with the Altair 8800, which is usually acknowledged to have been the first commercially successful personal computer; see “IT History Society.” In 1976, Apple Computer (now Apple) released the Apple I desktop computer. Genealogists were some of the earliest “power” users of desktop computers, but it took some time before desktop computers had enough memory and storage to support sophisticated genealogy software. The IBM Personal Computer, or PC, debuted in 1981. In 1984, Apple released the first Macintosh computer. One of the earliest genealogy software programs was Ancestral Quest. Ancestral Quest went on to become the basis for the Windows versions of Personal Ancestral File, a program first released by The Church of Jesus Christ of Latter-day Saints in 1984. The first Macintosh version of Personal Ancestral File was released in 1987. The competition began revolving around two operating systems when Microsoft released Windows 1.0 in 1985.

A problem developed concurrently with this rapid advance in computer technology. There were two main competing operating systems, yet there was no practical way to connect two computers together or to exchange genealogical data between the competing systems. As genealogy software became more sophisticated, it became apparent almost immediately that there needed to be a way to transfer data from one desktop computer to another and from one operating system to another, such as from DOS/Windows to the Apple OS. In 1984, the Church released the first version of GEDCOM, or GEnealogical Data COMmunication. From 1984 to 1996, different versions of the GEDCOM Standard paralleled the technological advances in computers.

Fast forward to the present. The world of computers has become unimaginably more complicated than it was back in the 1980s, but we are still faced with the same basic problem: moving genealogical data from one operating system to another and from one software program to another.

GEDCOM is not a program. It is a standard. What this means is that programmers who follow this standard or at least adapt their program or website to take advantage of this standard allow their users to share and exchange data with other programs or websites. The standard is a specification of the programming that will allow this interchange. 

Here is a quote from the FamilySearch GEDCOM website, GEDCOM.io, explaining this concept in more detail.

FamilySearch GEDCOM is relevant to create a personal private backup of family tree information, maintaining local ownership and control. A FamilySearch GEDCOM file is a UTF-8 text file containing genealogical information about individuals, and also meta data linking these records together. The standard file extension used is a suffix “.ged” to indicate the file has been formatted using the FamilySearch GEDCOM specification. Hundreds of software products support the reading and writing of GEDCOM files. Individuals continue to share their files for collaboration, reports, charts, special analysis, and other innovative purposes. The FamilySearch GEDCOM file format allows users to preserve, collaborate, import, and export with different applications while maintaining control of the original copy. FamilySearch GEDCOM version 7.0 is the most recent update to GEDCOM.
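To make the idea of a text file with records that link to each other a little more concrete, here is a minimal sketch in Python. It assumes the common GEDCOM line layout of a level number, an optional identifier wrapped in @ signs, a tag, and an optional value. The sample people, the parse_line function, and the GedcomLine type are my own illustrations and are not taken from the specification or from any genealogy program.

```python
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int            # nesting depth: 0 starts a new record
    xref: Optional[str]   # identifier such as "@I1@", if the line has one
    tag: str              # what the line describes, e.g. INDI, NAME, FAM
    value: str            # payload text, or a pointer such as "@F1@"


def parse_line(line: str) -> GedcomLine:
    """Split one GEDCOM line into level, optional identifier, tag, and value."""
    level_text, _, rest = line.strip().partition(" ")
    xref = None
    if rest.startswith("@"):
        xref, _, rest = rest.partition(" ")
    tag, _, value = rest.partition(" ")
    return GedcomLine(int(level_text), xref, tag, value)


# Illustrative sample only: two invented individuals and the family joining them.
SAMPLE = """0 HEAD
1 GEDC
2 VERS 7.0
0 @I1@ INDI
1 NAME John /Doe/
1 FAMS @F1@
0 @I2@ INDI
1 NAME Jane /Roe/
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
0 TRLR"""

lines = [parse_line(text) for text in SAMPLE.splitlines()]

# Lines whose value is wrapped in @ signs point at another record in the file.
links = [(entry.tag, entry.value) for entry in lines
         if entry.value.startswith("@") and entry.value.endswith("@")]

for tag, target in links:
    print(tag, "->", target)
```

The pointers are what tie the records together: each person points to the family, and the family points back to each person.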

What was the basic challenge of GEDCOM in 1996?

In March 1989, Sir Tim Berners-Lee laid out his vision for what would become the web in a document called “Information Management: A Proposal”. In 1995, commercial use of the existing network between computers became unrestricted and, essentially, the internet was born. With the advent of an open internet (see Wikipedia: History of the internet), the World Wide Web exploded. Programmers had too much to do to worry about any limitations in the GEDCOM Standard and, in any event, the GEDCOM Standard had advanced to the point that it was serviceable given the technology then available.

As time passed, the internet became more and more complex. Genealogy software began incorporating the ability to attach and store photos and digital documents to individual entries. In May of 1999, The Genealogical Society of Utah, the predecessor of FamilySearch, opened the FamilySearch.org website to the public. For those of us living through all this hyper-speed technological change, it became difficult to even begin to understand all of the products and devices that were being developed. 

Meanwhile, we started digitizing nearly everything having to do with the storage and use of genealogical records.

Scanning technology predates computers by many years. Scanners come from the wirephotos that were invented beginning in 1913, but scanning only really became possible for personal use in the 1970s, and the first 300 dpi scanner was introduced by Microtek in 1985. Digital images took up a lot of computer storage space, so sharing digital images only became possible when computer technology and memory storage technology became a practical reality for individual desktop computers.

The challenge for GEDCOM was that, as digital images were added to genealogy software, the limitations on data storage meant it took some time for the technology to develop that would allow the transfer and storage of a large number of images. Meanwhile, the programmers and developers were trying to work out the details of storing billions of photos online.

What happened to enable the development of the GEDCOM Standard 7.0?

With all the tremendous technological changes, the basic issues were quite simple to understand: storage capacity, speed, and the cost of both. Where are we today compared to where we were in the past?

I am far from typical, of course, but whether you are technology-challenged or a power user, the technology is still available. Let’s see about some of the prices. 

In 1976, when the Apple I was introduced, it cost $667. Adjusted for inflation, today it would cost $3,066.35. When I bought an Apple II computer in 1977, it cost $1,298, which comes out to $5,602.86 today, much more than I spent for a new 10-core iMac this past year. One more example: the Apple Macintosh was introduced in 1984 for a price of $2,495. Adjusted for inflation, that is the equivalent of $6,281.49. See U.S. Inflation Calculator.

I think the best illustration of the change involves a single digital photo. Back in 1981, a gigabyte of storage cost about $500,000. See “Hard Drive Cost Per Gigabyte.” I now routinely buy 8-terabyte hard drives that sell for under $200, or about $25 for a terabyte (1,000 gigabytes) of storage. So, my actual cost of storing a gigabyte is $25 divided by 1,000, or about $0.03. Oh, by the way, a 3.5” floppy disk could store 1.44 megabytes, which is actually less than the file size of one of my digital photos.
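For anyone who wants to check my arithmetic, here is a short Python sketch of the storage comparison. It uses only the rounded figures from this post and assumes 1 terabyte equals 1,000 gigabytes, so the results are rough estimates rather than precise prices.

```python
# Rounded figures taken from the post; all are approximate.
DRIVE_PRICE_USD = 200.0        # price of one 8-terabyte hard drive
DRIVE_SIZE_TB = 8              # capacity of that drive in terabytes
GB_PER_TB = 1000               # assumption: decimal units, 1 TB = 1,000 GB
COST_PER_GB_1981 = 500_000.0   # approximate cost of one gigabyte in 1981

cost_per_tb = DRIVE_PRICE_USD / DRIVE_SIZE_TB        # dollars per terabyte today
cost_per_gb = cost_per_tb / GB_PER_TB                # dollars per gigabyte today
price_drop_factor = COST_PER_GB_1981 / cost_per_gb   # how many times cheaper

print(f"Cost per terabyte today: ${cost_per_tb:.2f}")
print(f"Cost per gigabyte today: ${cost_per_gb:.3f}")
print(f"A gigabyte costs roughly 1/{price_drop_factor:,.0f} of its 1981 price")
```

By these numbers, a gigabyte today costs about two and a half cents, which is roughly twenty million times less than it did in 1981.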

The other main issue is speed. Again, I am not anywhere near the average, but here in Provo, Utah, we have Google Fiber Internet and I have a very high-speed connection.

Anyway, the issues are clear. It is now time to update the GEDCOM Standard to Version 7.0 with GEDZip and begin to take advantage of high storage capacity, high speed, and the much lower cost of both.

What will the GEDCOM Standard Version 7.0 do?

For the average genealogist using a relatively recently upgraded computer, the new GEDCOM Version 7.0 will become available only as genealogy software companies and websites implement it. Remember, it is a standard. That means that developers and programmers must decide to incorporate the standard in their software so that genealogists can exchange copies of the documents and images attached or referenced in the software or websites they are using.

Right now, if you were to subscribe to one of the major online genealogy family tree/database websites such as Ancestry.com, you would be able to upload your basic genealogical data from your desktop software program using GEDCOM but that would not include any of your digital images including photos. You could upload your photos one by one, but then you would have to tag or attach them individually to your new family tree. 

The idea behind updating the GEDCOM Standard to Version 7.0 and adding the ability to support external images using GEDZip, a zip-based packaging format that bundles a GEDCOM file with the files it references, is that a genealogist can upload or exchange files that include all those records, documents, and photos already attached and with copies included.
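Here is a rough sketch, in Python, of what bundling a tree with its photos might look like. It assumes, as I understand the Version 7.0 specification, that a GEDZip package is an ordinary zip archive holding the data in a file named gedcom.ged alongside the media files it references. The build_gedzip function, the sample records, and the photo path are hypothetical, and the placeholder bytes are not a real image. This is not how any particular program does it.

```python
import io
import zipfile
from typing import Dict

# Simplified, illustrative GEDCOM 7.0 text; not validated against the specification.
GEDCOM_TEXT = "\n".join([
    "0 HEAD",
    "1 GEDC",
    "2 VERS 7.0",
    "0 @O1@ OBJE",
    "1 FILE photos/grandma.jpg",   # hypothetical path to a photo inside the package
    "2 FORM image/jpeg",
    "0 @I1@ INDI",
    "1 NAME Jane /Doe/",
    "1 OBJE @O1@",                 # attaches the photo record to the person
    "0 TRLR",
]) + "\n"


def build_gedzip(gedcom_text: str, media: Dict[str, bytes]) -> bytes:
    """Return the bytes of a zip archive holding gedcom.ged plus the media files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Assumption: the data file sits at the top of the archive as gedcom.ged.
        archive.writestr("gedcom.ged", gedcom_text.encode("utf-8"))
        for path, data in media.items():
            archive.writestr(path, data)
    return buffer.getvalue()


package = build_gedzip(
    GEDCOM_TEXT,
    {"photos/grandma.jpg": b"placeholder bytes, not a real JPEG"},
)

# Reopen the package to confirm that the tree and the photo travel together.
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    names = archive.namelist()
    restored_text = archive.read("gedcom.ged").decode("utf-8")

print(names)
```

The point is that the receiving program gets one file containing both the tree and the photo, with the attachment already recorded in the data.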

What else do I need to say? 

Actually, I need to say a lot and I will keep writing. Look for additional blog posts about the GEDCOM Standard 7.0 with GEDZip.

FamilySearch GEDCOM 7.0 is copyrighted.

© 1987, 1989, 1992, 1993, 1995, 1999, 2019, and 2021 by Intellectual Reserve, Inc. All rights reserved. A service provided by The Church of Jesus Christ of Latter-day Saints.

General information can be found at GEDCOM.info.

Helpful Sources

General Info: GEDCOM.info

Technical Specs, Tools and Guides: GEDCOM.io

Community:  GEDCOM General Google Group and GitHub Public GEDCOM Repository

Email: GEDCOM@FamilySearch.org


6 comments:

  1. Assuming Ancestry, MH, and FMP etc adopt GEDCOM 7.0, would that mean their tree downloads would also include the images etc?

    1. Yes, if they adopt the new Version 7.0 and GEDZip. GEDCOM is a standard not a program.

  2. James, first of all hello to you and Ann! Hoping you are doing well!
    This is a phenomenal post. Thank you for sharing it. I actually felt like I understood way more about GEDCOM and it makes so much sense! Hi to TFHG friends we share!

    1. Thanks Bonnie, hope things are going well for you.

  3. James, One thing I wish you would talk about at some time soon regarding GEDCOMs: Please tell all that a GEDCOM is not the same thing as attaching sources. So often when I am attaching a real source like a birth, marriage, death, census, etc. source, I come across a note in sources or on the Vital Data that says this is "verified with my GEDCOM file." ARGGGG!

    1. In this context, a GEDCOM file is no more than a stored or archived file. GEDCOM is the format of the file that is stored. GEDCOM has absolutely nothing to do with the validity of the data stored. It also has nothing to do with sources other than as a means of transferring sources, if supported, from one program to another.
