Some people eat, sleep and chew gum, I do genealogy and write...

Monday, October 21, 2013

Yes, Genealogists Really Need a Database

In a recent post, blogger Tony Proctor asks an almost rhetorical question, "Do Genealogists Really Need a Database?" The post is slightly technical but explains how and why accumulations of information in computer programs work the way they do or don't. I recommend reading this post for a basic understanding of the challenge of keeping large amounts of information in computer files.

Currently, one of the challenges of genealogy is that the data files we are creating on our computers include scanned documents and images. In the pre-media days, genealogical data files were almost exclusively text files. If you kept your genealogical data on a computer with a genealogy program such as Personal Ancestral File, you might be aware that the resultant files were very small. What I mean by "small" is that they took up very little storage space. For example, even if your file had tens of thousands of names, you could store that file on two or three, or maybe a few more, floppy disks. Personal Ancestral File had a utility that would break the file up across separate floppy disks.

You might remember that a double-sided, high-density (HD) 3.5 inch floppy disk only held 1.44 MB. That's Megabytes, not Gigabytes. A Gigabyte (GB) is 1000 Megabytes (MB). So, a 1 GB flash drive holds roughly 700 times as much information as a 3.5 inch floppy disk. Today, we can commonly buy Terabyte (TB) hard drives, which hold 1000 times as much information as a Gigabyte drive. To put this into context, the size of one of my smaller photographs is about 5 or 6 MB. It would take 4 or 5 floppy disks to hold one JPEG photograph, even if splitting it across disks that way were possible, which it isn't.
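For readers who like to check the arithmetic, here is a small Python sketch of the storage comparison. It assumes the nominal 1.44 MB floppy capacity and the decimal convention used above (1000 MB per GB); the function names are my own and purely illustrative.

```python
import math

MB_PER_FLOPPY = 1.44  # nominal capacity of an HD 3.5 inch floppy disk
MB_PER_GB = 1000      # decimal convention, as used in this post


def floppies_needed(size_mb: float) -> int:
    """Whole number of floppy disks needed to hold size_mb megabytes."""
    return math.ceil(size_mb / MB_PER_FLOPPY)


def floppies_per_gb() -> float:
    """How many floppy disks' worth of data fit in one gigabyte."""
    return MB_PER_GB / MB_PER_FLOPPY


if __name__ == "__main__":
    for photo_mb in (5, 6):
        print(f"A {photo_mb} MB photo would need {floppies_needed(photo_mb)} floppy disks")
    print(f"1 GB is about {floppies_per_gb():.0f} floppy disks")
```

This only counts raw capacity. It ignores the fact that an ordinary file cannot simply be spread across several disks without a special utility.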

The rest of the genealogical challenge is that, along with these huge file sizes, we accumulate a lot of diverse types of files: text files, JPEG files, TIFF files, DOC files, PDF files, just to name a very small sample. The programs have to be able to import and view all of those various file types and keep track of their storage locations on our hard drives. If you have been using a genealogical database program for a while, you likely know that keeping track of all that information is a challenge.
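To give a feel for what "keeping track" involves, here is a minimal Python sketch that walks a folder and groups the files it finds by type. It is not how any particular genealogy program works; the list of extensions and the function name are my own assumptions for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumed sample of file types; a real collection will have many more.
MEDIA_EXTENSIONS = {".txt", ".jpg", ".jpeg", ".tif", ".tiff", ".doc", ".pdf"}


def inventory_media(root):
    """Return a dict mapping each known extension to the files found under root."""
    found = defaultdict(list)
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower()  # treat .JPG and .jpg as the same type
        if path.is_file() and ext in MEDIA_EXTENSIONS:
            found[ext].append(path)
    return dict(found)
```

Calling `inventory_media` on the folder where you keep your scans would show, for each file type, where every file currently lives.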

Right now, we have no way of completely transferring all of the information in our genealogical data files including all of the attached media files (photos, documents etc.) from one program to another. What Tony is talking about, in part, are all the technical issues involved in transferring that data from one computer program to another. There are several reasons why this is desirable. The primary reason is that we do not want to lose data to obsolete programs.

If you have your genealogy on 3.5 inch floppy disks in an older program such as Personal Ancestral File (PAF), you may have noticed something. There are no more floppy disk drives around except on very, very old computers. There is a huge challenge in transferring those old files into a format that can be used on a new computer. This is especially true if the old PAF file was stored on multiple disks. I might also mention that even if the old PAF file was on multiple floppy disks, that file did not have a copy of any attached media files such as photos or documents. Those were not copied when you backed up the file or put it on multiple disks. So, in most cases, the media files have been lost.

Today's genealogy programs, for the most part, do not make a copy of the attached media files. So even if we come up with a way to transfer the data from program to program, we will still be faced with the problem of finding and copying the attached media files along with it. All of this takes more and more storage space, which brings us back to Tony's discussion.
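As a rough illustration of what copying the media along with the data could look like, here is a minimal Python sketch that packs a data file and its attached media into a single zip archive and reports any media files it could not find. This is my own sketch, not a feature of any genealogy program. The function name is hypothetical, and it assumes you already have a list of the media file paths and that their file names are unique.

```python
import zipfile
from pathlib import Path


def bundle_with_media(data_file, media_files, archive_path):
    """Pack the data file and its media into one zip archive.

    Returns the list of media paths that could not be found, so that
    broken links are reported instead of silently dropped.
    Assumes media file names are unique (all go into one media/ folder).
    """
    missing = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(data_file, arcname=Path(data_file).name)
        for media in media_files:
            media = Path(media)
            if media.is_file():
                archive.write(media, arcname=f"media/{media.name}")
            else:
                missing.append(str(media))
    return missing
```

The point of returning the missing paths is that a broken link to a photo is exactly the kind of loss described above, and it is better to find out at backup time than years later.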

One option that is just now starting to be viable is to put all your information into an online program such as Family Tree. But as you probably know if you have been reading this blog, that is an entirely different challenge.

All of this discussion is aimed at the issue of preserving our genealogical work for future genealogists. That is the core concept and core goal of these types of discussions.


  1. Which genealogy database program do you use?

    1. All of them. I usually have up to ten different programs at a time.