Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, June 22, 2014

Back up your genealogical data files -- a reasoned approach


The recent Denial of Service attack on Ancestry.com and FindAGrave.com points out the vulnerability of the Internet yet again in a very graphic manner. Although this attack is over, there is nothing stopping another such attack at another time and place on the Internet. During the past few years, the "Cloud," a euphemism for the Internet or Web, has been touted as a solution to individual backup challenges. It turns out that storing your data online is subject to some of the same risks as any other method of data storage. It is time to evaluate each type of storage media as to its merits and limitations.

A computer system consists of several integral parts that work together. Traditionally, the computer hardware consisted of a CPU (Central Processing Unit) chip on a board with other processors, a power supply, connecting cables called buses, perhaps a fan, a memory storage device and a box to put it all in. Then you start adding things like a keyboard and a mouse or other pointing device, and you have a computer system. Today, all of that can be packaged in a smartphone. I am focusing on storage. We moved from recording tape in cassettes or on spools to floppy disks to hard disks to flash or solid-state memory in just a few short years. At the same time, there were all sorts of variations including hard drive cassettes, CDs and DVDs. Where are we today?

As genealogists, unfortunately we are all over the board. We have a core of people who are still holding on to their floppy disk storage and we have those who use the latest and best storage methods.

Now a word about the "Cloud." There is no magic place to store your information called the "Cloud." The cloud is nothing more or less than a bunch of computers with hard drives attached. So when you are storing your data "on the cloud," you are really just using someone else's computer and hard drive, usually in a huge array of computers called a server farm, like the one pictured above. Now, these computers are subject to all the same kinds of failure as your own computer in your own home. The difference is that they have people working 24/7 to replace the computers and hard drives as they fail. They keep the backups. But what if the whole server farm fails? That is exactly the problem. The key to safety is redundancy, or in other words, multiple backups.

The same rule holds true whether you are storing photos on a smartphone or genealogy on your computer's internal hard drive. You need to have multiple copies of your data on different storage devices. All of this costs money. But you always have to place a value on the time it has taken you to accumulate your data. I have people come to me crying because they lost their data. Upon questioning them, it turns out they have 40 or 50 names (or some other very limited amount of data) and are devastated at the prospect of reconstructing those 50 names. Think of the real world where people like me have more than 3 Terabytes of data and over 300,000 files accumulated over the last 32 years. How do we back up this data?

Here is where we are today.

A 4 TB (Terabyte) hard drive costs just $150 on Amazon.com. That is how I back up my data. I have multiple hard drives and make copies on each. But what would happen if a meteorite hit my home (or something more predictable)? I make periodic copies of all the data on another hard drive and give a copy to one or more of my children. How often do I back up? Every time I think about the effort it would take to reproduce all that data.

Even if you don't have this massive problem, the answer is the same. You back up your data on multiple drives and keep them in multiple places. I don't rely on Cloud storage so much because of the massive amount of data I have and the cost to keep it online. I do use Internet or Cloud storage for some types of data, particularly those items I use regularly on a variety of computers.
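For readers comfortable with a little scripting, here is a minimal sketch in Python of the "multiple drives" idea. It is only an illustration, not the method I use myself: the function name back_up and the folder locations you pass to it are assumptions, and it relies on Python 3.8 or later. It copies one folder to every backup location you list and tells you which locations were not available.

```python
import shutil
from pathlib import Path


def back_up(source, destinations):
    """Copy the whole source folder into every destination folder.

    Returns the destinations that could not be written, so an unplugged
    or failed drive does not go unnoticed.
    """
    source = Path(source)
    failed = []
    for dest in destinations:
        dest = Path(dest)
        # Skip locations that are not there (for example, a drive that
        # is not plugged in) instead of silently creating a new folder.
        if not dest.is_dir():
            print(f"Backup location not available: {dest}")
            failed.append(dest)
            continue
        try:
            # dirs_exist_ok (Python 3.8+) lets a repeat run refresh an
            # existing copy instead of stopping with an error.
            shutil.copytree(source, dest / source.name, dirs_exist_ok=True)
        except OSError as err:
            print(f"Could not back up to {dest}: {err}")
            failed.append(dest)
    return failed
```

You would call it with your genealogy folder and a list of drive locations, for example back_up("Genealogy", ["/Volumes/DriveA", "/Volumes/DriveB"]), where those paths are placeholders for wherever your own drives appear. Note that this sketch only adds and overwrites files; it never deletes anything from the copies.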

What are the hardware options? Hard drives are still the most cost-effective way to store computer data. Flash drives (thumb drives, whatever you call them) are reliable and convenient. If your data fits on a flash drive, go with it. But they are easy to misplace, so be careful. Flash drives may someday replace hard drives (mechanical spinning media) but that is still a ways off. CDs and DVDs? They are still a way to back up data but they are slow and limited in size. They are also on their way out. Most new computers are not being made with CD or DVD drives, and this is always an indication of the end of a particular form of storage.

What about storing data online (the Cloud)? Good idea but not any more reliable than hard drives or flash drives. The best idea is to use more than one method and make sure the copies are in more than one place. I use all these types of storage for different reasons at different times. I have hard drives (very large ones). I have flash drives (very large ones). I have CDs and DVDs but I am transitioning away from them as fast as I can. I have some storage online but balance the availability against the cost and the size of my storage needs.
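Having copies in more than one place only helps if the copies are good. As a rough sketch, again in Python and again only an illustration under my own assumptions (the names file_hash and verify_copy are made up for this example), you could compare every file in your original folder against a backup copy using a SHA-256 checksum and list anything that is missing or different.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """Return the SHA-256 checksum of a file, read in 1 MB pieces."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source, copy):
    """List files from source that are missing or different in copy."""
    source, copy = Path(source), Path(copy)
    problems = []
    for original in sorted(source.rglob("*")):
        if not original.is_file():
            continue  # folders themselves have nothing to compare
        relative = original.relative_to(source)
        duplicate = copy / relative
        if not duplicate.is_file():
            problems.append(f"missing: {relative}")
        elif file_hash(original) != file_hash(duplicate):
            problems.append(f"differs: {relative}")
    return problems
```

An empty list means every file in the original folder has an identical twin in the copy. The check only looks in one direction: extra files that exist only in the copy are not reported.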


1 comment:

  1. Excellent, just what I needed. No huge database, but I don't want to lose what is on my computer. So a hard drive in a separate location together with a local backup makes sense.
