Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, February 24, 2011

Backup? Archive? What should I do?

If you have never lost data, you are either obsessive-compulsive about saving your work or you haven't spent much time on a computer. The facts of computer life include programs crashing, power failing during blackouts, flash drives going missing, and an almost infinite number of other ways to lose your work. Genealogists are not immune to losing data. Lately, I have been thinking a lot about data preservation. The larger your database, the more you tend to think about what would happen if it all crashed.

Two terms usually come up when anyone talks about data preservation: backing up your data and archiving your data. Although both involve similar activities, the two terms carry significantly different implications.

Backing up information implies making a copy of a working file so that in the event of a power outage, disk crash or other catastrophe, the file will still exist in some form or another. It seems obvious, but a copy of a file does not become a backup unless it is separated from the original, so that if the original is destroyed, the copy is not destroyed with it. Let me put that another way: a backup only becomes a real backup (no matter what you think or what it is called) if the data is on a different storage medium that does not depend on the first. If I am working on a file and the program stops me periodically and says something like "backing up your data," that statement is only true if the copy being made goes onto an entirely separate storage medium, such as an external hard disk or a flash drive. In this sense, a backup is by definition different from a program's file saving function. For example, if I am working in Microsoft Word and have selected automatic backup, the program will automatically save my work as I enter data. If my program crashes or the power goes off, I can recover the document I was working on. But in this case and all others, if the computer's hard drive crashes or the computer is stolen, everything on the drive may be lost. When that happens, the fact that an individual program has "backed up" my file to the same drive is meaningless.

To repeat, a true backup of a file or files has to exist independently of the original. So you will need some kind of external storage device to receive the backed-up files. What would happen if your house burned down? To be protected even further from loss, a backup copy should be not only on a different device from the original, but also in a different location. Any number of media can be used to back up a file: external hard drives, flash drives, tape, DVDs, CDs and a few others. Each has its limitations and each has its merits. But once the copy is made, it should be stored in an offsite location such as the house of a family member or friend.
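
For readers who like to script this sort of thing, here is a minimal sketch in Python of the rule above: copy a file to a backup folder, and refuse if that folder appears to sit on the same device as the original. The function names and the file name family.ged are my own hypothetical choices, and the check simply compares the device IDs the operating system reports, which is an assumption that may not hold on every system (two partitions of one physical disk, for instance, can look like different devices).

```python
import os
import shutil
import tempfile
from pathlib import Path


def on_same_device(a: Path, b: Path) -> bool:
    """Return True if both existing paths report the same device ID."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def back_up(source: Path, backup_dir: Path, require_other_device: bool = True) -> Path:
    """Copy `source` into the existing folder `backup_dir` and return the new path."""
    if require_other_device and on_same_device(source, backup_dir):
        raise ValueError(f"{backup_dir} is on the same device as {source}")
    # copy2 copies the contents and also tries to preserve timestamps.
    return Path(shutil.copy2(source, backup_dir / source.name))


if __name__ == "__main__":
    # Demonstration only: both folders are temporary and share one drive,
    # so the device check is switched off here.
    with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as dest:
        original = Path(work) / "family.ged"  # hypothetical file name
        original.write_text("0 HEAD\n0 TRLR\n")
        copy = back_up(original, Path(dest), require_other_device=False)
        print("Backup written to", copy)
```

In real use you would leave the device check on and point backup_dir at a folder on your external drive. The script cannot tell whether that drive ever leaves the house; the offsite part is still up to you.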

So what is an archive copy and how is an archive different from a backup? Physically they may be the same; the difference is the intent. If I am going to archive a file, I do not intend to use the media repeatedly. Once I have verified that my archive copy is good, I will put it in a very safe location and leave it there undisturbed. Archived records are usually those of a historical or reference nature. So how would you make an archived copy differently from a backup? You wouldn't. The difference is that you might use and reuse a backup several times, whereas you would use the archived copy only if the original and every backup were no longer available. You would then make a copy from the archive and return the archive to storage.
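
One way to verify that an archive copy is good is to compare a checksum of the copy with a checksum of the original. The following is only a sketch under my own assumptions: the function names are hypothetical, and SHA-256 is simply a common choice, not the only one.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read piece by piece so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_good(original: Path, archive_copy: Path) -> bool:
    """Return True if the archive copy has the same contents as the original."""
    return sha256_of(original) == sha256_of(archive_copy)


if __name__ == "__main__":
    # Demonstration with two temporary files standing in for real ones.
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "original.dat"
        archived = Path(folder) / "archived.dat"
        original.write_bytes(b"family history")
        archived.write_bytes(b"family history")
        print("Archive copy matches:", archive_is_good(original, archived))
```

If you write the checksum down and store it with the archive, you can run the same check years later, when the original may no longer exist to compare against.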

The static concept of an archive is fine for physical files, books and such, but there is a huge exception for electronically stored data. Anything stored in a digital format can be lost over time. In every case, the data must be migrated (moved) to newer media or a newer format regularly to make sure the information is still readable. All electronically stored data should be updated whenever there is a change in operating systems or whenever there is a software upgrade. So the difference between a "backup" copy and an "archive" copy is somewhat blurred. I make the distinction by periodically replacing the "archive" copy with newly migrated media. Over the years, I keep dragging my data along the preservation path, one new hard drive and one new program at a time.

It is great if you store your old genealogical database files in a safe place, but every time you upgrade your program, you also need to upgrade your stored files. By the way, this is a lot easier to say than it is to do consistently. Think about making regular backup copies of all your work, but also consider adding an archive copy to the mix so that you are still covered if the first copy goes bad.
