Wednesday, November 10, 2010

What do I do with all that genealogy stuff once it's in my computer?

Thanks for asking! One commenter took pity on my plight of unending scanning and asked where I put all the stuff and how I keep it organized. Very interesting questions. Very difficult answers.

Have you ever looked at ads for computers? Have you ever noticed that there are no cables in any of the pictures? Try it now. Go to Best Buy or Amazon and look for some kind of computer. Any kind will do. If you can't think of one, try a Dell 3.0 GHz GX. Do you see any cables? Now, what do you see when you are invited into my computer room? Beautiful computer systems without cables? Not a chance! Everywhere you look you see cables. Piles and tangled strings of cables. I have 4 surge suppressors (power strips) full of plugs. Do you think all of those cables are nicely organized? Neatly coiled and marked and identified? Well, no.

Computers are basically really messy. I know people who will not allow them in the house for that reason alone. What does this have to do with scanning? First of all, I have piles and piles of documents. The real boon to my storage life was digital cameras: voilà, no more negatives, slides or photos. But most of the genealogy records I have acquired have corresponding paper copies. So where are the paper copies? Everywhere. Under, over, in closets, in cupboards, under chairs, piled in corners. If you are obsessive-compulsive about cleaning, you would have a nervous breakdown looking at my piles. Then I have the inherently messy computers with cables running in every direction. So guess what happens to all of the files I scan? Yep, they get added to the pile.

At almost every genealogy conference I have ever attended, there has been a class or two on organizing your genealogy records. I am going to admit that I have looked at probably dozens of different suggested schemes. But I have always returned to the basic pile. My philosophy is to let the computer do what computers do and not stand in the way very much. One of the first jargon terms I learned when I started with microcomputers was "Random Access Memory," or RAM. Here is the concept: the memory is not particularly organized in any way. An integrated circuit memory chip allows information to be stored or accessed in any order, and all storage locations are equally accessible.

All physical organization systems are linear and sequential. My mind does not work either linearly or sequentially. If I put something in a folder in a filing cabinet, someone else has to set up the system and make me use it. Otherwise I would use the glacial organization system: newer stuff rises to the top, older stuff sinks to the bottom. When stuff reaches the bottom of the pile, it is either historical or garbage. OK, so how does all this relate to genealogy?

As I said, the key is letting the computer do what the computer does extremely fast and well. That is, keep track of stuff. What do I have to do? Name the stuff and give each item keywords as metadata. For example, if I have a deed from Francis Tanner in the 1700s, I could store that in a folder called Tanner, then in a sub-folder called Francis Tanner and ultimately in some folder called deeds or whatever. However, it is much simpler than that. I scan the deed. I name the file something like "Francis Tanner Deed 1768" and then I give the file a number of keywords, perhaps listing all the people mentioned in the deed, the location, the dates, the recording number or whatever, and then just stick the file into one huge folder called "Images for all programs" or something like that.
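For readers who like to see an idea spelled out, here is a minimal Python sketch of the "name it and tag it" approach. It is only an illustration, not the way OS X stores keywords: the `ScannedFile` class, the `build_keyword_index` function, the `.tif` and `.jpg` extensions and the second record are all made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ScannedFile:
    """One scanned document: a descriptive file name plus free-form keywords."""
    name: str
    keywords: set[str] = field(default_factory=set)


def build_keyword_index(files):
    """Map each lowercased keyword to the names of the files tagged with it."""
    index = defaultdict(set)
    for f in files:
        for kw in f.keywords:
            index[kw.lower()].add(f.name)
    return index


# Hypothetical records; only the deed's base name comes from the post.
files = [
    ScannedFile("Francis Tanner Deed 1768.tif", {"Francis Tanner", "deed", "1768"}),
    ScannedFile("Tanner Family Photo.jpg", {"Henry Martin Tanner", "photo"}),
]
index = build_keyword_index(files)
print(sorted(index["deed"]))
```

Notice that there are no folders anywhere in the sketch. Every file sits in one flat list, and the keywords do all the organizing.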

Since I use Apple Macintosh computers running OS X Snow Leopard, I can now use my handy find command (or Spotlight) to find any file on my computer (or in three terabytes of storage) in about 2 seconds or less. So what does it mean to be organized? It means that all of the documents and photos on my computer are named and have attached metadata. If I need all of the documents with the name Henry Martin Tanner, I just type in his name and there they are, all 300 items, instantly. So why do I need to put all those files in little color-coded (or whatever) folders? I really don't.
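To show what "just type in his name" amounts to, here is a rough Python sketch of a search over file names and keywords. It is not how Spotlight works internally; the `find_files` function and the little catalog are hypothetical, and the match is a simple "every word appears somewhere" test.

```python
def find_files(catalog, query):
    """Return names of files whose name or keywords contain every query word."""
    words = query.lower().split()
    matches = []
    for name, keywords in catalog.items():
        # Search the file name and all of its keywords as one block of text.
        haystack = " ".join([name, *keywords]).lower()
        if all(word in haystack for word in words):
            matches.append(name)
    return sorted(matches)


# Hypothetical catalog: file name -> keywords attached to that file.
catalog = {
    "Francis Tanner Deed 1768.tif": ["Francis Tanner", "deed", "land record"],
    "Family group 1900.jpg": ["Henry Martin Tanner", "photo"],
    "Census 1880.tif": ["census", "Henry Martin Tanner"],
}
print(find_files(catalog, "Henry Martin Tanner"))
```

Neither matching file has his name in its file name. They turn up anyway because the name is in the keywords, which is the whole point of attaching metadata.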

What if I need to find the original document somewhere in the huge pile of stuff I have all over the house? Good Luck. That is why I am scanning everything. That is why I have already scanned over 70,000 documents and keep scanning and naming files all the time, obsessively. So I don't have to remember where I put it.

Would this system work for you? You are probably saying that I need counseling and that I will come to no good end. But wait, the next time you have to organize about 100,000 documents, how about giving each one a unique number, name or metadata and letting the computer do the work? Isn't that what computers are for? Hmm. Do you think I know what I am doing? (Or doubt my sanity?) Perhaps you would like to read about the Library of Congress's experience as an early adopter of the OAI Protocol for Metadata Harvesting. Of course, there are vast differences between what the Library of Congress is doing and what I am doing, but the similarities are striking.


  1. You have put a saner slant on my recent insane posting regarding organization.

    Obviously I deserve compensation or credit somehow. You're welcome. ;-)

  2. LOL - I have the same problem! I recently bought a 5' tall 4-drawer filing cabinet. All my scanned documents are named in the same order - B for Birth, or D for Death (M for Marriage, C for Census etc), followed by the ancestor's year of birth or death, followed by the ancestor's name. I have other "codes" for Newspaper clippings, Monumental Inscriptions (from headstones), Wills & Testaments, but they all follow the same routine and I file them as such. I can then go to my Source Summary (I use Family Historian as my default database) and print to PDF a searchable Master List of Source Documents, and I file the docs in this order. The name in the PDF matches the file in the drawer in the cabinet, so I can pull the original out easily. I can also attach a scanned copy of each document to each citation in the database, which means I never have to pull out the original in future, providing my database is OK. The only improvement I can see would be to do away with keeping the originals, but I can't see myself disposing of copies that I paid for anytime soon, and there's always the shelf life of CDs/DVDs to think about :-)

  3. I've been thinking about this post for almost a week now and am close to taking the plunge. However, how do you handle a group of photos that, in my opinion, need to be contained as a collection? For example, I have several 19th century photo albums in which only some of the photos are identified. I revisit them on occasion and the fact that I know that a particular photo was in Grandma So-and-So's album has helped me narrow down who may be the subject. Looking at them in the context of a collection is important to me, I think.
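
The naming routine described in the second comment above (a code letter, then a year, then the ancestor's name) could be sketched in Python roughly as follows. The hyphen separator, the `pdf` extension, the `source_file_name` function and the sample ancestor are assumptions made for illustration; the comment gives only the order of the parts.

```python
# Code letters taken from the comment: B, D, M and C.
RECORD_CODES = {"birth": "B", "death": "D", "marriage": "M", "census": "C"}


def source_file_name(record_type, year, ancestor, extension="pdf"):
    """Build a name like 'B-1850-Jane Example.pdf' (separator and extension are assumed)."""
    code = RECORD_CODES[record_type.lower()]
    return f"{code}-{year}-{ancestor}.{extension}"


names = [
    source_file_name("death", 1912, "Jane Example"),
    source_file_name("birth", 1850, "Jane Example"),
    source_file_name("census", 1850, "Jane Example"),
]
# Sorting groups the names by code letter, then year, then ancestor.
print(sorted(names))
```

Because the sort order falls out of the name itself, the same order can be used for the files on disk, the printed list and the paper copies in the drawer.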