Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, April 10, 2012

Organizing Genealogy? Fact or Fiction

What does it mean to be "organized"? Yesterday, a friend who is very actively pursuing genealogical research announced to me that she was now "organized." She had three-ring binders with all of her family information in sheet protectors, with yellow highlights on each photocopied or printed sheet of source information. As we sat down to analyze some of her questions, she kept flipping back and forth through the pages, frantically trying to find the information she wanted to show me. I must also mention that she was working only two, or at most three, generations back in time. Did her notebooks and sheet protectors make her organized?

In my friend's case, neatness and compartmentalization were being used as a substitute for organizing the information in a format that made it available and functional. Just because you pick up the mess and put it in a drawer does not mean you are organized. Of course, I sit here with my piles of papers at the other end of the spectrum. My files are not "organized" in the traditional sense, nor are they neat or anything that might go along with that type of description. Are my piles disorganized? The real question is this: how long will it take me to find one particular document?

If you were talking about this in the image and document retrieval area, you would speak of the mean time to do a search. From a utility standpoint, the optimal organization is the one that produces the lowest mean time to find an item in the collection. The most disorganized structure would require a complete search of the database every time a search was made. At some point the time to search for and find a given item reaches its minimum. Beyond that point there is no further reason to organize the data and, in fact, further organization may lead to increased search times.
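To make the idea of mean search time concrete, here is a minimal Python sketch. It is only an illustration: the document names, the pile of 1,000 items, and the function names are all made up for the example, and counting comparisons stands in for the time a person spends flipping pages.

```python
from typing import Dict, List, Optional, Tuple


def linear_search(documents: List[str], wanted: str) -> Tuple[Optional[int], int]:
    """Scan an unordered pile from the top.

    Returns (position, comparisons made). A document that is not in the
    pile costs a complete pass through every item.
    """
    for position, title in enumerate(documents):
        if title == wanted:
            return position, position + 1
    return None, len(documents)


def build_index(documents: List[str]) -> Dict[str, int]:
    """Organize once: map each title straight to its position."""
    return {title: position for position, title in enumerate(documents)}


def mean_linear_comparisons(documents: List[str]) -> float:
    """Average comparisons needed if every document is looked up once."""
    total = sum(linear_search(documents, title)[1] for title in documents)
    return total / len(documents)


# Hypothetical pile of 1,000 documents.
pile = [f"document-{n}" for n in range(1000)]
index = build_index(pile)

print(mean_linear_comparisons(pile))  # average cost of searching the pile
print(index["document-742"])          # one lookup, no scanning
```

The index costs some effort to build and maintain, which is the trade-off described above: organizing pays off only as long as it keeps lowering the time to find something.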

Let me give an example of two approaches to genealogical organization: attaching a copy of each pertinent document to each individual to whom the document pertains, or classifying documents by category, such as putting all of the related documents together in a folder. If you are using a paper-based system, the first type of organization is wasteful. Imagine a Census record showing ten to fifteen related individuals on the same page. In the first type of organization, the "one to many" document must be duplicated for each individual. That means 15 individuals on the same document, 15 copies. So why not put a "reference" to the document on each individual record and store one copy of the document? The drawback is that every time you want to look at the document, you have to do at least two searches: one for the individual's record and then one for the copy of the document.

Hmm. Sounds like we need some way to automate the search. How about using a computer? My friend likes to "see the document on paper." She claims that it helps her visualize the information better. But here is the flaw in her thought process. She was not visualizing the information at all; she was shuffling through the pages trying to find the documents she was looking for. In addition, she had another problem: she did not know what information she had.

My solution to the problem is relatively simple. It essentially combines the first system and the second system. You keep one copy of the document in a computer file and you link every individual that pertains to that document to the original in the file. One original and many links.
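For readers who like to see the structure spelled out, the "one original and many links" idea might look something like the Python sketch below. It does not reflect how any particular genealogy program stores its data; the class names, the file path, and the household of fifteen people are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    doc_id: str        # stable identifier (hypothetical naming scheme)
    path: str          # where the single stored copy lives
    description: str


@dataclass
class Person:
    name: str
    document_ids: List[str] = field(default_factory=list)  # links, not copies


documents: Dict[str, Document] = {}
people: Dict[str, Person] = {}


def add_document(doc: Document) -> None:
    """Store the one and only copy of a document record."""
    documents[doc.doc_id] = doc


def link(person_name: str, doc_id: str) -> None:
    """Attach a reference to an existing document to a person."""
    if doc_id not in documents:
        raise KeyError(f"unknown document: {doc_id}")
    person = people.setdefault(person_name, Person(person_name))
    if doc_id not in person.document_ids:
        person.document_ids.append(doc_id)


def documents_for(person_name: str) -> List[Document]:
    """Follow a person's links to the stored documents."""
    return [documents[d] for d in people[person_name].document_ids]


def people_in(doc_id: str) -> List[str]:
    """List everyone linked to one document."""
    return sorted(n for n, p in people.items() if doc_id in p.document_ids)


# One census page listing a hypothetical household of fifteen people.
add_document(Document("census-001", "census/example_page.jpg", "Census page"))
for n in range(1, 16):
    link(f"Household member {n}", "census-001")

print(len(documents), len(people))  # one stored copy, fifteen linked people
```

In this sketch each person holds an identifier rather than a file path, so moving or renaming the image file means updating one `Document` record instead of fifteen links.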

The second part of this "organization" is making sure that every document or image in your database is not only linked to an individual, but also has pertinent and identifying metadata. Then, instead of searching through either a pile or a notebook, you do a global search on the computer and find the document in seconds. At this juncture, I have to gloat. I use Macintosh computers, and Spotlight can search a huge 3 TB of information in seconds. I am not quite so happy with the Windows search capabilities.
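As a rough illustration of why metadata matters, here is a small Python sketch of a keyword search over metadata records. It is not how Spotlight or any other search tool works internally; the catalog entries, field names, and file paths are invented for the example.

```python
from typing import Dict, List

# Hypothetical catalog: one metadata record per stored document or image.
catalog: List[Dict[str, object]] = [
    {"path": "census/1900_example_county.jpg", "type": "census",
     "year": 1900, "place": "Example County", "people": "John Doe; Mary Doe"},
    {"path": "vital/john_doe_death_1931.pdf", "type": "death certificate",
     "year": 1931, "place": "Example County", "people": "John Doe"},
    {"path": "photos/mary_doe_portrait.jpg", "type": "photograph",
     "year": 1910, "place": "unknown", "people": "Mary Doe"},
]


def search(records: List[Dict[str, object]], query: str) -> List[Dict[str, object]]:
    """Return the records whose metadata contains every word in the query."""
    terms = query.lower().split()
    hits = []
    for record in records:
        # Flatten all metadata values into one lowercase string to search.
        haystack = " ".join(str(value) for value in record.values()).lower()
        if all(term in haystack for term in terms):
            hits.append(record)
    return hits


for hit in search(catalog, "census 1900"):
    print(hit["path"])
```

A file with no metadata can only be found by its name or by opening it, which is the computer equivalent of flipping through the binder.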

Probably more about this later.


1 comment:

  1. I agree with everything except the linking part. That's a thankless task unless you've absolutely decided what your image-naming convention and folder structure is and you're NEVER going to change it.
