Pages

Tuesday, November 10, 2015

Using Microsoft Excel® in Genealogy

Developing a spreadsheet is a powerful way to view and analyze large amounts of information. Most people associated spreadsheets with business activities and accounting. But the powerful analytic tools available in programs such as Microsoft Excel® can also be applied to some types of genealogical research. Knowledgeable genealogical researchers are using spreadsheet applications to find patterns in their family lines that lead to breakthroughs discoveries of previously end-of-line situations. These remarkable results have been facilitated by the vast online collections of indexed source records.

One of my friends recently spent over two hours showing me some advanced methods of entering and analyzing huge lists of potential ancestors and then using the power of the spreadsheet applications to extract target families from the mass of data. In talking about my experience, I have found other very sophisticated genealogists who have been doing the same thing from some time now. The results of using a spreadsheet application are only limited by your own ability to construct spreadsheets and analyze the data.

Of course, if you don't know how to use a spreadsheet program, you will probably not be in a position to utilize this powerful method of teasing out specific family information from huge online databases. This points out an important fact about genealogy. Many of the advances in technology are being underutilized by genealogists simply due to the fact that they have no idea how to use them.

Using Excel spreadsheets is not a particularly new or innovative idea. Here are some links to articles and posts about the process:

However, the program is much more powerful than simply using it to fill in paper-based forms. The entire technique involves copying large amounts of data from a search, for example, on Findmypast.com involving all of the individuals with a target surname in an entire parish and then resorting the results to cluster dates and specific places. Using this kind of technique, you can begin to identify family groups. Here is that example in specific steps.
  1. Log in to Findmypast.com (or a similar large online database).
  2. Enter search terms in a general way, such as entering a surname and a parish name. In this case, I entered the surname "Parkinson" and the place as "Huntingdonshire."
  3. I got 360 results which I highlighted, page-by-page and copied into an Excel Spreadsheet.
Here is what I got when I pasted the first page into a spreadsheet:


As you can see, the entries need some considerable work at this point. The first part of the cleanup involves auto-fitting the entries.


Next, I would separate the comma delimitated place names in the place column into separate columns. But even though you don't do that, you can sort the entire data sheet by the place names rather than the default alphanumeric sort on the names of the people. I had to unmerge some cells to do this which points out the major issue with using a spreadsheet to manipulate genealogical data, you have to know how to find the data and you have to know how to work with a spreadsheet. Here is what it looks like when the data is sorted.


If I were to add all of the results of the search I would then have all of the entries in order by the place. Perhaps you can see the immediate value of this. Now, what happens if I simply search for a specific place in Findmypast.com? Then, of course, I would not have a list of all the Parkinsons in that parish to explore. 

What I have found with this sort of analysis is that either you see the advantage immediately or you have to be shown with specific examples. Unfortunately, I am not able to sit down beside all of my readers and give that sort of assistance. All I can say is I am available at the Brigham Young University Family History Library most anytime.


 

4 comments:

  1. Well done ! I have used this approach to "extract" christening data from a parish, for a given name, then sort by parent names = family groups. This helps me see not only families together, but also to see gaps in the birth ages, which MAY suggest there was an additional child (or two).

    ReplyDelete
  2. i use a spread sheet to format my sources. I have set up a table with several sheets. One for each type of source. Then all I have to do is enter the information and it formats the source and I can copy and paste to RM7. It has saved me lots of time and makes my sources consistant.

    If anyone wants a copy it uses LibreOffice and I would be glad to share them.

    Roger

    ReplyDelete
  3. Great article about the powers of Excel for genealogy !

    ReplyDelete
  4. James,

    I want to let you know that two of your blog posts are listed in today's Fab Finds post at http://janasgenealogyandfamilyhistory.blogspot.com/2015/11/follow-friday-fab-finds-for-november-13.html

    Have a great weekend!

    ReplyDelete