Monday, February 28, 2011

Quick Correction on last post on New FamilySearch

Please be aware that New.FamilySearch.org is no longer a usable program. If you have come to this blog post, please click on the logo above and see the latest posts on FamilySearch and other issues. This post is long out of date. Click Here for the latest information.


A comment from a reader cleared up a slight mystery, for me anyway. New.FamilySearch.org has an update link on its startup page. Here is a screen shot of the page with an arrow showing the link:



This link takes you to a page that is updated from time to time with changes in the program. However, the most recent page was not the one referred to by The AncestryInsider. That page is located on the home page after you have logged into the program. Here is a screen shot of the location of the second update page:

This second link is to a slightly more informative PDF file that contains the information. Thanks for the heads up. I will remember to check both places in the future.

New FamilySearch goes public?

Thanks to The Ancestry Insider (AI) for another scoop. His post today is entitled, "NFS Public Release Begins." He quotes an unidentified individual at FamilySearch as saying:
“In February 2011, we’ve invited a limited number of public users to begin testing public access to the new FamilySearch [Family Tree] website,” disclosed FamilySearch. “These valued testers will help us make sure the system can handle the increased load.” Don’t bother begging; FamilySearch has already selected the testers.
I am not sure that this constitutes going public, hence the question mark in the title. The AI is correct about the confusing terminology. I have begun calling the FamilySearch.org website released in December, 2010, the "updated FamilySearch.org website." This is to try not to use the term "new"  which is getting pretty old now anyway. Technically we aren't talking about a new site at all. FamilySearch.org has not changed URLs and is still tied to the old FamilySearch.org website. The site we all call "New FamilySearch" should properly be called FamilySearch Family Tree or something of that nature.

New.FamilySearch.org (NFS) does have an entirely new User Profile and Preferences. All of the old Preferences have been re-set to hide names, addresses and phone numbers. Unless users go back into their preferences and change those items, they will not be shown. I have found it valuable to have my contact information in New FamilySearch and so I opt for having my info made public.

I agree that those coming into the system will have a different experience than those of us who already have generations of ancestors mangled by the program. I find that anyone who has little or no previous contact with families listed in the New FamilySearch program will have an easier time using the program than those of us who have been in the system for years and years, dating back to the Ancestral File and before.

As for AI's comments about SCOE, I am not sure I see anything remotely moving in that direction yet. By the way, SCOE stands for Source Centric, Open Edit. Right now, NFS is a long way from SCOE.

Sunday, February 27, 2011

Understanding Topographical Maps for Genealogists -- Part Two

In my first post in this series I asked a question, "What state lies to the south of Arizona?" I have yet to get a correct answer to the question. But there is a follow up question, "What is the name of the country to the south of Arizona?" The answer to the second question helps clarify the answer to the first question. Both questions point up the fact that there is an abysmal lack of geographic knowledge in the U.S. population. Hopefully, genealogists will be like stamp collectors and learn about all the countries of the world and a little history on the side.

To accurately identify someone you need to know three things: a name, a place and a time. This is deceptively simple; however, names are difficult and sometimes elusive, as are both places and times. Places can be best identified and pinned down using a map. You may think you know where your ancestors lived or were buried until you try to go there. You will soon learn that knowing the name of a place and finding that same place, either on a map or in the real world, are entirely different things. Since it is a given that you are going to use maps, you need to understand some basics.

Now back to geography and topographic maps. We are back to contour lines. Just in case, here is another example showing contour lines:


Contour lines represent a series of points of equal elevation. The map screen shot above uses both contour lines and shading to show the changes in elevation. As you can see, lines that are closer together represent steeper slopes. Other noticeable features of topographical maps are the grid lines. Grid lines give the map makers the ability to represent any point on the ground. Each of the grid lines has a number, usually corresponding to latitude and longitude. Any point on the map can be identified by the combination of the two numbers of the lines that cross there. In addition, by looking at the contour lines, the elevation of the point can also be determined. Here is another screen shot of an aerial map showing the numbered grid lines:

Historically, there were four different types of map grids used in the U.S.
  • Geographic
  • UTM
  • State Plane
  • Public Land Survey
The Geographic Coordinate System is based on degrees of latitude and longitude. Latitude lines run parallel to the equator and are divided into 180 equal portions from north to south. In the northern hemisphere, the lines are numbered from 0 degrees to 90 degrees north and in the southern hemisphere the numbers are from 0 degrees to 90 degrees south. Longitude lines (also known as meridians) circle the earth perpendicular to the equator. Major longitude lines are numbered in degrees, with world maps showing lines every 30 degrees east and west of the 0 degree line or prime meridian, which passes through the Royal Observatory at Greenwich. On the opposite side of the world from the Prime Meridian is the 180 degree meridian, which the International Date Line roughly follows.

Most maps showing a small portion of the earth's surface represent the lines as intersecting at right angles and forming rectangles. However, because the lines are really drawn on the surface of a sphere, the segments they enclose are not true rectangles. At the equator, the distance between lines of longitude is about the same as the distance between lines of latitude. As the longitude lines are projected both north and south, they converge and meet at the poles, so in reality there is a small difference in width between the top and the bottom of any map segment bounded by longitude lines.


Why is this important to genealogists? Well, in eastern Arizona for example, there was more than one land survey and because of cumulative errors introduced, in part, due to the curvature of the earth, the two surveys do not match up. If you are trying to locate a piece of property along the survey discontinuity, you will find all sorts of errors.

Because latitude and longitude are measurements on the surface of a sphere, the lines are measured in degrees, with a full circle of 360 degrees. To be more useful in locating smaller features on maps, the degrees of latitude and longitude are further divided into minutes and seconds. At the equator one second of latitude or longitude would be approximately 101.3 feet.
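For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes the average circumference of 40,030,173 meters quoted further down in this post, and the function names are my own inventions for illustration. The sample latitude is the CN Tower value that also appears further down.

```python
# Average circumference of the earth in meters (the figure quoted in this post).
EARTH_CIRCUMFERENCE_M = 40_030_173
FEET_PER_METER = 3.28084


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes and seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def arc_second_in_feet():
    """Length of one second of arc along a full circle of the earth, in feet."""
    seconds_in_circle = 360 * 60 * 60  # 1,296,000 seconds in a full circle
    return EARTH_CIRCUMFERENCE_M / seconds_in_circle * FEET_PER_METER


# One second of arc comes out at about 101.3 feet.
print(round(arc_second_in_feet(), 1))

# 43 degrees, 38 minutes, 33.24 seconds is about 43.6425667 decimal degrees.
print(round(dms_to_decimal(43, 38, 33.24), 7))
```

This treats the earth as a perfect sphere, so it is only an approximation, but it is close enough to show where the 101.3 feet comes from.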

The Universal Transverse Mercator coordinate system (UTM) is based on a transverse cylindrical conformal projection. (That really helps, doesn't it?) What you need to know about this type of representation is that it keeps distortion small by dividing the earth into narrow north-south zones and projecting each zone separately. Most topographical maps published recently use the UTM system. In older maps the UTM system is shown along the edge of the map. Distances and locations in the UTM are measured in meters. The average circumference of the earth is 40,030,173 meters, so there are about 10,007,543 meters from the equator to the pole in each hemisphere. In order to make the numbers less imposing, the earth is divided into UTM Zones, each of which is 6 degrees of longitude wide, so there are 60 zones around the world. Here is a representation of the U.S. showing the UTM zones:
http://en.wikipedia.org/wiki/File:Utm-zones.svg

Quoting from Wikipedia, "A position on the Earth is referenced in the UTM system by the UTM zone, and the easting and northing coordinate pair. The easting is the projected distance of the position eastward from the central meridian, while the northing is the projected distance of the point north from the equator (in the northern hemisphere). Eastings and northings are measured in meters." For example, the geographic coordinates 43°38′33.24″N 79°23′13.7″W (43.6425667°N, 79.387139°W), the location of the CN Tower, translate into Zone 17 with a grid position of 630084m east, 4833438m north.
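The zone number in that example is easy to work out yourself. Here is a minimal Python sketch, assuming the standard numbering in which zone 1 starts at 180 degrees west and each zone is 6 degrees of longitude wide. The function names are hypothetical, and the sketch ignores the special-case zones around Norway and Svalbard. It only finds the zone; converting to an easting and northing takes the full projection formulas, which are not shown here.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1 to 60) for a longitude in decimal degrees.

    West longitudes are negative. Special-case zones are not handled.
    """
    if not -180 <= longitude_deg <= 180:
        raise ValueError("longitude must be between -180 and 180 degrees")
    zone = math.floor((longitude_deg + 180) / 6) + 1
    return min(zone, 60)  # a longitude of exactly 180 belongs to zone 60


def central_meridian(zone):
    """Return the longitude of the central meridian of a UTM zone."""
    return zone * 6 - 183


# The CN Tower longitude from the example above falls in zone 17,
# whose central meridian is 81 degrees west.
print(utm_zone(-79.387139), central_meridian(17))
```

A longitude of 111 degrees west, which runs through Arizona, lands in zone 12 by the same arithmetic.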

Well, one thing I learned writing this post is that Wikipedia is also in Navajo. (You don't want to know how I found that.) Anyway, here is the point. Maps are essential to making accurate identifications of individuals in many cases. If you find your ancestor had a land grant or homesteaded, you can locate where they lived on a map and determine the likely location of any records that might have been generated as a result of their living in a certain area. In order to adequately understand historic maps, it is essential that you have a basic understanding of map projections.

Next time, I will continue with State Plane and Public Land Survey maps and we will also look at scale.

Saturday, February 26, 2011

Understanding Topographical Maps for Genealogists -- Part One

Not all maps are created equal. As a research genealogist you should be routinely looking at maps of almost every kind and description, old, new and online. In a time when Google Earth has packed a huge amount of information into every view, it is still necessary to look at other types of maps, everything from Sanborn's old insurance maps to modern street level zooming maps of almost everywhere. You may find, as I did recently, that even though two families lived in different states, they really only lived about 15 miles from each other! Along with ignoring history or being ignorant of it, lack of map education is one of the great challenges facing the expansion of genealogy.

In my last post, I got a little emotional about the lack of history education. Well, now I am starting in on geography. For example, here in Arizona, these are the high school graduation requirements in social studies, effective for the graduating class of 2012: “b. Three credits in social studies to include the following: i. One credit of American history, including Arizona history; ii. One credit of world history/geography. iii. One-half credit of American government, including Arizona government; and iv. One-half credit of economics.” (A.A.C. R7-2-302.01). See the Education Commission of the States for your state's requirements. In four years of high school, students get one class on world history and whatever geography is included!


When I taught Spanish for five years at a local community college, I would usually start the year by asking, "What is the name of the state to the south of Arizona?" Ask yourself the same question and let me know the answer. In a class of over thirty Spanish students, usually not one person could answer the question. No, I am not going to tell you; go look it up.


This Blog is read almost entirely by genealogists. How many of you out there have had a topographical map in your hands recently? How many of you would recognize one if you saw one? OK, collectively we are mostly way above average in knowledge when it comes to some areas including interest in history and maps. But I suspect, from dealing with thousands of people over the past few years that geographic knowledge is not very great even among genealogists.  That said, let's get down to business.


Here is a screen shot of a typical topographical map:



By definition a topographic map is a planar representation of the natural and cultural features on the world's surface. For genealogists, topographical maps are a rich resource of information about the localities where our ancestors lived and where they moved and traveled.

Usually, a topographical map will give you information about 
  • roads, buildings, urban development, political and natural boundaries, railways, power and transmission lines
  • water features including lakes, rivers, streams and swamps
  • relief features including mountains, valleys, slopes, caves and sinks
  • types of vegetation including forests, cultivated land and irrigation
  • place names
For genealogists, topographical maps include cemeteries, schools, churches, government buildings, and many other useful references. You haven't really been somewhere, even if you go there in person, unless you have seen a detailed map of the area. It is too easy to overlook or be unaware of what you are looking at without a map reference to give you the perspective of the area.


The colors on a topographical map are significant. Black shows cultural (i.e. man-made) features. Red is used for paved roads and the accompanying symbols and identifications. Orange indicates unpaved roads and unnamed roads and streets. Brown is used for contour lines, elevations, and to show sand and eskers. Blue represents water features. The names of water features are also shown in blue. Green is used for wooded areas, orchards and vineyards. Grey is used to show map information and legends. Sometimes purple is added to show added information such as updates.


Topographical maps are different from ordinary road maps because they are tied to the earth's surface by latitude and longitude and also correspond to the Universal Transverse Mercator coordinate system. Translated into English, this means that each individual map is at the same scale as adjoining maps and can be physically located on the earth's surface.


One of the most prominent features on topographic maps is the contour lines. These lines connect points on the ground that have the same elevation, measured as height above sea level in feet or meters. In making a topographical map, the map maker (cartographer) surveys the land both in distances and altitude. Each map has a standard interval for contour lines, so that the slope of the land can be represented by the spacing of the lines. Contour lines that are close together show a steep change in grade; lines farther apart show little or no slope. For example, if the contour interval were 40 feet, there would be a 40 foot difference in elevation between two adjacent contour lines.
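To put numbers on the idea that closer lines mean steeper ground, here is a minimal Python sketch. The function names are hypothetical, and it assumes you have already measured the distance between two adjacent contour lines on the map and converted it to a ground distance using the map's scale.

```python
import math


def slope_percent(contour_interval, ground_distance):
    """Grade between two adjacent contour lines, as a percentage.

    Both values must be in the same unit (feet or meters).
    """
    if ground_distance <= 0:
        raise ValueError("ground distance must be greater than zero")
    return contour_interval * 100 / ground_distance


def slope_angle_degrees(contour_interval, ground_distance):
    """Angle of the slope above horizontal, in degrees."""
    return math.degrees(math.atan2(contour_interval, ground_distance))


# With a 40 foot contour interval, lines 200 feet apart on the ground
# mean a 20 percent grade; lines 2,000 feet apart mean a 2 percent grade.
print(slope_percent(40, 200), slope_percent(40, 2000))
```

The same 40 foot interval gives a steep hillside or a nearly flat valley floor depending only on the spacing, which is exactly what your eye picks up from the map.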


This is just a start to a series on maps. Stay tuned for the next installments.




Friday, February 25, 2011

A sense of history

In researching a friend's ancestor today, I realized that the ancestor may have fought (or at least served) in the War of 1812. That reminded me of when I was in high school (yes, they did have high schools that long ago). We all had to take history classes. But in the World History class, we never seemed to get much further along during the year than the Roman Empire. In American History, it seemed like the year ended in the U.S. Civil War. I think our books might have had a page or two about some of the other wars and maybe I missed the classes on the days they talked about 20th Century U.S. history. I understand that now the students are lucky to get one history class out of four years of school. However, this post is not all just hand-wringing over the lack of history education in the schools; it is really about what the lack of history education has done to genealogy.

Some of the more developed lineage linked database programs available today have the ability to generate a timeline, putting every person into the context of world and local events. But even if the programs will generate such a useful tool, there is nothing that motivates people to use the feature and, what is probably more the case, the users have no idea of the significance of all those dates anyway. Here is a little test to see what you know about U.S. history. Those of you who are reading this in different parts of the world will just have to be patient. I am sure that knowledge about English, Scottish, Australian or New Zealand history is just as dismally lacking. The following is a list of fourteen wars that settlers to North America could have participated in. They are not in order. Try not to peek; at the end of the post, I have put them in the right chronological order. Some of these wars were also part of larger conflicts that involved European countries.

The French and Indian War
Queen Anne's War
World War II
King George's War
American Revolutionary War
Dunmore's War
Mexican War of Independence
Northwest Indian War
Whiskey Rebellion
The American Civil War
War of 1812
Black Hawk War
World War I
The Spanish American War

Now, here is the harder question: who fought each war and why? The point of this is simple: GENEALOGY IS HISTORY AND ALL HISTORY IS GENEALOGY. Not knowing that the history exists is the main problem, not just not knowing when the wars were fought. Computers were invented so I didn't have to memorize wars and dates. But all the computers in the world are not going to help me do my family history if I ignore the history part. I can always get by with one or two generations, especially if they all lived in the 20th Century, but what about the past?

What is an even more disturbing issue is that very few young people have been given the chance to learn and to love history. If they are going to do "their genealogy" because they have a good background in video games and cell phones, how does that help them know history?

OK so this is a rant. Sorry.

Now here's the list with dates. If you want to check out a more complete list of wars, go to List of Conflicts in North America.
1702-1713 Queen Anne's War
1744-1748 King George's War
1754-1763 The French and Indian War
1774 Dunmore's War
1775-1783 American Revolutionary War
1785-1795 Northwest Indian War
1794 Whiskey Rebellion
1810-1821 Mexican War of Independence
1812-1814 War of 1812
1832 Black Hawk War
1861-1865 The American Civil War
1898 The Spanish American War
1914-1918 World War I
1939-1945 World War II

Thursday, February 24, 2011

Genealogy Inc. v. Genealogists -- Part Two, What is in a name?

In my last post I began a test to see what someone with little or no information about their family could reasonably expect to find out from the commercial genealogy websites. As I indicated, I started by putting my own name and birth information into Family Tree Maker (FTM). I used FTM because it will automatically link into Ancestry.com's huge database. I figured I would pretend I watched the TV program and bought the program at Costco and came home, put in my name and was off to the races.

As I frequently say, Hmmm. I sure didn't get much. No one bothered to tell me (I am playing the part of the novice now) that my name wouldn't be in the program. I did find one reference to me in a directory right at first but couldn't find it again. A search on my name and birth info (I am not dead yet) returned 334,611 records and as they say, "Sorted by Relevance." Well, relevant or not, none of them are me. There are also 463 possible living people that match my name in MyLife. When I click on that link, I come up with all sorts of people. This might not be as easy as the ads would make it seem.

I try clicking on a few categories and still have no success. Further clicking gets me further from the goal. I come up with all sorts of death and burial records. At this point in the test I am pretending to make a phone call to my genealogist friend who informs me that genealogy is for dead people, which I have begun to suspect in more ways than one. So I decide to try looking for my parents.

The search for my father begins with Ancestry.com again. I am still using Family Tree Maker as the front end for the program. Just to be a little fair, I put in his death date along with his name, although that already starts to erode my resolve to be a novice. I frequently deal with people who do not know any dates for their parents. But here we go with a search on his name and one date.

The jackpot: a listing in the Social Security Death Index (SSDI) gives him a birth date and a social security number. Reading a whole lot of information about the SSDI, I find that I can order the original of his social security application form, which is called the SS-5, from the Social Security Administration. (Side note: it does give me a link to the Social Security Administration website but doesn't seem to mention that I have to use Form SSA-711 to request the information.) The program also crashes when I click on the link and try to look for the right form to fill out. Oh well, I will push on.

I merge the information from the SSDI into my local program and try another search with the new information. I find a number of references to City Directories. Then I find the 1930 U.S. Census record. Suddenly, I have grandparents. Now I am getting interested. (Still pretending folks). I also find his WWII Enlistment Record. Within a few minutes, I also find my Grandfather, Grandmother and an Uncle.

Now I want to see if I can extend the line one more generation. Very quickly I find my Grandfather in the 1900 U.S. Census and now have a whole list of family members. I also find suggestions for my Grandfather of more Census records and a WWI Draft Record. Adding my Great-grandfather into my file gives me the option of looking for more information. At this point, I have way too much to digest. I have more records than I can look at right now. I chose the 1900 U.S. Census and suddenly have a whole family of relatives.

I could have gone on and on with my original goal of looking for family members in all of the big online commercial databases, but I think my test is over.

What did I find? Although it is not as easy as portrayed in the ads, finding information about U.S. citizens in the last 100 years or so is very doable on Ancestry.com. In theory and in practice, you could end up with a whole lot of information obtained from fairly reliable online original sources. You also end up with copies of all of the documents, such as the Census Records, Draft Registration Records and so forth.  Personally, I have already been through all of these records and was not surprised, but I was surprised at how quickly the evidence began to mount.

To be entirely fair, I have been using FTM off and on for the past couple of weeks to look for a friend's family. I am searching for a man born about 1782, possibly in Ohio. Believe me, the records do not just fall into my lap. I have found a few things online so far, but it takes real work and more than a little bit of experience to ferret out the records.

My conclusion: don't be too hard on the commercial sites. They by and large do a very good job of making huge amounts of data available that would otherwise be much harder to find. Are they worth the cost? If you don't think so, you probably haven't got research in the areas they cover. Could I do what I just did in constructing three generations of my family any other way out of original source documents? Not likely. I could get the information out of online family trees but there would be no sources and the information would be very unreliable. Good luck looking at commercial sites.

Genealogy Inc v. Genealogists -- Part One the Introduction

Genealogy has become a big business. There are thousands of small mom and pop enterprises but the genealogy landscape is becoming dominated by BIG BUSINESS in a big way. Is the pre-packaged commercial view of genealogy consistent with the reality of doing family history? Can I really go to a website and find my ancestry?

These questions prompted me to do an evaluation of exactly what kinds of records are offered by online commercial genealogy companies and whether or not I could actually compile a "family tree" from online sources. For the purpose of this analysis, I decided to exclude "free" online sources and focus on the main subscription services. I also decided to use my first four generations as a test case. I did this for two reasons: first, I have reasonably documented all of the information in those first four generations and, second, I would not be looking for people who were unknown or who had little or no documentation on their lives. As much as is possible, I determined to approach this issue as if I had almost no information. It is relatively common for me to be approached by someone who knows little more than their parents' names. In effect, I am trying to put myself in the position of a complete novice.

I decided I would put my name in a new file and the names of my parents without any information on birth dates or places. Both of my parents are deceased so there is no issue with obtaining information on living people.

Another issue is the fact that I know I can get a copy of a birth certificate from my State Department of Health and Vital Records some of which are online, but rather than use the information I know is readily available, I wanted to see what kind of picture I could get just from the commercial sites. For this exercise, and in order to expedite the process, I decided to use my Macintosh version of Family Tree Maker to start the inquiry. I will focus on the following commercial websites:


  • 19th Century U.S. Newspapers
  • Alexander Street Press - The American Civil War
  • Ancestry.com
  • FamilyHistoryLink
  • Footnote.com
  • The Genealogist
  • Godfrey Memorial Library
  • Heritage Quest Online
  • Making of the Modern World
  • NewEnglandAncestors.org
  • Sabin America
  • Supreme Court Records and Briefs
  • World Vital Records

These are some of the subscription resources available at the Mesa Regional Family History Center so I can readily consult all of them for this exercise. One thing I had to think about carefully was whether or not to use online compiled user submitted family trees. If I did that, I know that I could just download my entire four generations from Ancestry.com's Family Trees. I also have the information readily available from FamilySearch in several different formats. I know it is ridiculous to ignore these other online resources in a real situation, but what I wanted to determine is whether or not a reasonable pedigree could be compiled from only commercial websites? What do they really have for the average person?

One more factor, my family, at least in the first four generations, is completely from the U.S. My first immigrant ancestors are in the fifth and sixth generations. So, as they say, your results might vary if you were to try this same experiment. For that reason, I am not consulting some of the more common British and European sources.

Why am I doing this? Advertisements for genealogical databases have a tendency to minimize both the time and resources needed to produce a credible genealogy of anyone. Genealogy is basically very difficult to do. It takes time, effort and persistence, especially if you are starting from scratch and not tagging onto the coattails of a grandmother or aunt. I am curious as to how this study will turn out. What will I be able to find? Maybe the claims are all correct and I will find my ancestors in the records maintained by the commercial enterprises. I will wait to see what happens before drawing any conclusions.

Backup? Archive? What should I do?

If you haven't lost data, you are either obsessive and compulsive or haven't spent much time on a computer. The facts of computer life include having programs crash, losing power through blackouts, losing your flash drive, and almost an infinite number of other ways to lose your work. Genealogists are not immune to losing data. Lately, I have been thinking a lot about data preservation. The larger your database the more you tend to think about what would happen if it all crashed.

There are two different terms that are usually mentioned when anyone talks about data preservation; backing up your data and archiving your data. Although both of these concepts involve similar activities, there are some significantly different conceptual implications between the two terms.

Backing up information implies making a copy of a working file so that in the event of a power outage, disk crash or other catastrophe, the file will still exist in some form or another. It seems obvious, but a copy of a file does not become a backup unless it is separated from the original so that if the original is destroyed, the copy is not also destroyed. Let me put that another way: a backup only becomes a real backup (no matter what you think or what it is called) if the data is on a different and independent storage medium. If I am working on a file and the program stops me periodically and says something like "backing up your data," that statement is only true if the copy being made goes onto an entirely separate storage medium, such as an external hard disk or a flash drive. In this sense, a backup by definition is different from a program's file saving function. For example, if I am working in Microsoft Word and have selected automatic backup, the program will automatically save my work as I enter data. If my program crashes or the power goes off, I can recover the document I was working on. But in this case and all others, if the computer hard drive crashes or is stolen, everything on the drive may be lost. The fact that an individual program has "backed up" my file is meaningless.

To repeat, a true backup of a file or files has to exist independently of the original. So you will need to have some kind of external storage device to receive the backed up files. What would happen if your house burned down? In order to be even further protected from loss, a backup copy should be not only on a different device than the original, but in a different location. There are any number of media that can be used to back up a file: external hard drives, flash drives, tape backup, DVDs, CDs and a few others. Each has its limitations and each has its merits. But once the copy is made, the copy should be stored in an offsite location such as the house of a family member or friend.
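If you are comfortable with a little scripting, the copy-and-check step can be automated. Here is a minimal Python sketch, not a full backup program. The function names and the example paths in the comments are hypothetical, and it assumes the backup folder you give it really is on a separate device such as an external drive; the script cannot check that for you.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy a file into backup_dir and confirm the copy matches the original."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also keeps the file's timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target


# Example call with made-up paths; point the second one at your external drive:
# backup_file("family_tree.ged", "/Volumes/ExternalDrive/genealogy_backups")
```

Comparing checksums is one way to confirm that the copy is readable and identical before you carry the drive offsite.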

So what is an archive copy and how is an archive different from a backup? Physically they may be the same, but the difference is the intent. If I am going to archive a file, I do not intend to use the media repeatedly. Once I have verified that my archive copy is good, I will put the archived copy in a very safe location and leave it there to remain undisturbed. Archived records are usually those of a historical or reference nature. So how would I make an archived copy differently than a backup? You wouldn't. You might use and reuse the backup several times, but you would only use the archived copy if no other copy of the original were available. You would then make a copy from the archive and return the archive to storage.

The static concept of an archive is fine for physical files, books and such, but, there is a huge exception for electronically stored data. Anything stored in a digital format can be lost over time. In every case the data must be migrated (moved) to a newer media or format frequently to make sure the information is still readable. All electronically stored data should be updated whenever there is a change in operating systems or whenever there is a software upgrade. So the difference between a "backup" copy and an "archive" copy is somewhat blurred. I make the distinction by replacing the "archive" copy with new migrated media periodically. Over the years, I keep dragging my data along the preservation path, one new hard drive and one new program at a time.

It is great if you store your old genealogical database files in a safe place, but every time you upgrade your program, you also need to upgrade your stored files. By the way, this is a lot easier to say than it is to do consistently. Think about making regular backup copies of all your work, but also consider adding an additional archive file to the mix to make sure the first copy does not also go bad.

Wednesday, February 23, 2011

Updating GEDCOM -- Three different groups working?

Since the RootsTech Conference in Salt Lake City recently, at least three separate groups have surfaced that appear to be working on a new GEDCOM standard. It also appears that they are each approaching the issue from an entirely different perspective. From the online sources, it is difficult to determine if the different groups/organizations were completely aware of each other's efforts. Here is the present lineup:

BetterGEDCOM
This is a formally organized effort based on a site in WikiSpaces. It has been operating for some time and I have reported on some of the issues involved in previous posts. Quoting from a statement on the site:
BetterGEDCOM is an independent user community formed to develop internationally recognized genealogical technology standards for the benefit of the entire genealogy community. BetterGEDCOM has no affiliation with any commercial entity or any other particular genealogical organization but welcomes the participation of all interested parties.

BetterGEDCOM is a place to discuss and work out how to solve technological problems in genealogy and how software programs and services can best interoperate. We are concerned with genealogical technology issues broadly, but initially we will develop a GEDCOM update/replacement.
FamilySearch:
From different postings on the Internet and in the FamilySearch Research Wiki, it appears that FamilySearch has definitely begun its own effort to establish a new data transfer standard. Whether this should be viewed as an extension of GEDCOM or as a replacement remains to be seen. Here are some of the links that give an insight into the discussions going on:

http://www.genealogymedia.com/2011/02/14/rootstech-2011-data-model/  This is a report of an open discussion session at the RootsTech Conference. Quoting from Jordan Jones of GenealogyMedia.com:
It was at about this time that Tom Creighton, the CTO of FamilySearch, got up and announced that FamilySearch is nearly ready to announce a new proposed data model. This changed the meeting immediately. Instead of an open discussion, it became more like a press conference, with Tom fielding questions about what they have done, when the work will be shared, and so on. There was not a lot that he was able to divulge at this point.
 This statement is consistent with the mention made by Craig Miller in the Devotional Meeting at RootsTech that FamilySearch was working on updating GEDCOM. However, in answer to another question about the future of Personal Ancestral File, I remember the comment being made about the fact that FamilySearch was moving towards a "browser" based system. I would guess that rather than try to establish a new GEDCOM model they may be working on developing a more robust and universal API model.

Here are some links to some more discussion:

https://wiki.familysearch.org/en/Genealogical_Data_Standards_(RootsTech_Session)

http://rootstech.wikispaces.com/message/view/Open+Interactive+-+Standards/34538884

These are notes from an open discussion moderated by the Ancestry Insider, held at the RootsTech 2011 Conference.

Just a reminder from me, you might as well get used to looking at the FamilySearch Research Wiki, it is starting to vacuum up everything about genealogy.

You may also note that there is a new Wiki on Wikispaces called RootsTech. You may wish to check this out or participate. 

OpenGen.org:
Quoting from their website:
In response to this sea change, the International OpenGen Alliance was formed to bring together the most agile and active minds in the industry in service of creating a universal standard through which the full conversation of family histories could be shared and preserved.

The success of the world's genealogical societies and websites has illuminated the importance of a single standard. The public is now fully engaged in the discourse of family history. The time is now to measure the means through which this rich data discourse finds its place in each family history.
OpenGen has been at the task for some time and apparently is well into developing a sharing model.

Interesting developments. More as I find more.

Tuesday, February 22, 2011

The Trap of Failing to Migrate

No, this post isn't about birds getting stuck in the frozen north, it is about keeping up with technological changes. In the last few days I have been contacted by two people with genealogical data on 3.5 inch floppy disks. They were both trying to preserve work done years before and move it to a more current format. As for my 3.5 floppy disks, my wife decided they were all out of date and dumped the hundreds I had left. So far, I haven't missed anything, so I suppose that was a valid housekeeping move.

Data migration is one of those strange topics that seem to lack an audience. From the comments I receive to my posts, I can only assume that the vast majority of my readers are other Bloggers around the world. I can also safely assume that anyone who knows enough about computers to read my blog posts probably has already had their bad experience with lost data due to technological obsolescence and doesn't need this reminder. But here is the reminder anyway: MIGRATE YOUR DATA.

There are really a number of challenges to maintaining data integrity and preserving the information; they include physical limitations of various data storage options, software changes and the fact that preserved data may simply become inaccessible through changes in hardware storage. For example, how many of you still have a few old Iomega Zip drive cartridges lying around in drawers or on shelves? Have you looked at the connector on your old Zip drive lately? Guess what? My new computer doesn't have that type of connector. But I could buy an adapter. Likewise for my old SCSI drives, I just might get them to work with my new computer if I buy a SCSI to USB adapter. As a matter of fact, there is also a USB 3.5" Floppy Disk Drive available from TEAC or even from Walmart.

Does this all mean that the problem of data migration is a bugaboo? Not at all. There is absolutely no guarantee that even the adapters will work with future hardware and software. The data migration problem is real and can affect anyone with any type of software and hardware combination.

In the case of the specific inquiries I mentioned above, these old files on 3.5 inch floppy disks were likely old Personal Ancestral File (PAF) versions also. By the way, here is a detailed description of how to migrate older versions of PAF files on newer computers from the Silicon Valley Computer Genealogy Group. There is one key phrase in the instructions on migrating PAF, "The option you choose may be limited by which versions of the PAF software you have available on your computer." What if you don't have old versions of PAF on your computer? Are you dead in the water? Well, actually not. PAF versions are still available on the Internet. As a matter of fact the Ancestral Quest genealogy program will open most of the older versions of PAF without any difficulty at all. You only have to find some way of getting the file onto a computer with the Ancestral Quest program. As for me, I still have PAF on my computer and also have the latest version of Ancestral Quest.

The real data migration problem is just beginning. We are using more and more online programs. Many people have their data locked up in online family trees. As long as you keep your computer system up to date and buy a new computer every once in a while, you will likely keep moving your data on to the newer programs and computer. But what happens to online data? Who will assure you that changes in the operating systems and programs won't result in the loss of compatibility with the data online? No one. I have the same question about so-called lifetime guarantees. Whose lifetime? I like one computer guarantee that I saw recently that said the product was guaranteed until the company decided to quit supporting it. The solution is to always maintain a copy of your data in a genealogy program on your computer. If you want to share online, that's great. But there is no substitute (yet) for having your data under your own control.

What do I suggest? Every time there is a software upgrade, I upgrade. I then make sure the new program still recognizes my old files. As I have mentioned in previous posts, I have spent literally days at a time converting old files into newer formats. The process reminds me of how many old data formats (like old 78 rpm records) I have had to deal with in the past.

OK, so what can you do to minimize data loss?

1. Keep upgrading your backup systems. Buy new hard drives, watch for newer technology and keep moving your data.

2. Keep upgrading your software. Don't forget your old files, save them in the new software format while the programs still recognize the old formats.

3. Buy a new computer system every so often. Watch and wait until there have been substantial microprocessor upgrades and then make the change.

4. Don't leave valuable data in old boxes on shelves in closets. If it is worth having it is worth preserving. Move the data to your new computers and new storage devices.
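
A practical first step toward the suggestions above is simply finding out which file formats are sitting in your folders. The following is a minimal sketch in Python, not a finished tool: it assumes that a file's extension is a reasonable hint of its format, and the function name count_by_extension is made up for this illustration.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder):
    """Count the files under a folder (and its subfolders) by extension."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Lowercase so ".PAF" and ".paf" are counted together;
            # files with no extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Running count_by_extension on the folder where you keep your genealogy files gives a tally by extension. Any extension your current programs no longer open is a candidate for migration while you still have software that can read it.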

Good Luck. You might need it.

Monday, February 21, 2011

Comments on Monday Mailbox: Bulk Merge

As usual, the Ancestry Insider (AI) hits another home run (or sinks a three-pointer, whichever) with his short post on the New.FamilySearch.org (NFS) data issues, called Monday Mailbox: Bulk Merge. If you read the post, be sure to read the comments. But as is usual with me, I cannot let this opportunity go by without also commenting on the subject of the post.

First, I feel I need to clarify the statement, "FamilySearch seeded the tree with bad data, some from computer merging, some from human error." As I understand what happened, the bad data AI refers to is the conglomeration of the Ancestral File, the International Genealogical Index (IGI), the Pedigree Resource File (PRF), the general membership records of The Church of Jesus Christ of Latter-day Saints and the Church's Temple records. As a result, right from the start, NFS had an insurmountable problem: inconsistencies between the different copies of the input data and multiple copies of the same individual and family records. For example, as it exists today, the Ancestral File contains a copy of a record of my Great-grandfather, the IGI contains more than 30 copies of the same information (with substantial inaccurate variations), and who knows how many duplicate copies are in the PRF. This is in addition to the Church and Temple records. So Henry Martin Tanner, my Great-grandfather, has 115 combined records in NFS and probably quite a few more uncombined records. This is commonly known as the "data challenge" of NFS. This is also what AI is talking about when he says that FamilySearch "opted to keep the bad data..." I understand him to mean that FamilySearch has decided not to purge the NFS data of multiple copies with the unreliable entries but to build a method by which users (you, me, etc.) can "clean up the data."

I personally would clean up the data by throwing away (erasing, deleting, isolating) the inaccurate data and leaving only the "one true data" about any individual and family. Guess what? There is the remote (though distinct) possibility that some of my extended family members may disagree with my selection of the one true data. Then what? Hmm. Does anyone out there recognize this issue from working with a wiki? The problem faced by NFS is exactly the reason that a static online genealogy database will never be satisfyingly accurate. It is also the reason that wikis exist.

Can FamilySearch turn NFS into a wiki? Not even remotely possible. Remember what I said above, that the data added to NFS contained "membership information." This information could never be subject to user change, any more than the program will now allow the combination of this information in the present system. (If you were not aware, NFS allows users to combine duplicate individuals, except when the duplicate involves two or more duplicate membership records). Then the correction has to be made through the Church organization outside of the NFS program.

So what is meant by AI's statement that the replacement system will allow users to "clean up the data?" That is the Question (with a capital Q). How will the new (we keep using the word "new" over and over until it doesn't mean what you think it means i.e. Princess Bride) program handle new (here we go again) information that is really bad? For example, what if one of my relatives wants to show my Grandfather with his second wife as his mother?  (Who would do such a thing? Just take a look at my lines in NFS, that is exactly what someone has done). How will the program take into account lunacy?

How will the program prevent many more of my relatives from doing similar things in the future? Is the cost of liberty (from bad data) going to be eternal vigilance? Will I have to go back to the program every week and clean up the mess? Yes, as AI says "Once again we see evidence that genealogy is deceptively difficult."

Sunday, February 20, 2011

Genealogist's View -- Speech Recognition Revisited

As I mentioned in my last post on the subject of speech recognition, I was motivated by Anne Roach at RootsTech to return to the issue of speech recognition. Since my last go-around, there has been a considerable increase in computer speed and storage capabilities. Hoping that these improvements would also show an improvement in speech recognition, I decided to try it again.

I must say that the results so far are mixed. It does appear that the program recognizes the vast majority of my words and speech patterns without any difficulty. Obviously, the increase in computer speed has a direct effect on the ability of the programs to accurately represent what is spoken. But I will have to admit that the program is still cranky and makes some of the same errors that it made previously. The real issue is whether or not speech recognition is more efficient than using the keyboard. In the case of genealogy, there is obviously an issue as to whether or not the programs can recognize names efficiently.

Another issue is the question of whether or not having to stop and recalibrate the program periodically actually saves any time. For the record, I am using DragonDictate, the Macintosh program that utilizes the Dragon NaturallySpeaking software. I had previously used Dragon NaturallySpeaking on a PC. During the last few days, I have consistently tried to use the program in a variety of circumstances in order to determine whether or not it is an effective way of speeding up my text entering. So far, I have used the program to dictate my posts using Blogger, e-mail using Thunderbird, the Mozilla software program, OpenOffice and several other programs. I am presently still undecided as to whether or not there is any actual gain in productivity.

For example, here is a list of ten names as copied from my genealogy:

Samuel Shepherd
Susanna Dexter
James Newton
Maria
Ann Kadwale
Mary Mitchell
William Tarbutt
Andreas Jensen
Jens Jorgensen
Niels Pedersen

I keyed these names into the post. Following is the same list of names read into the post by the program:

Samuel Sheppard
Susanna Dexter
James Newton
Maria
and Well
Mary Mitchell
William carpet
Andreas Jensen
Jens Jorgensen
Niels Patterson

I guess you could say that the results were either pretty good, or pretty bad depending on how much correction work you want to do. Only six of the names were transcribed correctly. The main problem that I see is the variation of Shepherd and Sheppard. These variations in spelling may be entirely confusing if they were not caught during proof reading. It is also apparent that the more unusual name Ann Kadwale  may be too difficult for the program. I can hardly see myself spending hours teaching the program the thousands of names in my database.
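
For anyone who wants to repeat this kind of comparison on a longer list, the tally can be done with a few lines of Python. This is only a minimal sketch: it treats a name as correct when the dictated text matches the keyed text exactly, and the variable names are made up for this illustration. The two lists are the ones shown above.

```python
# Names as keyed in by hand.
typed = [
    "Samuel Shepherd", "Susanna Dexter", "James Newton", "Maria",
    "Ann Kadwale", "Mary Mitchell", "William Tarbutt",
    "Andreas Jensen", "Jens Jorgensen", "Niels Pedersen",
]

# The same names as transcribed by the speech recognition program.
dictated = [
    "Samuel Sheppard", "Susanna Dexter", "James Newton", "Maria",
    "and Well", "Mary Mitchell", "William carpet",
    "Andreas Jensen", "Jens Jorgensen", "Niels Patterson",
]

# Pair each keyed name with its dictated version and keep the mismatches.
missed = [(t, d) for t, d in zip(typed, dictated) if t != d]
correct = len(typed) - len(missed)

print(f"{correct} of {len(typed)} names matched")
for keyed_name, heard_name in missed:
    print(f"  keyed {keyed_name!r}, program wrote {heard_name!r}")
```

The first line printed is "6 of 10 names matched", followed by the four pairs that would need correcting.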

Here is another example of the difficulties faced using speech recognition. The following is a selection of text copied and pasted from photo.net explaining Digital Camera Basics:
Digital cameras are confusing to a lot of new users. In this basic guide to digital camera technology we hope to try to give digital beginners at least some basis to use in deciding which digital camera is appropriate for them. When shopping for a digital camera it's at least good to know what the basic terms like white balance, pixel, ppi and dpi mean and how they affect image and print quality. It's also important to know the difference between things like optical zoom and digital zoom as well as the advantages and disadvantages between storage formats such as Compact Flash (CF), Microdrives, Sony Memory Stick, Secure Digital (SD), Multimedia and camera interface technologies such as USB 1.1, USB 2.0 and Firewire IEEE 1394.
 Here is the same paragraph dictated using speech recognition software:
 digital cameras are confusing to a lot of new users. In this basic guide to digital camera technology we hope to try to give digital beginners at least some basis to use in deciding which digital camera is appropriate for them. When shopping for a digital camera it's at least good to know what the basic terms like white balance. Pixel, PPI and DPI mean and how they affect image and print quality. It is also important to know the difference between things like optical zoom and digital zoom as well as the advantages and disadvantages between storage formats such as CompactFlash (C F), Micro drives, Sony Memory Stick, Secure Digital (It S D close parens, Multimedia and camera interface technologies such as U S B1 .1, U S B2 .0 and FireWire India E E.co 1394.
That is exactly how it came out without any editing. Again, the text is either pretty good or pretty bad depending on how much editing you want to do. I recognize that if I were more familiar with the commands, many of the "errors" could have been corrected in the dictation. If I were to key in the paragraph, I would probably proof my typing as I went along. There is no way I could have typed the paragraph as fast as I could read it but any gain in speed would seem to be lost in time spent correcting the errors.

So, the question arises as to whether or not a combination of speech recognition and typing is appropriate? Unfortunately, the speech recognition program does not recognize keyed in text as part of the text in its memory. So you could not go back and use speech recognition commands to correct the text if you had keyed part of the text and done another part through dictation.

I am not giving up yet, but I suspect that I will reach some of the same conclusions that I reached previously, that speech recognition may save some time entering data but that the time savings may be lost in editing. Stay tuned as I continue to use the program, learn more of what the commands can do and do additional training of the program.

Saturday, February 19, 2011

Who do they think we are?

One of my sons is reluctant to tell people what he does or studies because when he does so, the conversation inevitably ends. He is an astrophysicist. Being overinvolved in genealogy as I have a tendency to be, I find the same conversation stopper. Just mention genealogy to a non-genealogist and you will get a look like you just admitted you were out on parole. One of the most satisfying parts of attending a convention, especially a genealogical convention, is the ability to talk to someone who tolerates genealogy.

Many years ago, I retired from the practice of law (for the first time) to run a computer retail store and a software company. One of the most satisfying parts of retiring from law was the fact that I didn't have to tell anyone I was a lawyer. For years, when asked what I did for a living, I could truthfully answer that I was involved in computer businesses. Over the years people began to forget that I was an attorney. New friends and acquaintances didn't know. When I went back to practicing law, I was once again forced to admit that I was an attorney. Some of my newer friends were perplexed and would always ask, "When did you find time to go to law school?" assuming I had just begun the practice.

Another fallout of returning to law was the common retort, "Since you are a lawyer, can I ask you a question?" This would always be followed with a detailed account of some legal mess with an expectation that I would solve the problem for them in 30 seconds for free! I was asked that question today, which is an example of the almost automatic reaction to my disclosure that I am an attorney.

Guess what? When I tell people I am a genealogist I don't get any questions. I don't think that the blank stare and shudder is really a question; it is more of a reaction. Part of the reason for the reaction to genealogy, I am sure, stems from the common viewpoint that the whole subject is the purview of slightly (or more than slightly) balmy older people who have failed at shuffleboard. In the greater Mormon or LDS community of which I am a part (that is, members of The Church of Jesus Christ of Latter-day Saints) genealogy has a basic religious foundation. Members of the Church are frequently exhorted to "do their genealogy." Conversations with other members of the Church which turn to genealogy generally end with a guilty explanation of why the other person doesn't (fill in the blank, i.e. have time, have the ability) do genealogy. One good thing, however: they seldom ask for a free answer to a difficult question.

I could go on with the reaction of family members to my avocation. Those who are still talking to me, will sometimes listen politely but often suddenly decide to change a baby's diaper or something else a lot more appealing than genealogy. 

How do I view myself as a genealogist? Who am I? I am first and foremost a researcher. Some people enjoy sports, dancing, music, whatever, I enjoy research. (I also like music and a lot of other things just for the record). I enjoy accumulating facts, evidence and following a line of proof. I love libraries. I could spend the rest of my life doing research. Now down to reality. One of the few things I like about law is the research and writing. Second, I love to teach. I like to see people light up when they finally understand a difficult concept. There are many parts of professional teaching that I find onerous and distasteful. Mostly I don't like teaching people who don't want to learn. I like to teach genealogy because no one comes to a genealogy class because they have to. One of the worst parts of teaching, was teaching continuing education classes to attorneys. None of them had the slightest interest in being in the class, they were there because they were forced to be there. When I taught Spanish, I taught at 6:00 am because I knew that only people who wanted to be there would come to a Spanish class at 6:00 am.

I like to find out new things. If I had lived in the distant past, I would have been an explorer. I always want to go over the next hill just to see what is there, even if what is there is just the same as what is here. I found that sense of discovery and wonder in libraries and I continually find it in genealogical research. People and families are infinitely different and each has a story to tell. Discovering new facts and new stories is fascinating and inspiring.

I am a compulsive writer and talker. Sitting down in front of a blank computer screen is an invitation to write. Unless I am too sick or so tired I cannot function, ideas come to me in waves. I wake up with ideas. If I wake in the middle of the night sometimes I cannot sleep until I have written down my ideas.

Like all genealogists, I am a collector. I have collected stamps, coins, matchbook covers, animal figures, cameras, and most of all, pervasively documents, photos, and information of all kinds. I also have a hard time throwing away anything that appears even remotely useful.

I am also blessed (or cursed) with an encyclopedic memory. I often become embarrassed when I start to explain something to someone and must remember to stop before their eyes glaze over and they pass out from information overload.

You may wonder who you think you are, I wonder who they think I am.

New tools for cleaning up data from New FamilySearch?

New.FamilySearch.org seemed to stay quietly in the background during the entire RootsTech Conference. You would've had to have been listening quite closely to detect more than passing mentions of the program. The program was mentioned during the Devotional with Elder Richard G. Scott, which included a question-and-answer session. Craig Miller of FamilySearch was very particular that future developments in New FamilySearch would include the ability to edit information and to mark preferential family lines.

Presently, the program has no way of marking data which is questionable. This is especially true since the discontinuance of the dispute function. If you disagree with an entry's accuracy you can begin a new discussion under the Discussion tab, but there is really nothing in the program that allows you to make changes or to indicate that the information is incorrect unless someone happens to read your discussion. There is also no external notification that a discussion has been started.

In the meantime, while we wait for the changes to come to New FamilySearch, any movement towards recognizing inaccurate, incomplete, or misplaced data would be much appreciated by some of the users (probably not those who don't recognize the problems).

Meanwhile, a glimmer of relief came from an entirely different direction. Another class at RootsTech, on updates from Legacy Family Tree by Geoff Rasmussen, caused me to take the time to check out the new update to Legacy Family Tree, which is even newer than the last new update that I wrote about. This post, like the last one, focuses on the update to the interface between New FamilySearch and Legacy Family Tree.

Legacy introduced the initial part of their interface with New FamilySearch last year and the updates this year show that they spent time looking at the other implementations of the interface and learned from their competitors. One of the first things that I noticed, having worked with the other programs, is that Legacy keeps a copy of your login and password so that you do not have to keep entering the information over and over again. This may be a small point, but it avoids an annoying problem common with the other synchronization programs. It is my understanding that the New FamilySearch program requires the re-entering of the login and password in order to preserve the security of the program. Legacy apparently keeps the record of the login and password locally in order to avoid the security risk but also avoids the repetitious re-entering of the information.

Legacy has added an evaluation function that is not as yet available in any of the other programs. Questionable data in the New FamilySearch program is marked or flagged in Legacy. Holding your mouse cursor over the flag gives you a brief explanation of what might be wrong with the data. Although the other programs that have developed synchronization to New FamilySearch show the duplications in the data, all of the duplications are shown with the same confidence level.

I found some of the flagged exceptions shown in Legacy to be trivial. But the important point made is that there are real issues with some of the data in New FamilySearch and for the first time a program is attempting to alert an unsophisticated user to a problem. Even with these flags or markers, there are still major issues with the New FamilySearch data not identified or marked by Legacy. For example, New FamilySearch has six different christening dates for my Great-grandfather. The issue isn't the dates, it is the fact that The Church of Jesus Christ of Latter-day Saints, of which my Great-grandfather was a member, does not have an ordinance or function which could be called a "Christening." The Church does have an ordinance called Baby Blessing, but this is definitely not the same thing as a christening. This inappropriate data comes from the unfortunate fact that the original, now old, Personal Ancestral File program had a space for entering the christening data for every person and some people felt compelled to fill the empty space with a date and place without knowing what they were doing.

I certainly hope progress is made during the coming year towards correcting the data in New FamilySearch and I will be ready to report any and all changes.

Friday, February 18, 2011

Bound to Repeat

George Santayana, who, in his Reason in Common Sense, The Life of Reason, Vol.1, wrote "Those who cannot remember the past are condemned to repeat it" could have been speaking about genealogy. During the time that I was reviewing my Great-grandmother's genealogical research, I found that she had repeated her research, over her lifetime, at least three times. She was a careful researcher and had extensive notes but the sheer numbers of people that she encountered in her research overcame her ability to keep track of what she had already researched. Although she lived a long time before computers became available, I firmly believe that she would've embraced them with enthusiasm. However, she lived a very hard life, worked nights cleaning offices to support herself on a meager income and died long before computers became available.

Although today we have tremendous technological advantages and opportunities to support our research, we are still faced with the challenge of repeating research unless we keep adequate records. Yesterday, at the Mesa Family History Center I was once again reminded of this fact when I spoke briefly with one of my friends who was sitting at a table sorting documents and trying to figure out what she had done in the past. Upon talking to her, I determined rather quickly that she had no genealogical database program, not even Personal Ancestral File. She was relying entirely upon handwritten notes, family group records and miscellaneous copies of documents.

This may seem like an extreme case, but it is all too frequently found among genealogists who have ignored technology or simply not had the opportunity to become acquainted with what is available. In my friend's case, fortunately most of the information which she had originally compiled was probably preserved online in the New FamilySearch program. But without reference to the online source and without using one of the easily available genealogical database programs, she would likely have to reproduce and repeat all of her early research efforts. I was on my way to teach a class, and could not spend a great deal of time helping her but assured her that she could contact me at any time for further help in getting her original work organized into a computer database program.

In a recent post, Randy Seaver of Genea-Musings divided the genealogical world into three categories: the traditional genealogy world, the online genealogy world, and the technology genealogy world. He estimates that 85% to 95% of all self-proclaimed genealogists still live in the traditional genealogy world. Although I may not agree with this high percentage, I certainly agree that there are a large number of people involved in the genealogical process who are only vaguely aware of the technological resources available. It is also my experience that even technology-savvy genealogists may be unaware of resources that are readily available on the Internet. But like my Great-grandmother, my friend, unless she takes the time and effort to computerize her investigations, may be doomed to repeat much of her previous work.

In contrast to my friend, there were other patrons in the Mesa Family History Center at the same time who were actively searching online and who, although they had been doing research for years, were finding new leads and answering questions in their pedigrees after a minimal amount of orientation to current online resources. Technology appears to be a challenge based on background, education, personal inclination and aptitude. I would not go so far, but Donald Lines Jacobus, in his book Genealogy as Pastime and Profession, included the following quote from William Bradford Browne, who states that genealogy is a science "which requires years of preparation, and is only successfully acquired by a person naturally adapted to its study, and to whom its drudgery is pleasure and not work. It means the power to read or decipher ancient records, to understand their meaning, to read them with the understanding of obsolete meanings. It means a knowledge of law, sufficient to understand the why and wherefore of papers of a legal nature...It requires the intimate knowledge of towns, counties and states, so that the genealogist knows where certain records are at stated periods, and how these towns and counties have been divided and at what times...It requires the knowledge of the changes of the calendar from the old arrangement to the new."

The issue of background and capabilities goes to whether genealogy is inclusive or exclusive. Mr. Jacobus and his friend Mr. Browne obviously feel that genealogy is exclusive, reserved to those who have the native capability to ingest and analyze documents. Contrary to that viewpoint was the one expressed during the Devotional (Question and Answer Period) at the recent RootsTech Conference, which would be decidedly inclusive of anyone having an interest in and concern for their ancestors, no matter what their degree of sophistication.

I think that there is a mid-road where both points of view can be accepted. Genealogy truly is difficult. It requires a large measure of experience and ability to research original source documents. For example, there is no question that in order to read an old will or deed you must have experience reading the handwriting, deciphering the law and understanding the cultural conventions of the time in question. It is understandable that the novice will view this ability as either difficult to obtain or not necessary. But we are a nation of immigrants, and those who aren't didn't speak English either, so it is almost inevitable that by going back in history you will finally get to something foreign to your own experience. In genealogy we are all beginners if we keep doing research. The only researchers who can claim comfort are those who specialize and do work for other than their own lines.

I think I better quit on this topic for a while. I could go on and on. By the way, part of this post was entered by speech recognition software, can you tell which part?

Thursday, February 17, 2011

A return to speech recognition

At the recent RootsTech conference in Salt Lake City, Utah, I attended a class taught by Anne Roach of FamilySearch. Right off the bat, she inspired me to go back and take another look at speech recognition. Off and on over the years I have spent time trying to use speech recognition software to speed up my writing. Unfortunately, each time that I tried to use the software, it turned out that the software was not capable of following my dictation without an unacceptable level of errors.

Because of the volume of writing I have been doing recently, I decided to give speech recognition another try. I researched the programs, looked at the reviews and decided to purchase one of the newer programs. I had previously used  Dragon NaturallySpeaking and found it to be a very good program except for the limitations of speech recognition in general. This time, I decided to try the Macintosh version. The program is called Dragon Dictate and is based on the Dragon NaturallySpeaking model.

As I had expected, after receiving the program, I remembered that learning speech recognition software was very similar to learning an entirely new language. In this case, there are 184 pages of commands and instructions. Most of the instructions are fairly simple to remember but it is a challenge to learn to distinguish between the commands and the text you are dictating.

All of the speech recognition programs require a training mode, where the user reads a selection of text and trains the program to recognize his or her voice. It is critical to the use of the program that this training be done to the extent that the program needs to learn the user's voice or the speech recognition will be faulty. After spending a short time training the program, I decided to launch right into trying to dictate a blog post.

To my pleasant surprise, the program seems to work very well and is recognizing my speech without too many problems. The main problems seem to center around mis-recognition of words and my inability to remember all of the commands. I would expect that as I study the program and learn the commands, my ability to dictate will speed up considerably. What I also found previously was that I had a tendency to use the mouse even though I was trying to dictate with the microphone. Learning the commands will probably eliminate nearly all of the need to use the mouse.

I also found that having a good quality USB microphone was absolutely essential to adequate dictation. Even though the program came with a microphone, I purchased a higher quality microphone almost immediately and noticed a marked increase in the capabilities of the program to accurately represent the text that I was dictating.

For individuals with handicaps or other challenges that prevent using a keyboard effectively, voice recognition would be an effective way to enter text into the computer. Any time you are using the dictation program, it is very important to watch the text carefully as it is entered by the program so that you can make corrections “on the fly.” It is apparent from the dictation which I have already made that I will have to go back and do a significant amount of proofreading, which is not one of my strong points.

It is certainly apparent that speech recognition has come a long way since the first demonstration that I saw at the world's fair in Seattle in 1962. However, I don't think that the speech recognition program will help me think or do research any faster, as it appears that this blog post took just about as long as it would have had I typed out the text rather than dictated it through the program. My goal is to continue to use the speech recognition software, and from time to time I will give updates on my impressions of how my use of the program is coming along.

Backing up your genealogy files -- offsite storage

The term "backup" is highly ambiguous. A number of genealogy programs, including the venerable Personal Ancestral File (PAF), have a file option called "backup" included in the menu. Although the term is used, choosing this option does not in fact make a backup copy of your data. In the case of PAF, the program makes a compressed copy of your file on the same disk as the original, unless you specifically direct the program otherwise. In my experience, those who use this command simply because they have been told to do so rarely, if ever, make a separate copy of their file on another device.

What is a backup copy of your file? As clearly as I can explain the concept, it is a separate copy of your data file on a completely separate disk or other media. When you create a new file in any program, including genealogy programs, the program saves the new file to your hard drive. Unless you tell the program otherwise, the file is likely saved in some default location on your hard drive. Sometimes the programs save the file to a default "My Documents" folder on Windows Operating Systems. Sometimes the file is saved into a file folder created and named by the program and located in the folder containing the copy of the program file. This is usually the case with Macintosh systems unless you have designated an alternative place for the file to be saved.

Many of the current genealogy programs have menu selections that allow you to designate where you would like your "backup" files to be recorded. However, unless you make this selection, the program will save files in whatever default location is set by the developer of the program.

This problem of where files are saved is not unique to genealogy programs. Normally, without intervention, all user created files, depending on the programs' defaults, will be saved in different folders on your hard drive. Word processing documents in one location. Genealogy files in another location. Photographs and other graphic files in yet another location. Before you can properly back up any part of your computer system, you need to get control of the location where your files are stored. It is very common for me to find genealogists who have several, sometimes many, copies of their data files all over their computer's file system. Often, they have no idea which of these duplicate files is the most recent and may have added data to different files at different times. Sometimes they are even surprised to find out they have duplicate files at all.
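For readers comfortable running a small script, here is one way to take stock of scattered copies before consolidating them. This is only a rough sketch in Python, not a feature of any genealogy program: the function names are my own invention, and the file extensions (.paf for PAF files, .ged for GEDCOM files) are an assumption you would adjust to whatever your program uses. It lists every matching file under a folder, newest first, with a content fingerprint so you can tell which copies are identical and which have drifted apart.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_data_files(root, extensions=(".paf", ".ged")):
    """Return (path, modified_time, digest) for matching files, newest first.

    The default extensions are an assumption; change them to suit your programs.
    """
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            found.append((path, path.stat().st_mtime, file_digest(path)))
    # Sort on the modification time so the most recently changed copy comes first.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Two entries with the same fingerprint are byte-for-byte identical copies. Entries with the same name but different fingerprints are the ones to look at by hand, since each may contain data the other lacks.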

My method of solving this problem is to create one file folder, which I keep on my desktop level, where I save any and all of my data files, from any and all of my programs. This puts almost all of the data files in one place. There are a few exceptions with programs that will not allow their data files to be located outside of the same folder containing the program. If I cannot work around this problem, I make sure I am not using that particular program for the creation of any of my own primary data files, i.e. files I want to keep.

Backing up all of my data files then just involves making a copy of the single data folder (usually called "James" or whatever) onto an external hard drive. Now, in the case of my iMac computer, I use Time Machine, which makes a complete copy of my hard drive many times a day. I use a 2 Terabyte backup external hard drive that can continue to keep the updated copies for many weeks and months back. In addition, I make another separate copy of the data folder, and anything else of value, on a different external hard drive. From time to time, I also make a backup copy onto a portable external hard drive and give the copy to one of my children somewhere around the country.
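The manual copy step described above can also be scripted. The following is a minimal sketch in Python, assuming you keep everything in one data folder and that your external drive shows up as an ordinary folder; the function name and the dated naming scheme are my own illustration, not something Time Machine or any genealogy program does.

```python
import shutil
from datetime import date
from pathlib import Path


def back_up_folder(source, backup_root, today=None):
    """Copy source into backup_root/<name>-YYYY-MM-DD and return the new path."""
    source = Path(source)
    stamp = (today or date.today()).isoformat()
    destination = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree refuses to overwrite an existing folder, so a backup
    # already made on the same day is never silently replaced.
    shutil.copytree(source, destination)
    return destination
```

Each run leaves a separate dated copy, so an older version is still there if a recent one turns out to contain a mistake. The price is disk space, and you would need to delete old copies yourself from time to time.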

This last step is important. To have a true backup of your data, not only does the copy have to be separate from your main computer, it also needs to be offsite. If your house burns down or is broken into, you could still lose an onsite copy. Any time I do a significant amount of work, like hundreds of additional scans or a lot of writing, I also make copies onto separate computers in addition to the external hard drive backup. One of those is a laptop which I usually carry with me on trips. Just in case.

You could make a copy onto a flash drive. In my case, my files exceed the size of the largest currently available flash drive. But in your case, you may be able to use a flash drive. But I suggest also making a copy on an external hard drive and don't forget to store a recent copy with a friend or relative. (Good friend and good relative).

Some people suggest storing your data online. I am not really happy with that idea. In my case that type of storage would be relatively expensive in both money and time. I am not convinced that any one of the online storage companies will still be there and functioning in three or five years, so I am reluctant to use online storage. It is a viable option if you also have a local and offsite copy of all of your data.

Some people use safety deposit boxes. I don't. As a probate attorney, I have handled many, many probate cases and had a significant number of problems gaining access to safety deposit boxes. Also, because of the difficulty of gaining access to the safety deposit boxes, you are not as likely to make incremental backups. I suggest that because of data migration issues and accessibility, safety deposit boxes would not be my first (or second, or third) choice.

Whenever you make a copy of your files, make sure the copy can be opened and used. I am also skeptical of compressed files. With cheap storage readily available, there are no longer practical reasons for backing up files in a compressed format. There are still programs, such as PAF, that use a compressed file format for their "backup" files. I do not like this idea because over time the program that makes the compressed file, like Windows Zip, will change and may not work to unzip the files.
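Part of checking a copy can be automated. The sketch below, in Python with a function name of my own choosing, compares every file in the original folder with its counterpart in the copy and reports anything missing or different. It only confirms that the bytes match; opening the copied file in your genealogy program is still the real test of whether it can be used.

```python
import filecmp
from pathlib import Path


def verify_copy(original, copy):
    """Return relative paths that are missing from, or differ in, the copy."""
    original, copy = Path(original), Path(copy)
    problems = []
    for src in original.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = copy / rel
        # shallow=False makes filecmp compare file contents,
        # not just size and modification time.
        if not dst.is_file() or not filecmp.cmp(src, dst, shallow=False):
            problems.append(rel.as_posix())
    return sorted(problems)
```

An empty list means every file in the original has an identical counterpart in the copy. Extra files that exist only in the copy are not reported.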

All computer storage is dynamic, that is, subject to change. I will address the issue of file migration in a subsequent post.

Wednesday, February 16, 2011

Online availability of records sweeps ahead

When I started posting online a few short years ago, it was big news in the Blogging community every time another large collection of records went online. Lately, posting records online has become so common that when huge collections are posted, there is hardly any notice. If there is one technological change that is affecting the way we do genealogy more than any other, it is the increased availability of online original source material. It is further apparent from comments to my last post that there are a significant number of genealogists out there who are not only unaware of the records being added but are resentful of the changes these huge databases make necessary. I will do a post in the near future on how, to some degree, advances in the availability of online genealogical records are driving some of the technological advances.

I decided to do a review of some of the larger websites that are continually adding content, mostly of original sources, every day, day after day. These are not in any order but collectively have accumulated a tremendous number of files, records, individuals or however they account for their records online.  If you haven't checked some of these sites recently, you are going to be surprised.

For a very extensive list of online digital collections go to the Online Education Database.  This is a very comprehensive list.

Here is a partial list:

Ancestry.com
There has not been a lot of fanfare recently about acquisitions by Ancestry.com, but the list of genealogy databases posted or recently updated is impressive. I am not sure if the link will work without a subscription, but the list includes some huge databases added just since the first of the year. The additions cover a wide spectrum, with everything from military records to New Zealand Electoral Rolls. There does not seem to be any way of determining the actual number of new records, but the list is long nonetheless. Ancestry.com is a subscription website.

FamilySearch.org
FamilySearch does keep a count of the number of records added and it appears that there were close to 200 million new records in January, 2011 alone with likely an equal or greater number to be added in February. The records added by FamilySearch are free and fully searchable. These records include many that have never previously been available online. The records are being added from scans of the 2.4 million rolls of microfilm in the FamilySearch Granite Vault.

Internet Archive
Not generally identified with genealogical resources, this huge archive has some surprising records, such as tens of thousands of census records from around the world. There are also increasingly a large number of personal records, journals, photos, movies and other records of every description and type. As with the other large record sources, additions are made frequently. At the recent RootsTech Conference, the founder of the Internet Archive, Brewster Kahle, announced an initiative to digitize the Library of Congress. If you didn't know, the Internet Archive has the complete U.S. Census with free access.

Google Books
The last time Google announced any numbers with respect to the digitized books in its collection, back in October of 2010, the number stood at 15 million. Since that announcement, many more millions of books must have been added. A very rough search on the word genealogy returned 1,060,000 results. The term "family history" returns about 790,000. This site is one to be familiar with.

Washington State Archives
Although limited to records from Washington State, this free collection currently has 99,410,409 records with 30,593,695 online.  Makes you wish you had relatives from Washington State.

There are many more sites that could be added to this list. If you have a favorite mega-library I have overlooked, please post a comment with a link.


Tuesday, February 15, 2011

FamilySearch hits a gold mine of luddite comments

When I wrote my Blog post entitled "The Deeper Meaning of RootsTech -- Modern Luddites Revisited" I had no idea of the gold mine of ludditic comments that lay right in front of my eyes. There are currently 553 comments to the December 13, 2010 post by Paul Nauta entitled "FamilySearch.org Website Changes - updated." The vast majority of these comments are revelatory in the depth of their angst over the change to a new website format that includes a huge amount of information, never before available. Here are a few of the more interesting comments with the senders' names omitted:
  • Everything is now so complicated, for instance why not say HOME at he top of the page instead of a picture of a tree?
  • I need sites that dont have to be redone all the time.Any ideas? [typos in the original]
  • This is terrible I cant find my previous information or even know how to navigate this updated site.
  • Please give us the old system back, this new site is unusable.
  • My job of helping others with their research has been complicated by the changes in this web site. What may seem easy to use by the originators, but it is NOT to the average individual or me, who is an experienced researcher and user of your past sites.
  • This site is SOOOO frustrating.....and my computer, brand new and fast, absolutly hates it. [typos in the original]
  • Why fix something that was not broken? I was just learning the old system and was very pleased with it. It was fantastic.
  • Just awful Everything I loved about the old site is now gone
I think that is enough to make my point. As of today, this horrible new site has 543 collections of historical records, most of which have never been available before in digitized form online. It also has the FamilySearch Research Wiki with 48,974 articles and growing rapidly.  It also has the newly added link to the FamilySearch Family Tech website with dozens of helpful articles. The updated site also has a fully developed search engine to provide access to the Family History Library Catalog which has all of the digitized books in its collection marked. Isn't that just horrible? The new site also has the complete Social Security Death Index. The old site was so much better with its lack of resources to original records. You could always find multiple copies of what you wanted in the secondary contributed records of the Ancestral File, the International Genealogical Index (IGI) and the Pedigree Resource File (PRF). Of course, the new file, as horrible as it is, also contains most of these records.

I cannot imagine what it is that some of these people are looking at. It could not possibly be the same website I get on my computer when I type in "familysearch.org." What do they think they were finding in the "old site" that they cannot find in the new site? The Ancestral File was and is a compilation of user contributed family trees. There are almost no sources listed and the information cannot be considered reliable unless the source is given. The same could be said for the entries in the PRF. Many commentators claim years of experience with the old site and claim "years of ease finding data." Again, what did they think they were getting with the old site? Until the addition of menu links to the old FamilySearch Research Pilot and the Historical Books collection, there was virtually no primary information in the old site except temple ordinance records that have now been moved to the New.FamilySearch.org website. As pointed out above, the new site has a tremendous amount of information not available on the original site and now readily searchable.

I can only imagine the frustration of the FamilySearch team at reading this drivel in comments. Why are the commentators going back to look for information they apparently already have in their databases? Again, what are they looking for? Reading on down through the almost endless list of comments you would think that the updated website was difficult to use. It is not. Or that it lacked information. Just the opposite.

What this all boils down to is change. The old site, even though it lacked substance in the form of original source records, was familiar. One commentator complained that the Texas Death Records were not available on the new site. There are, in fact, three different Texas Death collections, one of which, Texas Deaths, 1890-1976, has 4,281,854 records. These records were not and are not available in any form on the old FamilySearch.org website. So where is the complaint?

Just for a test, I put in only the name of my Great-grandfather Henry Martin Tanner on the first page of the updated site. With just his name, I found 26,714 results. However, the first record found was a link to his death certificate. With two clicks, I had a copy of the original Death Certificate from 1935. That was not even possible on the old site. How is this more difficult than the old website? I cannot imagine.

OK, I cheated. I used my Great-grandfather Henry Tanner. Let's try someone really difficult to find, my Great-great-grandfather on my mother's side, Samuel Linton. No dates, just a name. Here we go! Hmmm. This is more difficult than I thought. There are 5426 results. Maybe I should put in the fact that any event occurred in Ireland? (Where he was born). A few entries down the page, there was the entry for him in the 1900 U.S. Census living with his daughter, my Great-grandmother Mary L. Morgan with her two children, Linton and Harold. This record was and is not available in the old website.

Wait a minute, from reading all these comments this should be next to impossible to do what I just did. I must be looking for people who are too easy to find. Let's try someone really obscure. My Great-great-grandfather Ove Christian Oveson who was born in Denmark. If I believe the comments, it should be really, really hard to find him, since he was born in Denmark and all. Oh dear, there are 234 results. I don't know if I can stand having all these choices. Hmmm. I guessed right. This is turning out to be a lot more difficult than the other two. Oh, guess what. I just remembered, he spelled his name as "Oveson" with an "o" even though he was from Denmark. I was looking for "Ovesen." There he is in the old Ancestral File record. First up. No problem. Now what could all those detractors be talking about?

As Click and Clack the Tappet Brothers would say, "Now it is time to play, stump the chump." Can the new site really be this easy to use? Am I missing some deeper significance? Let's look for someone not even related to me. Somebody famous. How about Abraham Lincoln? I seem to remember something about him being in Illinois. Putting the name and place into the search fields I come up with 380 results. Being very lazy, I decide to add one more piece of information and see what happens. I put in that he was born in Kentucky. There he is, with the indisputable fact that he had two wives, one of whom was named Ann Rutledge. (Do your history, folks, this is getting to be a little bit ridiculous). The information on his marriage to Ann Rutledge comes from that reliable source the Ancestral File. I am glad all those commentators are satisfied with that source and didn't want any more sources.

I must be doing something wrong. How could I find all these names so easily in such a rotten database? Now here is the real test. Can I find Abraham Lincoln in the 1860 U.S. Census? This time I am going to the Historical Records Collections directly. I click on Canada, US and Mexico and get a list of collections. I click again, this time on dates from 1850 to 1899. I scroll down the page. (Wait a minute. I just realized the problem. All of these people do not know how to scroll! Hmm. That must be it). I find the U.S. Census for 1860 and put in the name Abraham Lincoln (actually quite a common name). I also add Illinois as a place. What a surprise. He comes up as the second entry.

So far, I haven't been able to see what all the fuss is about. Every person I have looked for has come up as one of the first few hits. Everyone. I must be doing something very wrong.

I cannot for the life of me find the difficulty or complication. Each time all I did was put in a name and records came right up. What do the records need to do? Wave a flag and shout?

With all that ire and venom directed at the updated website, there must be a deeper problem. Something I have overlooked in finding my ancestors and other people so easily. What could it be? Someone, anyone out there in the cyber world, let me know what I am missing. Why do I think the updated FamilySearch website is ridiculously easy to use? There must be something wrong with the way I am searching, maybe it is because I am trying with unique names like Ove Oveson or Abraham Lincoln? Anybody got any ideas?

Who are these people who take the time to write incomprehensible complaints about a program that is this easy to use and so full of information?

One more quote before I go, "This is not as user friendly as the last site, I hope that it gets changes back very soon I can not find anything." Who are these people? Where did this stuff come from?