Some people eat, sleep and chew gum, I do genealogy and write...

Wednesday, November 20, 2013

Is Handwriting Recognition the Holy Grail of Genealogy?

"Massachusetts, Plymouth County, Probate Estate Files, 1686-1915," images, FamilySearch (https://familysearch.org/pal:/MM9.3.1/TH-1961-26979-18706-60?cc=1918549&wc=M9S5-2FG:n612103331 : accessed 14 Oct 2013), Plymouth > Case no 21895-21929 Warren, Nathan-Washburn, Benjamin > image 23 of 533.
For me, seeking Holy Grails and lineages back to Adam fall into similar categories, but the idea of a blockbuster jump in technology is something I have lived with most of my life. Two areas of computer technology fall into the category of game-changing or earth-shattering, or whatever analogy you would like to use. These two are voice and text recognition.

I have written recently about my experiences with voice recognition. With advanced software technology and computers with virtually unlimited computational speed, voice recognition (VR) is a reality. It has gone past the interesting experimental stage and become a practical, daily-use tool. Presently, my iPhone and other devices talk to me on a regular basis and they haven't carted me away to the funny farm yet.

What about optical character recognition or OCR? I haven't said much about OCR for a while, mainly because it is ubiquitous. I guess I have taken it for granted for some time now. I can take almost any page of printed text and turn it into a computer text file in just a few seconds. As genealogists, we live with massive amounts of OCR text in the form of digitized books and newspapers. Most large businesses use some form of OCR to handle routine mail.

But what about handwriting? Some handwriting recognition software is already in daily use, such as the U.S. Postal Service's system for reading handwritten addresses and ZIP codes. Reading a document such as the probate inventory shown above, however, is still an unattained goal. It might be getting closer, though. I received a short note from Cliff Shaw at Mocavo.com. Apparently, Mocavo.com is working on some breakthroughs in handwriting recognition. Here is a quote from Cliff to me:

A little over a year ago, Mocavo acquired ReadyMicro and the incredible mind known as Matt Garner. One of Matt’s lifelong passions and curiosities is to enable computers to read historical handwritten documents to bring genealogy search to the next level. It’s well known in the genealogy industry that historical handwriting recognition is the Holy Grail – the single largest technological advancement that would enable more content to become accessible online (except for maybe the invention of the Web). For the past year, we’ve joined with Matt to tackle this very hard problem, and have finally made enough progress that we can begin to report on it.
Mocavo.com is making an announcement today about its progress, and I will let you read what they have to say in the complete press release or post.

I know they are not the only ones working on this technology, but they may be the only ones in the genealogy community that are pushing this technology in a practical way. For my part, I hope they keep at the problem and solve it. I have tens of thousands of pages of handwritten letters that could benefit from this technology.

3 comments:

  1. I'm very sceptical about handwriting recognition, James. It seems OCR cannot accurately interpret newspaper print, even when near words are in frequent use and listed in the dictionary [There must be a whole topic there for one of us].

    Back in the early 1980s, one of my colleagues was experimenting with an early VR system, and trying to train it to understand him. This guy was from Liverpool, and spoke extremely quickly with a very strong regional accent. Our team decided that if that VR system worked then we would stick a goldfish bowl over his head and incorporate the VR system to translate so that we could all understand him. :-)

  2. This is wonderful. I mentioned writing recognition in one of my posts for RootsTech 2013. FamilySearch had sponsored some research on it in 2011, as I recall, though this is one area of research that has been too tight-lipped in the genealogical community. The presentation at RootsTech wasn't even in the schedule.

    Thanks for your post.

  3. In addition, I recently bought Dragon NaturallySpeaking. It is terrible and worthless for cassette-to-digital recognition of deceased relatives' voices. I hope to use it to dictate some family stories rather than having to type them from the handwriting. If it works, it will be worth the purchase. I think it will be OK with my own voice directly to the program.

    Do you know of anything better that can be bought, or where the industry is going for voice recognition?

    I don't like transcribing hour-long interviews, so generally I haven't.
