Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, January 30, 2011

Genealogist's View -- OCR and Voice Recognition

The Holy Grail of the genealogy world (and the computer world in general) has always been OCR and voice recognition software. I remember visiting the Seattle World's Fair, or Century 21 Exposition, in 1962. IBM demonstrated the IBM Selectric Typewriter. As a side note, some time ago I mentioned a typewriter to one of my grandchildren, who is a teenager, and he had never seen one. At the World's Fair, IBM had the Selectric hooked up to a microphone and demonstrated some very limited voice recognition software. Later, another milestone of public awareness of voice recognition took the form of HAL, the computer in 2001: A Space Odyssey. In addition, nearly everyone is now familiar with the Star Trek computer, with which the characters are always carrying on a conversation. Voice recognition has now become so common that nearly everyone has spoken with a computer when making a telephone call to a large insurance company. But the real question is, has this technology sufficiently matured to actually work with something as text- and data-intensive as genealogy?

The first thing you need to think about when you consider voice recognition is the names of your ancestors. Will the voice recognition software be able to differentiate between Kawolski and Kawasaki? Or how about Vanlandingham? I must admit that every few years I succumb to the siren call of voice recognition and buy another program to see if there is a way to really speed up all this typing and data entry. Every single time, I have found the error rate to be unacceptable. We went through the whole process at our law firm and tried mightily to get voice recognition to work. It lasted about a week or so. Even with the stupendous computer capacity we now have, adequate voice recognition is always just beyond our grasp.

Voice recognition works in environments where the same commands or limited dialogue are used over and over again. My last attempt was the transcription of one of my ancestor's journals. Even though I had the program working correctly and used a USB digital microphone, the transcription still came out as hash. I spent more time trying to find all the word substitutions and typos than I would have spent keying in the whole thing in the first place. To those of you who are presently enjoying the benefits of voice recognition and are thinking about writing me and telling me how I have missed the boat, try dictating a will from the 17th century or an old deed.

As an attorney, I cannot have an error rate. I have a big enough problem with typos that escape our review and cannot afford to have yet another unpredictable system messing up my documents. The same thing goes for all my genealogy. It is hard enough to be accurate without having a system with a built-in error rate. Once again, the same thing goes for trying to use voice commands to control your computer. If I could not type, I would have no other choice, but I have never seen voice commands work as quickly as I do with a track pad, a mouse or the keyboard.

I think it is important to insert a reality check at this point. Voice recognition is a blessing to those who cannot physically operate a keyboard and mouse. In those circumstances, the error rate is acceptable because there are few alternatives. My perspective is admittedly unique: I spend virtually all day and most of the night working on a computer. I simply cannot put up with an error rate.

OCR (optical character recognition) is in the same category as voice recognition. It works well with mail as long as there are real people checking any rejected items. It also works well depositing checks in an ATM, except when it doesn't. The same problem exists with OCR as with voice recognition: an unacceptably high error rate (unacceptable, that is, for genealogy), especially with faded text, odd fonts or weird angles. Don't even think about OCR of handwriting yet. How am I supposed to teach the program the words when I can't pronounce them myself? I have found OCR to work acceptably with typed material where accuracy is not crucial. I will use it if I really don't feel like typing in pages of printed text. It is always an available alternative because I always have an OCR program or two on my computer along with the scanning software.

On the other hand, voice recognition is another world. You can't just grab the microphone and start in dictating long text documents despite the claims of the software developers. Look at the ads. Dragon NaturallySpeaking 11 Preferred or Dragon Dictate has a "potential for 99 percent accuracy." Personally, if I am doing genealogy, I don't want 1 out of every 100 words to be wrong. I make enough typos on my own, thank you. If you read a few software reviews of the most popular voice recognition programs and if the reviewer is honest, he or she will always end up mentioning the editing software they use to fix the errors inherent in voice recognition.
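To put that "1 out of every 100 words" in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on an assumption of mine, not on anything the software vendors publish: that every word is recognized independently with the same accuracy. Real errors tend to cluster around unusual names and old spellings, so treat the numbers as illustrative only. The function names and the sample document lengths are made up for this example.

```python
def expected_errors(word_count: int, accuracy: float) -> float:
    """Expected number of wrong words, assuming each word is
    independently correct with probability `accuracy`."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return word_count * (1.0 - accuracy)


def chance_of_clean_text(word_count: int, accuracy: float) -> float:
    """Probability that every single word comes out right,
    under the same independence assumption."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return accuracy ** word_count


if __name__ == "__main__":
    # Hypothetical document lengths: a short note, a page, a journal chapter.
    for words in (100, 500, 5000):
        errors = expected_errors(words, 0.99)
        clean = chance_of_clean_text(words, 0.99)
        print(f"{words:>5} words: about {errors:.0f} errors expected, "
              f"{clean:.1%} chance of no errors at all")
```

Under these assumptions, even a 100-word passage dictated at 99 percent accuracy is more likely than not to contain at least one mistake, which is why the proofreading never goes away.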

But if you are like me, you will look at the price and then look at the time you are spending typing, and you will end up trying the programs again and again. Who knows, maybe the next round of updates will overcome my concerns? Now, let's see, how much is Dragon Dictate on Amazon?

2 comments:

  1. Years ago I helped proofread the Mayflower Descendant that had been OCRed to a CD. I did volume 20 and the error rate was more than 50%. It had a lot of problems with the letter "L" and the number "1", but the original Mayflower Descendant had been done on a typewriter which used the letter "L" for the number "1". It also had a lot of problems with the letters "u" and "n" because a lot of them were real faint at the top or bottom. Glad to see OCR is getting better, but still not good enough to be a lot of help yet.

  2. I tried OCR and Dragon NaturallySpeaking quite a few years ago. Like you, I gave it up, and never turned either on again. I found both to be horribly frustrating, and who needs ANY more frustration??
