Some people eat, sleep and chew gum, I do genealogy and write...

Saturday, July 7, 2018

Using Voice Recognition in Genealogy: Names and dates and places


Over the years, I have used voice recognition software off and on, always hoping that it would become the solution to quickly entering information so that I could avoid typing. Most recently, voice recognition has become ubiquitous on smartphones through apps such as Siri, Google Assistant, and many other such programs. For example, when I get a phone message on my iPhone, the message is automatically transcribed into text.

Significantly, all three of the major operating systems from Microsoft, Apple, and Google include a voice recognition program. Unfortunately, more sophisticated voice recognition software is limited to a single major product line: Dragon NaturallySpeaking for Windows and Dragon Dictate for Apple OS X, both from Nuance.com. I say unfortunately because the programs built into the operating systems lack any significant editing capabilities, and there appear to be no other programs competing with the expensive ones from Nuance.com.

Because I see a distinct advantage in using voice recognition for a lot of reasons, I keep upgrading my copy of Dragon Dictate even though the upgrade prices seem outrageous.

Now, supposing that you have spent your money on a voice recognition program such as the ones from Nuance, how well does it work? This has been the issue since the beginning of the development of voice recognition software by IBM many years ago. In fact, Nuance.com is the present developer of the voice recognition software originally developed by IBM. However, IBM is still in the market and is selling Watson Speech to Text as an API add-on for software developers.

Two factors severely limit voice recognition: words that sound the same but are spelled differently and background noise. For genealogists, the real challenges are names and places. Most of the text that we deal with contains a fair amount of both. The programs usually get dates correct.

How well does an expensive commercial program such as my version of Dragon Dictate do with names? Here is a list of five names from my ancestors:
  • Samuel Linton
  • Marinus Christensen
  • Adeline Springthorpe
  • Margaret Turner
  • Sarah Foscue
Here is what I get when I read these names using Dragon Dictate:

  • Samuel Benton
  • Marina's Christiansen
  • Adeline spring Thorpe
  • Margaret Turner
  • Sarah Foster you
There is a way to train the program but it would be extremely tedious to try to enter thousands of names. Switching to places, here are five place names, again, from my ancestors.
  • San Bernardino, Los Angeles, California, United States
  • Ramsey, Huntingdonshire, England, United Kingdom
  • Brookfield, New South Wales, Australia
  • Whittlesey St Mary & St Andrew, Cambridgeshire, England
  • Toquerville, Washington, Utah, United States
Here are the same place names entered using the voice recognition software:
  • San Bernardino, Los Angeles, California, United States
  • Ramsey, Lincolnshire, England, United Kingdom
  • Brookfield, New South Wales, Australia
  • Whittlesey St. Mary and St. Andrew, Cambridgeshire, England
  • Tocqueville, Washington, Utah, United States
The second list looks superficially correct. However, the errors are not so noticeable, and they would take more time to find and correct than it would take to retype the list. Here is what I get when I try to say Huntingdonshire several times:
  • Lincolnshire
  • Huntington Shire
  • Huntingdon Shire
  • Huntington
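Because errors like these are easy to miss by eye, one option is to check dictated entries against places already recorded in a family tree. The following is only a minimal sketch of that idea, not a feature of Dragon Dictate: it uses Python's standard `difflib` module, the `KNOWN_PLACES` list and the `check_dictated` function are hypothetical names made up for this illustration, and the 0.6 cutoff is simply the `difflib` default. The same check would work for personal names.

```python
from difflib import get_close_matches

# Hypothetical reference list: places already recorded in a family tree.
KNOWN_PLACES = [
    "San Bernardino, Los Angeles, California, United States",
    "Ramsey, Huntingdonshire, England, United Kingdom",
    "Brookfield, New South Wales, Australia",
    "Whittlesey St Mary & St Andrew, Cambridgeshire, England",
    "Toquerville, Washington, Utah, United States",
]


def check_dictated(dictated, known=KNOWN_PLACES, cutoff=0.6):
    """Return (dictated, suggestion) pairs for entries not found in `known`.

    The suggestion is the most similar known entry, or None when nothing
    in the reference list is similar enough.
    """
    known_set = set(known)
    flagged = []
    for entry in dictated:
        if entry in known_set:
            continue  # exact match, nothing to review
        matches = get_close_matches(entry, known, n=1, cutoff=cutoff)
        flagged.append((entry, matches[0] if matches else None))
    return flagged


# The place names as the voice recognition software entered them.
dictated_places = [
    "San Bernardino, Los Angeles, California, United States",
    "Ramsey, Lincolnshire, England, United Kingdom",
    "Brookfield, New South Wales, Australia",
    "Whittlesey St. Mary and St. Andrew, Cambridgeshire, England",
    "Tocqueville, Washington, Utah, United States",
]

for entry, suggestion in check_dictated(dictated_places):
    print(f"Check: {entry}")
    print(f"  closest known place: {suggestion}")
```

A check like this only flags entries for review. It cannot tell whether the dictated entry or the reference list is the one that is wrong, so a person still has to make the final call.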
All in all, the place names come out much better than the names of the people. Even though the text has a high degree of accuracy using voice recognition, it is imperative that any text dictated be carefully edited. Here is an example of some text from a biography. I will mark the typos after I dictate the text.
Thomas Parkinson was born on December 11, 1830, in Cambridge Shire, England, the second son of James Parkinson and Elizabeth chattel. His father, James, was born in the not-too-distant hamlet aforesaid, Huntington Shire England and was also a farmer, as was his father before him. Students of English history no way to wealthy economic struggle in the 1830s and 40s, especially for the tenant farmer. Until that time agriculture have been the nation's mainstay, but with industrialization the best a man could hope for was steady work.
Here is the edited version of the same text.
Thomas Parkinson was born on December 11, 1830, in Cambridgeshire, England, the second son of James Parkinson and Elizabeth Chattle. His father, James, was born in the not-too-distant hamlet of Farcet, Huntingdonshire, England, and was also a farmer, as was his father before him. Students of English history know only too well the economic struggle in the 1830's and 40's, especially for the tenant farmer. Until that time agriculture had been the nation's mainstay, but with industrialization the best a man could hope for was steady work. 
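When both a dictated and a corrected version of a passage exist, the differences can also be listed automatically rather than hunted for by eye. Here is a rough sketch, again using Python's standard `difflib` module. The `mark_changes` function is a hypothetical helper written for this illustration, and it compares the texts word by word, so a change in punctuation or capitalization counts as a changed word.

```python
from difflib import SequenceMatcher


def mark_changes(dictated, edited):
    """Return (dictated_words, edited_words) pairs wherever the texts differ."""
    a, b = dictated.split(), edited.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join the differing runs of words back into short phrases.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


# First sentence fragment of the biography, as dictated and as edited.
dictated = ("in Cambridge Shire, England, the second son of "
            "James Parkinson and Elizabeth chattel.")
edited = ("in Cambridgeshire, England, the second son of "
          "James Parkinson and Elizabeth Chattle.")

for before, after in mark_changes(dictated, edited):
    print(f"{before!r} -> {after!r}")
# prints:
# 'Cambridge Shire,' -> 'Cambridgeshire,'
# 'chattel.' -> 'Chattle.'
```

This only helps after the editing has been done. It shows what was changed, which is a quick way to see what kinds of words the software tends to get wrong.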
When I am dictating using voice recognition, I usually watch what is entered very carefully and make corrections as I go along. However, you can see why some would conclude that the program is not worth the effort of training it and editing its output.

However, as voice recognition software becomes more ubiquitous, it is entirely possible that it will become more accurate and therefore more useful. Right now, it still has a ways to go.
