
Tuesday, March 12, 2019

What is happening today with speech recognition?


Over the past few years, speech recognition software has moved from obscurity to the main stage. We are almost saturated with automatic answering software when we call larger businesses, and speech recognition on smartphones has also become ubiquitous. But is this a serious advance or merely a novelty? Can you use speech recognition to do real work, or only for casual and somewhat quirky communication?

I have been assessing the capabilities of speech recognition for many years and have written about my experiences off and on during that time. A little history might help give some perspective.

Speech recognition has been under development for many years. You might recall HAL 9000, the computer in the 1968 film 2001: A Space Odyssey. Here we are about fifty years later, and a long time past 2001, and the capabilities of the HAL 9000 still seem like science fiction. Development of speech recognition began long before 1968. The first speech recognition software/hardware combination was developed by Bell Laboratories in 1952 and could recognize digits spoken by a single voice. Ten years later, in 1962, IBM demonstrated its Shoebox machine at the World's Fair in Seattle. I saw that demonstration, and it jump-started my interest in speech recognition software. The Shoebox system recognized 16 spoken words, including the digits 0 through 9.

By the 1970s, speech recognition had advanced through the Department of Defense's DARPA Speech Understanding Research program, which ran from 1971 to 1976. The program's system was called "Harpy" and could understand 1,011 words. Progress in speech recognition was incremental. I was aware of Apple's limited integration of speech recognition in 1993; a few years earlier, in 1990, Dragon Systems had launched Dragon Dictate. My own introduction to the product probably came with Dragon Dictate for Mac. The addition of editing commands changed these products from limited-use novelties into useful tools.

However, my full use of speech recognition developed slowly over time. By the way, see if you can tell where I switched from manual typing to speech recognition in this post. Then, in October of 2018, Nuance, the present developer of Dragon NaturallySpeaking, announced that it was discontinuing support for Apple products.

The key factor in using speech recognition for day-to-day document production is the software's ability to edit. Even the most advanced software I have ever used is still cranky. Smartphone-based systems are little more than toys. Using voice to write text messages is always a source of amusement. Good voice recognition requires a very quiet environment and a speech pattern that clearly articulates each word. Granted, the entire idea of using speech recognition has moved into the mainstream of our computer usage, but the advances in technology are still only partially implemented. The discontinuance of Dragon Dictate for the Mac is a serious setback.

Both Microsoft and Apple include speech recognition software as part of their operating systems. Each time there is a system upgrade, I try the latest speech recognition software from Apple and from Microsoft, and I still find both almost entirely lacking in editing ability. For the time being, I will continue to use Dragon Dictate until it becomes inoperable because of operating system changes. Hopefully, while the program still operates, other systems will be developed or Apple will improve its basic speech recognition program to implement adequate editing tools.

If you are using a Windows operating system, the best program is still Dragon NaturallySpeaking. Unfortunately, some new speech recognition programs are being offered as subscription services, with payment by the word or for time spent in dictation. Right now, it looks like Google Voice Typing is a good alternative, with the limitation that it only works with Google Docs and the Chrome browser.

I will keep looking.

