Woudhuysen

Voice-operated: a word-of-mouth success

First published in Computing, March 2006

Speech-to-text tools are improving and can be a real boon for people who find typing difficult

Having long called for voice-operated interfaces, I’m finally using a speech-to-text application – to dictate this column, no less – and the surprising thing is, it more or less works.

Just before Christmas, I contracted a very painful case of tennis elbow in my right arm. Too much work at the keyboard. By the New Year, I was in physiotherapy. The physio told me to stop all work for three weeks. Given how impossible that was, I soon became desperate enough to research, finally, software packages that would allow me to work without using my right arm.

Poor accessibility has a price

To my surprise, Apple does not make its own speech-to-text software. To my greater surprise, however, IBM makes and Amazon sells IBM ViaVoice 3.0 for Mac OS X version 10.3. After waiting in vain for Amazon to deliver, I phoned the Apple Store in London’s Regent Street and was told they had one shrink-wrapped cardboard box of goodies left in stock.

At closing time, 9pm, one cold Monday night, I slipped into Apple’s very busy shop and bought the box for £90 – £10 more than one pays on Amazon.

For that, I got a CD, an 88-page instruction manual and a rather tight headset made in China. Wrestling my Apple preferences from US English to British English, I successfully installed what turned out to be ViaVoice 3.2. I then spent an afternoon training my machine to understand me by reading it passages from Alice in Wonderland.

After the training, I found that my Apple did understand most of the words I read to it. You have to keep your head up and enunciate very clearly, and slowly too. The machine adjusts for birds and passing aeroplanes if you trained it while they were overhead; but it does not adjust to incoming phone calls, coughs, or me clattering around my desk. Nevertheless, its initial performance is quite promising.

The main problem: the instruction manual. One needs to master some key commands – for example, “Begin spell”, or “Capitalise on”. Yet I have to turn to pages 41 and 21 to find out how. The manual’s “Welcome” section runs to 25 pages, and the rest makes a meal of navigating with commands, and of corrections.

It turns out that I want to navigate not by voice, but by using my left hand. Given my incapacity, the main thing is that, most of the time, I’m now able to dictate large amounts of simple text quickly, without making many corrections. IBM wants me to make those corrections by voice, in a special window, so that my computer can learn to understand me better. In fact, I again prefer to use my left hand.

The moral of my tail? As you can see from that mix-up of homonyms, there’s a lot of fun to be had with ViaVoice. It recognises words like homonyms and ViaVoice straight off the bat. What it still sorely needs is a better understanding of human factors, learning versus doing, and the realities of keyboard pain and keyboard use. Nevertheless, IBM has shown us the future, and for that my right arm is very grateful.
