Vexatious Voice Recognition
One of my colleagues recently sent me a link to this YouTube clip--one programmer's attempt to write a brief segment of computer code using Windows Vista's speech recognition.
As you might expect, things don't go well. Vista keeps interpreting his editing commands as text that he wants to "type," while he keeps making the mistake of talking to the computer ("thank you"). The cursing sets in at about the fourth minute, and eventually the exasperated user breaks down and starts editing with the keyboard.
The whole thing is both hilarious and profoundly cringe-inducing, like a computerized equivalent of the answering-machine scene from Swingers. It's funny not so much for what it shows about malfunctioning computers--this demo was almost bound to fail, since the vocabulary and grammar of the Perl programming language scarcely resemble that of ordinary English--but for its exhibition of how quickly anybody can lose it when faced with militantly uncooperative software. Admit it: You've been this guy!
For what it's worth, I tried to repeat the exercise myself on a Vista machine at home. I breezed through the first line easily--and was feeling pretty good about myself at that point--but two-thirds of the way through the second line, the wheels came off the bus. Vista started treating my editing commands as text to input; after spending a few futile minutes flailing away as the screen filled with "delete that" and "undo," I abandoned the exercise.
The text of this Perl script, as near as I could make it out from the YouTube clip, appears after the jump; if you care to run your own test in Vista (or other speech-dictation software), please share your report in the comments.
open(INFO, '-');
@input = <INFO>;
close(INFO);
($string, $times) = @input;
print $string x $times;
By Rob Pegoraro | May 1, 2007; 2:21 PM ET | Category: Windows
Posted by: Sara | May 1, 2007 3:16 PM
I'm a casual user of Dragon 8, not an expert by any means. I was able to produce the Perl code with just a moderate amount of bumbling around. I've spent a few hours training Dragon to my voice. Here's a capture of the results:
Posted by: Jay | May 1, 2007 7:01 PM
But then there's this one, that actually works (using NaturallySpeaking):
http://www.youtube.com/watch?v=A7f9Iik3q58
I developed some earlier techniques, visible here:
http://www.voicerecognition.org/developers/jepstein/pbvdemos/
Posted by: Jonathan Epstein | May 1, 2007 9:51 PM
Yeah, Dragon is the best on the market. Vista is not going to put them out of business anytime soon.
Posted by: James | May 1, 2007 9:52 PM
That's one 'vexatious' misspelling, Rob. No copy-editors watching over the blog? I enjoy your work -- keep the interesting columns coming.
Posted by: George | May 2, 2007 1:43 PM
Well, I'm not partial I suppose, but do find a Natural Language bridge interesting with these results to program flight control. A user-generated dictionary, or 'voice macro', may seem to translate it better? (Pentium-based, I think)
MANUAL VERSUS SPEECH INPUT FOR UAV
http://www.hec.afrl.af.mil/Publications/HFES03VoiceFinal%20version2.pdf
Posted by: DBS | May 2, 2007 1:50 PM
Picky, picky :) That misspelled headline has been corrected. I thought Firefox's spell-checking would have warned me about this, but I guess it doesn't work on every text-input field.
- RP
Posted by: Rob Pegoraro | May 2, 2007 2:06 PM
"The vocabulary and grammar of the Perl programming language scarcely resemble that of ordinary English"
You could say that about governmentese acronyms too... or engineering formulas.
As someone who is hearing impaired and used to work for the government, I came across a lot of situations where speech recognition software didn't think people were speaking English. I had much better luck using a real person and a stenography machine when I wanted to 'caption' a lecture or meeting.