Thursday, May 06, 2010

Speech Unrecognition

This is one of the coolest things I have read in a long time. Worth reading the whole (very long) thing.

It is very interesting that there is a huge difference between speech-to-text and text-to-speech. I often use Adobe's listen-to-text feature to hear a dissertation or a paper read aloud to me on a long drive. I just plug my laptop into the auxiliary jack on the BMW (it's supposed to be for your iPod or MP3 player, but it works with anything that has an audio jack), then start up the PDF reader and click on "Read out loud" (which, bizarrely, is in the "View" menu on Adobe).

Wouldn't work well on a diss with a lot of tables and equations, but works fine for lots of theses in poli sci.

It is rather amazing that the reverse process, speaking and having the computer record the words, basically doesn't work at all. Optical scanning works quite well, with error rates below 5%. But audio speech-to-text... 80% accuracy, tops, and even then you are better off just typing up the recording yourself, for most purposes.

2 comments:

Shawn said...

with google now doing google voice voicemails automatically transcribed to text, I'm willing to bet the tech will come a long way in the next couple years...if you get a voicemail that isn't transcribed well (and, admittedly, a good chunk of mine aren't), you can 'donate' it to the goog for them to fix it.

it is pretty slick, though, in that it will send you an email transcription of the voicemail (and you can have a link in your gmail inbox to directly listen to the vm on your computer), and your vm's are available on the web at your googlevoice inbox.

works pretty well for when you're out of the country and don't want to bother with having incoming calls, but don't want to put an 'outgoing vaca message'.

anyway...yeah...goog will figure it out. :)

Unknown said...

Google is part of the world that works...