Saturday, January 06, 2007

Voice recognition

Richard Powers at the NYTBR on having dictated his last half a million words to a computer while lying in bed. An interesting and thought-provoking piece, go and take a look. Here's my favorite paragraph:

Writing is the act of accepting the huge shortfall between the story in the mind and what hits the page. “From your lips to God’s ears,” goes the old Yiddish wish. The writer, by contrast, tries to read God’s lips and pass along the words, via some crazed game of Telephone, to a further listener. And for that, no interface will ever be clean or invisible enough for us to get the passage right. As Bede says of Caedmon, scrambling to transcribe the angelic hymn dictated to him in a dream: “This is the sense, but not the words themselves as he sang them in his sleep; for however well composed, verses cannot be translated out of one language into another without much loss of beauty and loftiness.”

I like the idea of working only in bed (in fact I do mostly work in bed, isn't that awful?!?) but I am wedded to a method that involves writing everything out first in longhand from start to finish without skipping around; there are other things I like about it, including the fact of its being a longstanding and productive habit, but the thing I like most is the effect I believe (hope?) it gives to the final product of a kind of sweeping start-to-finish arc, like a lecture or a sermon with a really clear spoken shape that has to be take-in-able through the ear.

After reading a rave review in the NYT Circuits section this summer of a new voice recognition program called Dragon NaturallySpeaking, I bought a copy in the hope that it would streamline various things: transferring the handwritten drafts to the computer, for instance (though I am a very fast--fast but inaccurate--typist, so this is strictly speaking more useful because it's less stressful/tiring rather than because it's actually more convenient), but also things like comments on dissertation chapters or reader's reports on manuscripts or whatever.

I do think the software's amazing. Example: I spent only about 10 minutes training it, and then it asked for permission to read through all the word-processing files on my computer. I gave it permission. Then I started dictating from the draft pages of the last chapter of my academic book, which includes a bunch of stuff about Gulliver's Travels. And so there I was reading away, and I came to the word Houyhnhnms and I could not believe it, the word just popped up there on the screen in a perfect transcription. So you can see it is well worth getting (the wonders of the modern age!). In the end I don't think it's so good for transcribing academic writing: there are simply too many quotations with unorthodox capitalization and spelling in the stuff I write, and it makes more sense to get it right the first time and with the use of the eye. But I think it will be very useful for the more you-think-it-and-it's-just-a-nuisance-to-write-it-down kind of writing. (My paper comments are notoriously illegible; I always have to meet in person and go through page by page, so I must get more systematically on the Dragon NaturallySpeaking thing when I'm giving suggestions for revision.)

1 comment:

  1. Speech recognition is definitely going to be the interface of the future. Just think, instead of keyboard/mouse, you use animated avatar/text-to-speech/voice recognition. It's a whole new way of interacting with the machine. Check out the video demonstration at: YouTube.

    This software runs on any Windows XP computer. It works great--even on my laptop, which is only 600 MHz. Definitely worth further investigation. This is a new concept: audible computing.