
Speech In, Speech Out
Speech Recognition
Continuing in the vein of software interfaces, I just played with a fun little demo for a game that uses speech recognition as its sole interface. I was pretty impressed with the accuracy of the recognition; they made their task easier by limiting the vocabulary, but they’re planning on running this on PlayStation 2 hardware. It’s still not quite within reach, but we’re getting closer and closer to my ideal of an interface that comes close to being transparent. Yes, I’m aware that even with really good speech recognition, there’s a lot of work left to do to implement a decent conversational interface, but hey, every little step counts.

Anyway, the game is called Lifeline. I’m excited by the interface even if the game itself doesn’t look like it rises much above a run-of-the-mill shooter. Still, the possibilities opened up here are amazing, for all sorts of games. Even action titles could benefit from the ability to access thousands of commands with just a spoken word.

Text To Speech
After taking a look at the Lifeline demo, I followed a reference to the company that made the speech software, Scansoft. This company makes a lot of interesting software, but the bit that blew me away is called Speechify. They even have an interactive demo (it reads your text in a number of different voices) that astonished me. I’ve long been frustrated at the state of the art in this area, but it looks like these folks are starting to get it right.

So go right now to the interactive demo of Speechify and be amazed. If you really want to freak yourself out, paste in some bits of your own writing. I tried it with a few blog entry fragments and was quite impressed (and a bit unnerved). Sure, there are rough spots here and there, but I think I could listen to one of these voices for some time and not get driven up the wall.

An interesting bit is that I found the English and Australian voices to be more tolerable than the American voices.
I imagine that it’s because speech tonal patterns are hard to get right, and the more familiar you are with a particular mode of speech, the more you’ll notice when the subtleties are off. However, if you can cast the speech in a more unfamiliar light (say, using an English accent and some associated tonal patterns), it ends up sounding a lot more plausible. That’s my guess, anyway.

posted on 3/06/2004 11:14:00 PM