Consonants Not Required
billybob2001 writes: "A report at the BBC explains how voice-control of computers can be more successful using grunts and sighs, as "voice recognition programs often failed to accurately capture words". Dr Takeo Igarashi, of Brown University, suggests the use of "ahhhh" for skipping tracks on a CD, or adjusting TV volume, but I wonder what the effect would be on pr0n sites? Another suggestion is "uh oh" for undo. Perfect for online banking. Is this going to confuse your system or what?"
It's cute, but... (Score:5, Interesting)
What I'd worry about is whether these unarticulated sounds sound more like background noise than articulated speech; if so, then you've made the situation worse by making it harder for the computer to know when you're talking to it.
On "uh oh": Dragon Dictate (discrete speech recognition from a few years ago) used "oops" for telling the SR system when it made a mistake; it was reasonably easy to distinguish from words that you actually wanted to put into your text with any frequency.
No, this is serious academic research! (Score:3, Interesting)
Seriously. I have colleagues that work on this type of thing:
"Sound Symbolism in Conversational Grunts in English"
"The Challenge of Non-lexical Speech Sounds"
"Issues in the Transcription of English Conversational Grunts"
http://www.sanpo.t.u-tokyo.ac.jp/~nigel/publicati
Tim Allen will love this (Score:3, Interesting)
This really strikes me as the verbal equivalent of Palm's Graffiti - if normal interaction (printing/speaking) is too hard, make a simplified interface (Graffiti/grunting) that isn't.
I don't know, but I already learned one interface (typing) to make my computer's life easier. Why should I do all the work?
Bad idea from a linguistic standpoint (Score:5, Interesting)
But voice activated systems are stupid, anyway...speech is one of the slowest forms of human interaction, and is one of the few we have to actively concentrate on to perform. You know when people say, "Think before you speak?" That's because once you start speaking a large portion of your brain activity is devoted to doing so...it actually becomes harder to think about what to say next. Pressing a button or turning a dial takes practically no thought...which is another reason why a speech written in spontaneous draft still sounds better than one that is spoken aloud. If we convert machines to speech recognition, we're effectively asking people to interact with them in dumber ways. And can you imagine the logic involved with processing a fairly simple statement like "This check in my hand should be processed by you and in return I'd like fifty bucks in tens and ten one dollar bills." Since the command isn't linear, the machine not only has to recognize what each word means, but also try to interpret them in sequence. And if humans can't construct complicated sentences like the one above -- which any human over the age of about 4 can understand, before that kids can't identify the subject and object in complex sentences -- they'll be inconvenienced by speaking machines. Oh and for a simpler example, try this: "My PIN number? 376 uhhhhhh...Forty-two thirteen...aaaaaaaaaaaand...is it six? no. Eight?...oh! oh! sixty eight!" A human can understand that...we'd be annoyed, but we'd get it.
Background noises deleted my HDD! (Score:4, Interesting)
How selective would the speech recognition be? If I was playing music on that computer, would the computer pick up the tones coming in and start "doing stuff(tm)" on my computer? What about background noises? My friend's Jello Biafra spoken word CDs?
I won't even go there with my Saturday Morning Cartoon CD - Eep Opp Ork Ah-Ah (This means mail all of my friends a copy of my resume)...
Re:Typing vs. speech (Score:3, Interesting)
Ok, that sounds fair, but I guess you'd want to have some sort of benefit after you invest your time?
I just don't see this sort of interface catching on for standard applications. I mean - imagine you are in an office with 20 people grunting at their computers; the noise they make is just going to be unbearable. That's got to be worse than that annoying guy who's checking his voicemail via speaker phone. *shudder*
From the article:
By increasing the pitch of your voice, the scrolling speed increases. When you stop speaking, the scrolling ends.
Can you imagine sitting next to a guy who uses this, and not having a headache after 10 mins?
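For what it's worth, the behaviour quoted from the article is easy to sketch. This is only an illustration under stated assumptions: the function name `scroll_speed`, the 100-400 Hz pitch range and the speed limits are all made up here and do not come from the article or from Dr Igarashi's system. Pitch detection is assumed to happen elsewhere, with `None` standing for "nobody is making a sound".

```python
from typing import Optional

# Hypothetical constants; none of these values come from the article.
MIN_PITCH_HZ = 100.0  # assumed lowest pitch that counts as voiced input
MAX_PITCH_HZ = 400.0  # assumed pitch at which scrolling is fastest
MIN_SPEED = 1.0       # assumed slowest scrolling speed, lines per second
MAX_SPEED = 20.0      # assumed fastest scrolling speed, lines per second


def scroll_speed(pitch_hz: Optional[float]) -> float:
    """Map a detected voice pitch to a scrolling speed.

    None means no voice was detected, so scrolling stops.
    """
    if pitch_hz is None:
        return 0.0
    # Clamp the pitch to the assumed range, then scale linearly.
    clamped = max(MIN_PITCH_HZ, min(MAX_PITCH_HZ, pitch_hz))
    fraction = (clamped - MIN_PITCH_HZ) / (MAX_PITCH_HZ - MIN_PITCH_HZ)
    return MIN_SPEED + fraction * (MAX_SPEED - MIN_SPEED)
```

With these made-up numbers, a hum at 250 Hz scrolls at 10.5 lines per second, and going quiet drops the speed to zero. A real system would presumably smooth the pitch over time as well, which this sketch leaves out.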
Undo (Score:3, Interesting)
"Uh oh"
On another note, I knew a guy who worked with voice rec software where the delete-word command was "oops". Whenever he would watch another person typing and they would typo, he would instinctively say "oops". I'm guessing it's kind of how my writing went bad when I was using Graffiti a lot. You get used to these quirky mannerisms you use to control the machines. Then you end up looking like a dork and annoying the people around you.
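A rough sketch of how a delete-word command like that could behave on a stream of recognized words. It is not how Dragon Dictate or any other real product implements it; `apply_dictation` and `UNDO_WORD` are hypothetical names, and the recognizer is assumed to hand over one word at a time.

```python
from typing import Iterable, List

UNDO_WORD = "oops"  # the delete-word command mentioned above


def apply_dictation(recognized: Iterable[str]) -> List[str]:
    """Build the dictated text from a stream of recognized words.

    Each occurrence of UNDO_WORD removes the previously kept word
    instead of being added to the text.
    """
    text: List[str] = []
    for word in recognized:
        if word.lower() == UNDO_WORD:
            if text:  # nothing to delete if the text is empty
                text.pop()
        else:
            text.append(word)
    return text
```

For example, `apply_dictation(["send", "the", "male", "oops", "mail"])` returns `["send", "the", "mail"]`. The obvious cost is that you can never dictate the word "oops" itself, which is why a command word that rarely shows up in real text is the sensible pick.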
Re:It's cute, but... (Score:3, Interesting)
When people speak, it is the consonants that matter. Ever try listening closely to someone with a pronounced regional accent? The vowels are all jumbled up but the speech is still intelligible. IIRC, people tried to teach gorillas to communicate using different grunts, and gave up in favor of sign language. Reason being that you *can't* string two different vowels together without a consonant in between and have it be intelligible.