US Intelligence Unit Launches $50k Speech Recognition Competition
coondoggie writes The $50,000 challenge comes from researchers at the Intelligence Advanced Research Projects Activity (IARPA), within the Office of the Director of National Intelligence. The competition, known as Automatic Speech recognition in Reverberant Environments (ASpIRE), hopes to get the industry, universities or other researchers to build automatic speech recognition technology that can handle a variety of acoustic environments and recording scenarios on natural conversational speech.
How about this one? (Score:5, Insightful)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
"Go fuck yourself."
Better idea. Enter the competition, use already well-developed commercial software (or write a program to average the results of several commercial programs), and easily win the competition. It's not like anyone is going to create software worth millions and give it away for a tiny prize.
Re: (Score:2)
They would do better to start recognizing people's right to free speech.
Which includes the right not to be pestered by the government (as in: put under surveillance) for it.
Eh, aren't they trying? (Score:3, Interesting)
Seems they are appealing to any random developer who might have an idea.
Re: (Score:3)
All the speech recognition software I've used has relied on a controlled environment (e.g. yelling directly into your phone with almost no reverberation, no competing conversations, very little background noise).
Reverberation *should* be the easiest kind of noise to remove, because it has a simple mathematical model:
S(t) = signal(t) + f(signal(t - delay))
Where f() is a pretty simple function that may attenuate some frequencies mor
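A minimal sketch of the single-echo model in that comment, assuming discrete samples, an integer delay of at least one sample, and f() reduced to one flat gain (the comment allows f() to be frequency-dependent). The names `add_echo`, `remove_echo` and `gain` are made up for illustration:

```python
def add_echo(signal, delay, gain):
    """Return S[t] = signal[t] + gain * signal[t - delay].

    Assumption: f() is a flat attenuation (one gain for all frequencies).
    Samples before the start of the signal are treated as silence.
    """
    out = []
    for t, sample in enumerate(signal):
        echoed = signal[t - delay] if t >= delay else 0.0
        out.append(sample + gain * echoed)
    return out


def remove_echo(observed, delay, gain):
    """Invert add_echo when delay (>= 1) and gain are known exactly."""
    clean = []
    for t, sample in enumerate(observed):
        # Subtract the echo of the already-recovered clean sample.
        echoed = clean[t - delay] if t >= delay else 0.0
        clean.append(sample - gain * echoed)
    return clean


dry = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0]
wet = add_echo(dry, delay=2, gain=0.5)
recovered = remove_echo(wet, delay=2, gain=0.5)
```

With the delay and gain known, the echo subtracts out exactly. In a real recording neither is given, which is where the difficulty starts.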
Re: (Score:3)
All the speech recognition software I've used has relied on a controlled environment (e.g. yelling directly into your phone with almost no reverberation, no competing conversations, very little background noise).
...
Modelling all the other kinds of background noise is much, much harder.
I agree, but the issue is this problem is harder than those that industry leaders are putting billions of dollars of R&D money into. What is $50k really going to accomplish? There are Kaggle competitions that pay out more than that for far more trivial problems (like a marginal increase in CTR prediction).
Re: (Score:2)
It's not that simple: a reverberant space can have dozens of different discrete delay taps. Add secondary (and tertiary, etc.) reflections and the resulting spectral envelope is just a fog with an effectively continuous system of delays. Also keep in mind that all "functions that attenuate frequencies" are themselves just delays whose length is a function of a parti
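To illustrate the point about many delay taps, here is a rough sketch that treats the room as an impulse response and reverberation as the convolution of the dry signal with it. The tap positions and gains are invented for the example, not measured from any real room, and the helper names are hypothetical:

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def impulse_response_from_taps(taps, length):
    """Build an impulse response from {delay_in_samples: gain} taps."""
    h = [0.0] * length
    for delay, gain in taps.items():
        h[delay] += gain
    return h


# Hypothetical room: direct path plus three reflections.
taps = {0: 1.0, 3: 0.5, 5: 0.25, 8: 0.125}
room = impulse_response_from_taps(taps, length=9)

click = [1.0]                  # a single-sample click
wet_click = convolve(click, room)

two_clicks = convolve([1.0, 1.0], room)
```

A single click simply reproduces the impulse response. As soon as the input has more than one sample, the reflections of different samples overlap, and with dozens of taps the output stops looking like a handful of clean echoes.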
Re: Nope. (Score:2)
Only 50k to sell my soul for having them spy on more people... including myself?
Nope.
Of course not you - but the kinds of people who will submit are going to get job offers from the NRO. They are willing to make that deal, they're not bright enough to run off to industry, and they might have a glimmer of talent that cannot be cultivated in the university system. Plus, $50k isn't enough to quit and start a company, so it's a well-considered recruiting effort.
50k? (Score:2)
Call Nuance [wikipedia.org] and tell them you are going to make a money injection in their R&D dept.
I'm sure your 50k will make a real impact, when added to their 1.9 billion dollar revenue.
Re: (Score:1)
Thing is, every huge company has a core idea (perhaps built by the founders on a weekend) that they're just milking for all it's worth... the $50k might motivate a lone wolf developer to build something that's qualitatively better than the multibillion-dollar company's core idea.
For example, right now, all sound is filtered, transformed (frequency bands), quantized, and then those values are used to train a hidden Markov model... that works for speakers in a quiet room, but doesn't for noisy environments or
Re: (Score:3)
Thing is, every huge company has a core idea (perhaps built by the founders on a weekend) that they're just milking for all it's worth... the $50k might motivate a lone wolf developer to build something that's qualitatively better than the multibillion-dollar company's core idea.
You may be right. Let's offer $50k to whoever sends another probe to a comet. Sure, it cost the ESA $1.4 billion, but a lone wolf could find a qualitatively better way to do the mission. By February 4, 2015.
Slashdot is the last place where I expected to see an extremely difficult problem underestimated just because it's a computing problem.
Listening through noise or interference (Score:2)
I remember a demo out of IBM, I believe, for recognizing controlled vocabulary in high-noise environments. It handily OUT-performed humans -- listening to the test audio, you couldn't really be sure there was a human voice at all, but the software detected and interpreted the speech with high accuracy.
This demo would have been circa 2000. I can't help imagining that there's been more progress since then.
The proposed task, where the interference is correlated with the original sound, seems like fertile groun
Re: (Score:2)
Well if you convolve a signal with a 1Hz sinc wave the signal is "replicated and redundantly presented" but provably most of the information in the original signal is now lost. Interference correlated with the original sound is a convolution. It destroys information, it has to. It makes the problem harder, that's why our brains can't handle it, although our brains have context-based processing which allows us to recover a lot more than a system without that.
Perhaps. As you can surely tell, this is well outside my expertise.
Also, our brains are not hard-wired.
Forgive my imprecise wording. Brain wiring is malleable, but there's a lot of built-in structure, especially around perception and language processing.
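On the claim in this exchange that convolution can destroy information, here is a small illustration under simplifying assumptions: a two-sample moving average stands in for a low-pass kernel (not the 1Hz sinc mentioned in the comment), and the convolution is circular so the edges do not complicate the picture. The function name is hypothetical:

```python
def circular_convolve(signal, kernel):
    """Circular convolution: the signal is treated as one period of a loop."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(t - k) % n] for k in range(len(kernel)))
        for t in range(n)
    ]


kernel = [0.5, 0.5]                    # crude low-pass: average of two samples

alternating = [1.0, -1.0, 1.0, -1.0]   # highest-frequency tone at this length
silence = [0.0, 0.0, 0.0, 0.0]

smeared_tone = circular_convolve(alternating, kernel)
smeared_silence = circular_convolve(silence, kernel)
```

Both inputs come out as all zeros, so nothing applied afterwards can tell the tone from silence. Whether a real room's response wipes out anything speech recognition actually needs is a separate question this sketch does not answer.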
Re: (Score:2)
Mammalian auditory systems actually have a lot of wiring that seems dedicated to processing reverberation.
I'm not familiar with the IBM demo you mention, but the key there
Re: (Score:2)
I'm not familiar with the IBM demo you mention, but the key there is the controlled vocabulary. It was probably also trained on the speaker's voice. Those are huge constraints.
I'm remembering that it was controlled-vocabulary, but speaker-independent. I think it was trained on spoken digits -- a very small vocabulary. It's been a long time, and I may be misremembering even the most basic details. Still, it was impressive to hear it picking out numbers where all I could hear was noise.
Out of touch with reality (Score:3)
So they want a complex problem solved in 2 months (first test on Feb 4 and there are holidays in between), for which they will pay a relatively low amount and only to the winners. Even if the result wouldn't be used for spying, I don't think there would be many takers.
Re: (Score:3)
I am sick of these "challenges" that effectively try to get programmers to work for well below market rates. As if we were children, a "challenge" is supposed to make us set aside months or years of income to work on a really difficult problem that, if we did it for a company in the job market, we'd be paid $100K/year or more for. I think they probably attract young people who don't understand the value of their own time or skills, or who are more easily lured by childish notio
Re: (Score:2)
I am sick of these "challenges" that effectively try to get programmers to work for well below market rates. As if we were children, a "challenge" is supposed to make us set aside months or years of income to work on a really difficult problem that, if we did it for a company in the job market, we'd be paid $100K/year or more for.
You're completely missing the point. They've found the Stargate and egyptologists are a dime a dozen. They need to form an elite team of programming and AI experts who will decode the symbols on the Stargate and defeat Apophis. This is just a fancy recruitment test.
Re: (Score:2)
Then they should code the tests into the next Call of Duty game. They can call it Call of Duty: Prometheus, featuring alien worlds and starships.
Re: (Score:2)
This is just a fancy recruitment test
I don't think I've missed the point, as I'm saying the same thing - I just think it's a lousy way to do recruitment. Analogy time: Say you want to hire a sex worker. Here are two methods:
1. Go find one that looks reasonable, initiate a negotiation. If you can find a mutually agreeable rate, hire her, otherwise continue looking for another one.
2. Issue a "challenge" to all sex workers. Declare that every day for the next 30 days, every applicant must give you a free b
Re: (Score:3)
Re: (Score:1)
You might have a look at the IARPA releases on this, especially https://www.innocentive.com/ar... [innocentive.com]. Programmers are *not* being asked to release their software rights: "To receive an award, Solvers will not have to transfer their IP rights or grant a license to the Seeker – the purpose of the Challenge is to gauge how far recent advances in speech recognition have come in solving this important problem. With broad participation, this Challenge has the potential to provide IARPA with insights on the be
Re: (Score:2)
So they want a complex problem solved in 2 months (first test on Feb 4 and there are holidays in between), for which they will pay a relatively low amount and only to the winners. Even if the result wouldn't be used for spying, I don't think there would be many takers.
Relatively low amount? For $50k it would have to be coded by volunteers and prison inmates.
"It's breaking rocks with a hammer, being stabbed in the laundry, or coding the speech recognition thing."
"Hmm, the laundry thing seems superfun but I'll pick hammering rocks. Give the coding gig to the guys in death row. They have nothing to lose anyway."
Voice recognition - AI (Score:2)
Given my own personal experience with voice recognition, it's not a problem we can throw money at. We can throw money AWAY trying, but we haven't improved much in many, many years of trying.
I don't have particularly poor speech or an unusual accent, and English speakers all understand me, even foreign English speakers like the one I live with. But speech recognition has always been an absolute flop unless I want to learn how to talk to the computer, which is the exact opposite of what I want to happen.
Si
Re: (Score:2)
Maybe the problem isn't with the AI techniques we're using, it's with the FFT.
FFT assumes a very periodic, stable signal. It doesn't handle transients well at all.
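One way to read that claim, sketched with a plain DFT (what the FFT computes) over a single 8-sample frame: the magnitude spectrum of a click is flat no matter when the click happens, so the timing of a transient lives only in the phase. The frame length and click positions are arbitrary choices for the example:

```python
import cmath


def dft_magnitudes(samples):
    """Magnitude of each DFT bin, computed directly from the definition."""
    n = len(samples)
    mags = []
    for k in range(n):
        total = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(total))
    return mags


early_click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
late_click = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

early = dft_magnitudes(early_click)
late = dft_magnitudes(late_click)

# Both spectra are flat (every bin has magnitude ~1), so magnitudes
# alone cannot say where in the frame the click occurred.
same_magnitudes = all(abs(a - b) < 1e-9 for a, b in zip(early, late))
```

Recognizers that keep only per-frame magnitudes therefore see a transient smeared across the whole frame. Shorter frames localize it better at the cost of frequency resolution.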
US Intelligence (Score:2)
When US Intelligence says something so clearly stupid, you always have to look for the subtext. The hidden message. The truth crouching behind the apparent idiocy.
In this case, the hidden message seems to be "we are incompetent in even the simplest basics of our main task".
Re: (Score:2)
That is indeed the most plausible explanation...
Re: (Score:1)
Can you explain (without revealing your own stupidity) what you think is so stupid about this?
Saw this coming (Score:3)
Competitions (Score:2)
As usual with competitions like this, you shouldn't settle for the prize money if you develop such a thing, because it's worth quite a bit more.
Re: (Score:1)
Go read the solicitation. They thought of that. It won't work.
(pinky to mouth) (Score:2)
Fifty THOUSAND dollars!
Re: (Score:2)
Nice one. You beat me to it.
omission (Score:1)
Coincidentally, this competition, by its very introduction also reveals a method for making massive automated eavesdropping difficult. Unless it produces a success, that is.
Contest? (Score:1)
My Application (Score:2)
https://www.youtube.com/watch?... [youtube.com]
intelligence? (Score:3)
Readying Mechanical Turk...FTW? (Score:2)
Let's see...for $50K...I could probably write up a quick mobile app ($1K) that feeds microphone input into a streaming acceptance service on a server ($3K), that chops it up into wav files for Mechanical Turk processing. Fund that long enough to pass the POC stage ($2K), ride some odds (25%) and cash the check before the tech collapses = $6K for possible $12.5K win = $6.5K possible profit? Er...still no.
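The back-of-the-envelope numbers in that comment, written out as a quick check. The figures are the commenter's own guesses, and the labels are paraphrased:

```python
# Rough cost guesses from the comment, in dollars.
costs = {
    "mobile app": 1_000,
    "streaming server": 3_000,
    "proof-of-concept run": 2_000,
}

prize = 50_000
win_probability = 0.25      # the commenter's "ride some odds (25%)"

total_cost = sum(costs.values())
expected_winnings = prize * win_probability
expected_profit = expected_winnings - total_cost

print(f"cost ${total_cost:,}, expected win ${expected_winnings:,.0f}, "
      f"expected profit ${expected_profit:,.0f}")
```

The $12.5K is an expected value, not a payout: three times out of four the $6K is simply gone.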
ASpIRE? (Score:2)
This is horrific (Score:2)
Where do people who would do this come from? Is it child abuse?