YouTube Makes Captioning Available To All
adeelarshad82 writes "Google's YouTube announced that it has moved its automatic speech-recognition and closed-captioning technology out of beta and has now made it available to the YouTube community at large. Most, if not all, YouTube videos now include a 'CC' button that, if pressed, will automatically generate the closed-captioning technology. The technology processes the audio feed using the speech-recognition technology used in the core voice search feature that has also been built into the Android voice search feature, the GOOG-411 phone search, and other products."
As long as they don't use GVoice Tech. (Score:5, Insightful)
Hey glum, Jen tonight. It's apologize for it, interrupting our conversation in early as this afternoon, yes, so I wanted to returning your call and you know check in with you further. Alright, hope you, I hope you're doing well done. Sounded like you, works but alright. Well I'll call me later. I'll talk to you soon. Bye.
Re: (Score:1)
"Hey bottoms ASAP. But on the religious anyways, call me back and I'm at my just call me back. Thank you."
"Hey what's going on man, this is in 2 mother and anything any cool commands. We have some. Please let me stop you have a cable is nothing important. Bye. "
"Hey Todd, on a bit. The Negro, then I put in an active on plan and payPal okay called up and have a couple (630) 440-6809. Okay bye. "
Re:As long as they don't use GVoice Tech. (Score:4, Funny)
My funniest one:
"Hello voice subscriber what. Hey if you few questions for you. They can feel me 6 like a year like 2 years ago to like forever. Go you came over and I was locked out of the password didnt know the password so much and we wanted. Anybody passed it. I don't know how you guys have a good i just took it out for the first time in years and it says your class is expired. I must be changed and I go to that the windows X P professional you went and dollar dishing whatever it is really old addition, windows 85,001 yet and it's give me a change. Faster screen and says, administrative, which is still around. Funny has got hold us for new password. I confirm you got through. I've any idea what the password again, 30, or if you're more than the who knows no idea what it would've been so if you tell me but sister for you know the next week, otherwise, I was gonna go out to confirm for some a long time, so if you should come pick the and a case."
Re: (Score:2)
How about you leave a voice message just reading that text? What's the result? Maybe it's some kind of "encryption" like ROT-13 but for voice messages. ;-)
Re:As long as they don't use GVoice Tech. (Score:4, Funny)
Meanwhile, speech recognition still fails, and Google Voice is just the world's best demonstration of why
Re: (Score:3, Funny)
I know people for whom the examples in this thread would be accurate transcriptions...
Re: (Score:2)
I doubt it'd make any difference.
Speech recognition technology is really still in its infancy... it's possible to get good results but only under the most controlled of circumstances... high quality microphone, no background noise, clear diction, recognition engine trained for the speaker, etc. Even then it may depend on what you're actually saying, since in the case of any ambiguity a smart recognition engine will fall back to grammatical analysis and word frequency counts etc to try to guess right.
The rea
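The "word frequency counts" fallback mentioned above can be sketched in a few lines of Python. This is only a toy illustration: the bigram counts are invented, and `plausibility` and `pick_transcription` are hypothetical helper names, not part of any real recognizer. Given two candidate transcriptions that sound alike, it picks the one whose word pairs are more common.

```python
import math
from collections import Counter

# Toy bigram counts standing in for statistics from a large text corpus.
# The numbers are invented for illustration only.
BIGRAM_COUNTS = Counter({
    ("to", "recognize"): 30,
    ("recognize", "speech"): 50,
    ("to", "wreck"): 2,
    ("wreck", "a"): 3,
    ("a", "nice"): 40,
    ("nice", "beach"): 8,
})


def plausibility(words):
    """Mean log of add-one-smoothed bigram counts; higher is more plausible."""
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(math.log(BIGRAM_COUNTS[p] + 1) for p in pairs) / len(pairs)


def pick_transcription(candidates):
    """Return the candidate word sequence with the highest plausibility."""
    return max(candidates, key=plausibility)


# Two acoustically similar guesses for the same audio.
candidates = [
    "how to wreck a nice beach".split(),
    "how to recognize speech".split(),
]
print(" ".join(pick_transcription(candidates)))
```

Averaging over the number of word pairs keeps longer candidates from winning just because they contain more pairs.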
Re: (Score:1)
Google Translate has for the last couple of years been based on what is essentially database lookup rather than the traditional grammatical/semantic analysis used by other translators such as Babel Fish. When they made the switch the quality noticeably improved.
Basically they've got a huge database of snippets of language and their corresponding translations in different languages that was originally built from hand-translated sources such as publicly available United Nations documents, etc. When they translat
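As a rough picture of what "database lookup" of snippets could look like, here is a minimal Python sketch. It assumes a tiny hand-made phrase table and a greedy longest-match strategy. Both are illustrative assumptions, not a description of how Google Translate actually works, and `translate` is a hypothetical helper.

```python
# Toy phrase table: source-language snippets mapped to stored translations.
# The entries are invented for illustration.
PHRASE_TABLE = {
    ("good", "morning"): "bonjour",
    ("thank", "you"): "merci",
    ("thank", "you", "very", "much"): "merci beaucoup",
    ("everyone",): "tout le monde",
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def translate(sentence):
    """Greedy longest-match lookup; unknown words pass through unchanged."""
    words = sentence.lower().split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest snippet starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            snippet = tuple(words[i:i + n])
            if snippet in PHRASE_TABLE:
                out.append(PHRASE_TABLE[snippet])
                i += n
                break
        else:
            out.append(words[i])  # no snippet matched this word
            i += 1
    return " ".join(out)


print(translate("Good morning everyone thank you very much"))
```

Preferring the longest stored snippet is what lets "thank you very much" come out as one idiomatic phrase instead of being pieced together word by word.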
Re: (Score:2)
Ok, it's not MUCH like that, but it's enough to give me a few ideas. Speech recognition, data compression, and AI have a lot in common; they're all bottlenecked at the same point.
In theory, I wonder if an effective (lossy) text compression could be created that st
Re: (Score:2)
It doesn't matter. I just checked out a couple of high-quality videos with a normal person speaking English without background noise... it was a jumbled mess of garbage. Another fine Google production.
Re:As long as they don't use GVoice Tech. (Score:5, Informative)
Re: (Score:3, Interesting)
Some of the words hello and bye were dark, the rest were mostly light gray.
What
Re: (Score:2)
I'm never inviting you to my parties again.
Re: (Score:1, Interesting)
Isn’t that essentially what modem negotiation actually is? The two modems talking to each other, saying “hello” at length?
My goodness. It’s alive, and it can understand V.34...
Re: (Score:2)
Hey, how are you?
Not much going on. This new "Exchange Server" is such an asshole I wish he dies!
Yeah I know what you're sayin.. I think they're gonna throw me away soon:(
Oh well...here's the fax anyway. Hope to hear from you soon..Bye!
bip-bip bip bip bip bip-bip....
bib bip bip-bip...
Re: (Score:2)
> I wonder what accounts for the difference.
Some people sound that way on my answering machine (and others come across that way in person).
Re: (Score:2)
Except one friend, with a Texan accent, who usually is closer to 50% accurate.
Of course if you live in Texas and get called by mostly people with Texan accents you get 50% accuracy.
Re: (Score:2)
I wouldn't say the transcripts have been 99% accurate word for word for me, but I can almost always get the meaning. The one exception being a friend with a speech impediment.
The YouTube transcripts are pretty much useless from what I can tell.
Re: (Score:2)
Google Voice Voicemail Transcriptions! Now with Mad Gab [wikipedia.org] embedded puzzles!
The once and future Deaf accessible internet. (Score:2, Informative)
Huzzah! Now if we can just get subtitling/captioning on Netflix streams, the net will be accessible to the Deaf again.
Re:The once and future Deaf accessible internet. (Score:4, Insightful)
I almost never turn on my speakers and yet I find the internet quite accessible.
I'm not saying this isn't a great development. But to try to portray the internet as inaccessible to the deaf before now is ridiculous.
Netflix needs to get away from SilverDimGlow (Score:1)
This is why Google rocks ...and M$'s tarnished SilverDimGlow does not.
Strongly wish Netflix would realign themselves to use a YouTube-like setup instead, but I strongly suspect M$ either threw them 'an offer they could not refuse', or this will become yet another mutual lock-in, like Intel_M$.
(Really irritated that I cannot, yet, watch Netflix from my Debian machines.)
Not only that (Score:1, Interesting)
They also changed the way videos are sent to the browser, many flash video players are failing because of that.
Re: (Score:2)
Wrong [opera.com].
Re: (Score:2)
From your second link:
So, they *can* be implemented using Javascript - You don't need any kind of plugins for that. And if you had read my link, you would see a link with an example of that [opera.com], which even provides you to a selection
Automatically generate the technology? (Score:5, Funny)
Talk about advanced! Back in my day, we had to pay engineers to generate technology for us!
Re: (Score:3, Funny)
Feeling feeling = Feeling.getFeeling(Feeling.LAUGHTER);
feeling.express();
Re: (Score:1)
And here I thought sprinkling 'self.' throughout my Python classes made me egotistical...
Re: (Score:2)
I can sell you a UML modeller which will do that. Just $100k per license. Believe me, it's cheap at the price. Let me demonstrate how you refactor the code. Just drag this little icon from here to here and the other little icons reorganise themselves around it. Buy this and you will never have to hire an engineer again!
Re: (Score:2)
I pay technology to generate engineers, you insensitive clod!
Re: (Score:3, Funny)
No! I’m not from Soviet Russia!
Notable, but still very much experimental (Score:4, Informative)
However, it's a technology that is still relatively young. One hopes that applying it to Youtube will help Google improve the accuracy.
However, except for spoken videos with a native English speaker with absolutely no background noise, it's nothing more than a novelty at this point. Trying this on several videos not only yielded hilarious results, but delays of several seconds in some cases.
Re:Notable, but still very much experimental (Score:5, Interesting)
This. If they allow for corrections, it could be an incredibly huge resource of data for Google. They'd end up with people spending millions of man-hours teaching Google how to do voice recognition. And having highly accurate voice recognition would be a boon for society generally.
Re:Notable, but still very much experimental (Score:4, Insightful)
And then some company will come along and sue them for being anti-competitive because they have access to all this great data to make fantastic products other companies can't make.
Re: (Score:1)
Poor, poor Microsoft crying and complaining when they get punished for breaking the law.
Re: (Score:2)
I'd be the first to agree, but then I saw Microsoft's attempt at voice recognition and it's just as poor.
There need to be significant improvements across the board before this stuff works properly. Sadly, I think it's still got a long way to go.
Accents play a big part, as does the rate at which people speak and run words together. You can tell YouTube's voice recognition is good, but it doesn't keep up in those areas at all.
Interactive Transcripts vs. Captions (Score:2, Insightful)
Re: Interactive Transcripts vs. Captions (Score:1)
I can imagine Google would cache intermediate results, possibly improve those results from time to time, and create a good coupling to its own search engine. Other search engines might have to 'distill' searchable text from the video (=difficult?), so that Google can search YouTube video content better than other search engines? Just a guess, FWIW.
CC this... (Score:5, Funny)
Re: (Score:1)
Or this: http://www.youtube.com/watch?v=jH8gtrD4_C4 [youtube.com]
Re: (Score:1)
I was disappointed to see they don't have it for this: http://www.youtube.com/watch?v=t6FUR_nhGX8 [youtube.com]
(Seriously though - after searching through many videos, I've yet to find a single one that does have the option, other than one that someone posted above. "Most, if not all"? "All" is clearly not true, and it's hard to see justification for the "most", unless I'm being very unlucky in my search...)
Search? (Score:2, Insightful)
I haven't seen any mention of search, which seems odd. Google is adding captions to every YouTube video, and nobody is interested in whether you'll be able to search the captions or not? Seems to me like it could be quite useful to search the captions of every video on YouTube.
Re: (Score:1)
YouTube captions have been searchable since shortly after they were introduced.
Re: (Score:3, Informative)
Indeed; here's an example search showing caption results [youtube.com]. I'm just surprised that, of the several articles "covering" this story that I've seen, none have mentioned (even in passing) the applicability of universal captioning to search.
Re: (Score:2)
I think "all" is quite an exaggeration too. When looking for all videos with "a" in it (should be a lot) I get 283,000 results, while it normally results in "millions".
The search queries:
http://www.youtube.com/results?search_type=videos&closed_captions=1&uni=3&suggested_categories=10,24,1,15,25,28&search_query=a [youtube.com]
http://www.youtube.com/results?search_query=a&search_type=&aq=f [youtube.com]
All yore soup tittles Arnie belong two arse. (Score:1)
Wish commercial TV stations would use this tech! (Score:2, Interesting)
Wish this technology would be used by TV stations to provide 'sort of' subtitling for programs that don't have any. This could be helpful for deaf/hearing impaired viewers.
Where I live (Netherlands), there's a few public TV channels. Most programs on there are subtitled using a dedicated teletext page (888). For the bulk of commercial channels, there's also subtitles for things like prime time movies, and specific (popular) TV shows. But a lot of it is not, like average day time shows / late night docume
Re: (Score:2)
Proper subtitling needs humans, but come on, be honest. How much manpower does it actually require to subtitle something?
If it's your native language, it's a matter of timing. Little else. If you're paying someone to be on the clock depending on the length of the program it might take anywhere from 30 minutes to a day for a long program. How much is a day's wages for even the lowest of budget infomercials?
If you're translating, you're probably not translating something new, and that means there are likely alre
Subtitling live TV (Score:2)
If you're paying someone to be on the clock depending on the length of the program it might take anywhere from 30 minutes to a day for a long program.
If the captioning takes longer than the program, you have to do it in advance. This rules out captioning news, sports, entertainment awards, and other live programs.
Re: (Score:2)
Not really. Most live things are actually shown on a tape delay. CC already exists for the news. But live programming is usually less concerned with exact timing, and it's often a constant stream of words, like with the news. I'm talking more about subbing a two-hour movie and spending time making sure the captions line up perfectly with the dialog. It can be a tedious process. With a live program you just need someone who can type fast and accurately, with a slight tape delay to check for any crazy mistakes.
Go to youtube RIGHT NOW for some laughs...for now (Score:2)
Re: (Score:2)
Take a look at this board game review:
http://www.youtube.com/watch?v=Uv6pIFgfa0U [youtube.com]
His name (Tom Vasel) appears to be consistently translated "oh come on now". What, don't they believe that's his name? He comes with surprising revelations such as "I'll be your next president" and wonderful nonsense like "but it is a ten-year period deduction gay".
About as good as I expected (Score:2)
Which is to say, pretty darned feeble. Clever work, but basically rubbish when compared to user expectation.
One of my favourite videos is this one (http://www.youtube.com/watch?v=yYAw79386WI [youtube.com]), dating from the '30s, about how differential gears work. The voice-over is that beautifully clear, precise American newsreader accent of the period, and there isn't any background music to confuse things. If anything should be a perfect candidate for a computer to analyse, it's this.
But the captions are worse than I'd
Re: (Score:2)
I think the solution is to let people submit corrections for the automatically generated subtitles.
This way we'll get a starting point, so the problem becomes more simple.
I am now trying to write the subtitles for one of my lectures, and I find it very very tiring and difficult. The greatest problem for me is in synchronizing audio with text - I have to manually indicate in which time period a particular text needs to be shown.
In other words, the bottleneck is not in figuring out what the words are, it is i
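To show what "manually indicating the time period" amounts to in practice, here is a small Python sketch that turns hand-entered (start, end, text) cues into SubRip (SRT) text. SRT is assumed here only because it is a common plain-text subtitle format; `srt_timestamp` and `to_srt` are hypothetical helper names and the cue text is made up.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated from each other by a blank line.
    return "\n".join(blocks)


# Made-up cues: every start and end time has to be found by ear.
cues = [
    (0.0, 2.5, "Hello, and welcome to the lecture."),
    (2.5, 6.0, "Today we start with a short review."),
]
print(to_srt(cues))
```

The formatting is trivial. The tedious part is finding the start and end numbers for each cue, which is exactly where an automatically generated first pass would save time.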
Re: (Score:1)
letting people submit corrections will work great until /b/ discovers it. Then every other caption will be "jews did 911" and "never gonna give you up". Remember Bucket the chatbot?
Re:I don't have the captions (Score:2)
I tried the video mentioned here, but it just tells me "Captions are not available". Strange.
Is it because I'm in Europe?
Because I use Firefox on Linux?
The video mentioned a few posts before that is even weirder: it seems to have captions, I can turn them on, but no captions are displayed.
"automatically generate the technology" (Score:2)
http://www.youtube.com/results?search_query=buzzword+bingo&search_type=&aq=2&oq=buzzwor [youtube.com] seems appropriate here.
Can they combine this with lip reading? (Score:2)
Could you combine this with the lip reading technology that was introduced to allow "voiceless" cell phone calls? http://www.ubergizmo.com/15/archives/2010/03/lip_reading_technology_unveiled.html [ubergizmo.com] Wouldn't that improve the accuracy for those scenes where the speaker's mouth is visible?
Or how about using the subtitle tracks that are in a different language and reverse translating them to provide additional clues as to what the speaker might have been saying? It might help a little.
Which? (Score:2)
It's a Markov chain (Score:2)
Oh goody. (Score:1)
Let me guess, Youtube.ru (Score:5, Funny)
Re: (Score:1)
reads the caption and then produces the video?
Actually, a rather obvious extension to this technology would be to feed the captions to a machine translator and a text-to-speech synthesizer to produce e.g. Russian voice for a video for those Russians who don't comprehend spoken or written English.
Really? most? (Score:2)
The first 10 videos I've been to don't include it. Including suggested and front page vids.
Is this a metric most?
Re: (Score:2)
oh wait.. just found one.
What a train wreck. Cheers, Google, on yet another amazing product.
Here is what is actually said:
Hey Everyone So a lot of you may know that the Vancouver 2010 Winter Olympics are coming up
and here is the transcribed audio:
Everyone felt like a man of the I think every time he's had a winter olympics are coming
Just fantastic..wow..
This is certainly front page worthy.
I'm going to roll out a different product.
Basically the system will try to guess (not very accurately) how many words are
Good timing (Score:1)
This is excellent timing; I clicked on the link to a video on the previous /. story but my sound was not working. I thought, "man, I wish more videos were closed-captioned," not just for lazy people like me but also for the hearing impaired.
Finally it'll be easier for me to share these videos with my deaf and hard-of-hearing friends!
- RG>
Hitler Parodies the easy way (Score:2)
I like the "CC" feature... it makes it very simple to do those Hitler Downfall parodies... but I was surprised that I was the first to actually make one using the feature [youtube.com]. My video features closed captions for both the original German-to-English translation, and a Lost parody script. I also provide a handy download to a text-editable SRT file so others can make their own (does that make me a bad person?).
The nice thing is that you can add as many subtitle files as you like... and give each of them separate
Re: (Score:2)
On a side note, I see that YouTube has not gotten to any of my videos with this "automagic" speech recognition-generated closed captions. I was hoping they would try and make one for this video of mine [youtube.com], just to see what it generated.
Might mean videos could be searchable by content (Score:1)
An interesting upside to all this might be that, if Google keeps the dialog from YouTube content in their searchable database, people may soon be able to search for videos by content.
Right now, I believe keywords need to be added by hand, but the auto-captioning would remove that barrier, perhaps.
"Here's looking at you, kid."
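Searching videos by what is said in them could, in the simplest case, be an inverted index over caption text. The Python sketch below is only an illustration under that assumption: the video IDs and captions are invented, and `build_index` and `search` are hypothetical names. It says nothing about how YouTube's search is actually built.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(captions):
    """Map each word to the set of video IDs whose captions contain it."""
    index = defaultdict(set)
    for video_id, text in captions.items():
        for word in tokenize(text):
            index[word].add(video_id)
    return index


def search(index, query):
    """Return the IDs of videos whose captions contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Invented caption data keyed by made-up video IDs.
captions = {
    "vid1": "Here's looking at you, kid.",
    "vid2": "Looking for a good board game?",
    "vid3": "Here is how differential gears work.",
}
index = build_index(captions)
print(search(index, "Here's looking at you, kid"))
```

This matches words in any order rather than as an exact phrase, and it is only as good as the captions it indexes, so recognition errors turn directly into missed or bogus hits.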
2 Girls and 1 Cup Reactions (Score:2)
Does this mean they can now enjoy the 2 Girls and 1 Cup Reaction videos?
http://www.youtube.com/watch?v=ggaWaK5d23Y [youtube.com]
Is this Gaudi? (Score:2)
Now easier to catch unwanted content (Score:2, Interesting)
Soon (now?) they can generate captions of everything heard (or sung) in a video immediately after upload and match the captions against lyrics and transcriptions of copyrighted works or even just search them for specific keywords. Then they can flag those videos as possible copyright violations or even prevent them from being displayed until after being reviewed by someone.
I'm not saying captioning isn't a good idea, only that it can be used for more than just assisting the hard of hearing.
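One simple way to picture that kind of matching is to compare word n-grams from a caption against reference texts and flag high overlap for human review. The Python sketch below is a guess at the general idea, not a description of any real system: the function names and the 0.5 threshold are arbitrary assumptions, and the references are public-domain rhymes.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(caption, reference, n=3):
    """Fraction of the reference's n-grams that also occur in the caption."""
    ref = shingles(reference, n)
    if not ref:
        return 0.0
    return len(ref & shingles(caption, n)) / len(ref)


def flag_for_review(caption, references, threshold=0.5):
    """List titles whose text overlaps the caption at or above the threshold."""
    return [
        title
        for title, text in references.items()
        if containment(caption, text) >= threshold
    ]


# Public-domain rhymes stand in for a database of protected works.
references = {
    "Twinkle Twinkle Little Star": "twinkle twinkle little star how I wonder what you are",
    "Row Row Row Your Boat": "row row row your boat gently down the stream",
}
caption = "and now twinkle twinkle little star how i wonder what you are thanks for watching"
print(flag_for_review(caption, references))
```

Exact n-gram matching breaks down quickly when the transcript is as garbled as the examples in this thread, so a real matcher would need to tolerate recognition errors.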