Next-Gen Low-Latency Open Codec Beats HE-AAC
Aldenissin writes "From the Xiph.org developers, Opus is a non-patent-encumbered codec designed for interactive uses that require very low latency, such as VoIP, telepresence, and remote jamming. When they started working on Opus (then known as CELT), they used the slogan 'Why can't your telephone sound as good as your stereo?', and they weren't kidding. Hydrogenaudio conducted a 64kbit/sec multiformat listening test including Opus, aoTuV Vorbis, two HE-AAC encoders, and a 48kbit/sec AAC-LC low anchor, comparing 30 diverse samples using the highly sensitive ABC/HR methodology. The results show Opus beating two of the most popular and respected encoders for HE-AAC, one of the strongest (but highest-latency) formats at this bitrate: Opus scored higher on the majority of individual audio samples and received a higher average score overall. In the test, Opus was running with 22.5ms of total latency, but the codec can go as low as 5ms."
Next level beats (Score:4, Funny)
This will be perfect for my next level beats.
Re:Next level beats (Score:5, Insightful)
Perhaps they could switch to "has not yet been challenged in court for any possible patent infringement". But who would use a codec like that? Besides Google, of course.
Companies do this all the time. Anyone shipping H.264 has this risk, as the patent pool provides zero guarantee no outside patents will pop up.
Actually, anyone shipping anything at all has this risk.
Realistically, it's more like "does not infringe any known patents, or has licenses for them, and is not infringing any other patents that we could find in a patent search".
Re: (Score:2)
Everyone. There might well be patents on H264 that folks outside the MPEG-LA hold. Software patents are a minefield for everyone involved.
And this 'SILK' codec? (Score:1)
Patent free? Or royalty free?
Re:And this 'SILK' codec? (Score:5, Informative)
To be exact, there *are* patents, but they will be available without fee in a way that is compatible with FOSS licences such as the GPL. The main idea behind these patents is that your license terminates if you sue someone by claiming Opus infringes your patents. Almost like a copyleft, but for patents (of course the details are different because copyright != patent).
Re:And this 'SILK' codec? (Score:4, Informative)
This is the license for the "old" SILK codec. The patent licenses for Opus have nothing to do with that. Please read them:
Xiph.Org IPR statement: https://datatracker.ietf.org/ipr/1524/ [ietf.org]
Broadcom IPR statement: https://datatracker.ietf.org/ipr/1526/ [ietf.org]
Skype IPR statement: https://datatracker.ietf.org/ipr/1525/ [ietf.org]
Re:And this 'SILK' codec? (Score:5, Informative)
What makes you say that? If you find a real issue, please raise it -- either on the mailing list: codec@ietf.org, or to me privately (jmvalin@jmvalin.ca). Skype is on the good side on this one. The technology they have contributed is very useful and they're open about resolving any licensing issue.
Re: (Score:2)
validates patents? Patents are already valid in law. I don't really understand your position because what you're saying is that we should ignore the law where valid. You could make a similar claim about being against GPL because it validates copyright. It's a silly position to take that puts you outside reality.
Re: (Score:2)
As someone stated, it validates only in the way that GPL does. Without a patent, then it could be "patented" under your nose. Can you fight it and win? Probably, but who wants to waste resources fighting? Refusing to play the game only works if you're not already in it. You can't stop in the middle, or you de facto forfeit/lose.
remote jamming? (Score:5, Informative)
and remote jamming
Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.
Re: (Score:2)
Re: (Score:1)
With raspberry jam of course, much to the lament of Dark Helmet.
Re: (Score:1)
That would have been a pretty awesome demonstration of the codec, though.
Re: (Score:2)
I thought "remote jamming" meant jamming of remotes. Which would be awesome to do, next time the neighbors channel-surf at high volume.
Re: (Score:3)
Re: (Score:2)
and remote jamming
Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.
Jamming a signal wouldn't be hard with any codec. You just have to broadcast the output on the right frequency with a sufficient power output. What you broadcast doesn't matter - whether it's the Linux Source code or the output of a codec. It's just the fact that you are actually broadcasting. ;-)
Total Latency (Score:2)
Is that 5~22.5ms of latency on top of network latency?
Re:Total Latency (Score:5, Informative)
Yes, 5 to 22.5 ms is the algorithmic delay of the codec. By comparison, codecs like AAC/MP3/Vorbis have more than 100 ms algorithmic delay (you need to give the encoder side more than 100 ms of audio before the decoder side gives you any audio back).
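As a rough illustration of where those figures come from: a frame-based codec has to buffer at least one full frame of audio, plus whatever look-ahead the encoder needs, before the decoder can produce anything. The sketch below assumes a 2.5 ms look-ahead and frame sizes from 2.5 to 20 ms, which reproduces the 5 ms and 22.5 ms numbers quoted in the summary. The function names and parameter values are illustrative assumptions, not taken from the Opus source.

```python
def algorithmic_delay_ms(frame_ms: float, lookahead_ms: float = 2.5) -> float:
    """Minimum codec delay: one buffered frame plus encoder look-ahead.

    The 2.5 ms default look-ahead is an assumption made for this sketch.
    """
    return frame_ms + lookahead_ms


def delay_in_samples(delay_ms: float, sample_rate_hz: int = 48000) -> int:
    """Convert a delay in milliseconds to a sample count at the given rate."""
    return round(delay_ms * sample_rate_hz / 1000)


# Assumed frame sizes, in milliseconds.
for frame_ms in (2.5, 5.0, 10.0, 20.0):
    delay = algorithmic_delay_ms(frame_ms)
    samples = delay_in_samples(delay)
    print(f"{frame_ms} ms frames -> {delay} ms delay ({samples} samples at 48 kHz)")
```

Network and sound-card buffering come on top of this; the figure only covers what the codec itself adds.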
And that isn't just important online (Score:3)
When you are dealing with audio signals in the home, low latency can be needed too. If you are doing something like playing prerecorded video then no: the system can find out the delays of the screen, audio, codecs, etc. and insert delays as needed to sync it all up. That does not work, however, if you are doing something live, like games. That's the reason for stuff like Dolby Digital Live and DTS Interactive: they are made so that you can get low-latency encoding, so the sound from a game console syncs up with the video.
It
Speex? (Score:2)
Re: (Score:2)
Re: (Score:3)
There are "custom modes" that can do that. With those you can go as low as 2.5 ms. The only down side is that you can't switch frame size dynamically when you use these custom modes.
Re: (Score:2)
I wonder if this is the same concept as the decoding delay in video, where there's a difference between decode time and presentation time, perhaps due to re-ordering of frames? Sometimes you can get higher quality doing this, but latency is obviously undesirable in an interactive application like telephony.
Re: (Score:2)
Yes, it's the same thing. A low-delay codec like Opus minimizes the decode-to-presentation delay, from something around 94ms for HE-AAC to 22ms (or lower) for Opus.
Again in video (say H.264) terms, this is a bit like a Baseline Profile (without delay-inducing B-frames) beating a High Profile codec.
Re: (Score:2)
If someone migrates from an analogue source the lag starts becoming apparent. Between the codec, the transmission system, and the decoder on the other side it can create a noticeable delay in a conversation. People not used to this will find themselves interrupting each other.
It's much the same as a digital radio dilemma. Analogue 2way conversations sounded quite reasonable if the transmitter and one of the receivers were in the same room. Worst case if the volume was cranked to 11 you end up with feedback,
Why users care... (Score:2)
...is at the top of the first Opus/CELT demo page:
http://people.xiph.org/~xiphmont/demo/celt/demo.html [xiph.org]
The low latency makes more interactive applications possible. By way of illustration, the total algorithmic delay of an Opus or CELT stream is approximately equivalent to the time it takes sound to travel from you to someone standing five feet away.
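For anyone who wants to check that comparison, the conversion between distance and acoustic delay is a one-liner each way. This is a minimal sketch assuming a speed of sound of 343 m/s (dry air at about 20 C); the function names are made up for illustration. With those assumptions, five feet works out to roughly 4.4 ms, close to the codec's lowest delay, while the 22.5 ms configuration corresponds to roughly 25 feet.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 C
METERS_PER_FOOT = 0.3048


def sound_travel_ms(distance_ft: float) -> float:
    """Time in milliseconds for sound to cover the given distance in feet."""
    return distance_ft * METERS_PER_FOOT / SPEED_OF_SOUND_M_PER_S * 1000.0


def equivalent_distance_ft(delay_ms: float) -> float:
    """Distance in feet that sound covers during the given delay."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S / METERS_PER_FOOT


print(f"5 ft of air      ~ {sound_travel_ms(5.0):.1f} ms")
print(f"5 ms of delay    ~ {equivalent_distance_ft(5.0):.1f} ft")
print(f"22.5 ms of delay ~ {equivalent_distance_ft(22.5):.1f} ft")
```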
Re: (Score:2)
That is a good illustration, thanks.
That's all fine and dandy, but.... (Score:3, Insightful)
Who cares what codec is being used for my VoIP phone at home or on my desk, when anyone I call is still most likely to be connected over the PSTN with g.711 or g.723, or (far worse) a cell phone?
And don't get me wrong: I want to care; I really do. And maybe I did care, at one point. I was going to build an Asterisk system for home -- I even collected some of the hardware to make it work.
But I stopped caring when the boy got old enough to properly want a cell phone, the wife got a cell phone, and I had a cell phone. After that, I dropped the home phone line altogether, since it was just a waste of money.
I have no interest, at this moment, in having any sort of telephony tied to my premises.
And while I could, I suppose, run some manner of VoIP client on my Droid over cellular, I think that's a complete non-starter at the moment: I had trouble earlier today getting a 64kbps MP3 to stream correctly over 3G Verizon (even though I controlled both ends of the stream), but that was just an inconvenience.
It'd be a lot more than simply inconvenient if my phone calls were that spotty. I don't care how good it sounds if it doesn't work.
Is there any good and practical use for this new codec?
Re:That's all fine and dandy, but.... (Score:5, Insightful)
Re: (Score:2)
I think my point was more that it is currently seldom worth using a new codec, since the folks in the middle are using old codecs. And when I say old, I mean it: Many decades old, in some cases.
I can feed pristine 96KHz 24-bit audio into the PSTN, and still will never get anything better than g.711 out of the other end, because it gets ruined in the middle.
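To put numbers on that: the bitrate of uncompressed PCM is just sample rate times bit depth times channel count. The sketch below compares the 96 kHz / 24-bit source mentioned above with G.711, assuming G.711's usual 8 kHz sampling at 8 bits per sample and a mono signal in both cases. The function names are hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Raw PCM bitrate in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def max_audio_bandwidth_hz(sample_rate_hz: int) -> float:
    """Nyquist limit: the highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


source_kbps = pcm_bitrate_kbps(96_000, 24)  # the "pristine" input, mono assumed
g711_kbps = pcm_bitrate_kbps(8_000, 8)      # G.711: 8 kHz, 8 bits per sample

print(f"96 kHz / 24-bit mono: {source_kbps} kbit/s")
print(f"G.711:                {g711_kbps} kbit/s")
print(f"Reduction factor:     {source_kbps / g711_kbps}x")
print(f"G.711 Nyquist limit:  {max_audio_bandwidth_hz(8_000)} Hz")
```

Whatever goes in, an 8 kHz sample rate cannot carry anything above 4 kHz, which is the commenter's point about the middle of the chain setting the ceiling.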
Re: (Score:2)
You should try getting a mobile with wifi built in. Also very useful abroad to make VoIP calls from your mobile to avoid exorbitant roaming charges. Eventually you'll be able to get home femto cells so you can direct your normal cell calls via your asterisk box so don't throw away that hardware just yet.
By the way, I think you are wrong. It gets ruined at the end link, not in the middle. In the middle there are just IP packets being passed along trunk lines; there is no codec.
Phillip.
Re: (Score:3)
So, it's something that might be useful for musicians. Maybe.
100ms of total, round-trip, end-to-end-to-end latency (remember to count both hypothetical DSL connections) is the same as two musicians trying to play together when they are about 56 feet apart. It might be practical, but it doesn't sound very fun for many types of informal "jam"-oriented music: There's a reason the bass player often stands next to the drummer, and it's usually not because he wants more hearing damage.
I just listened to some B
Re: (Score:2)
Bad example. Your ears are made to recognize time differences between left and right. Overall latency is much less annoying.
Good example. It lets me hear things as I might if I were a musician, playing with someone else. Perhaps you're not familiar enough with the Beatles to understand why I chose them for this listening experiment: Much of their earlier music has instruments and vocals panned either hard left or hard right, with little or nothing centered at all. It's the closest thing to a raw multitra
Re: (Score:2)
The solution to that is to quit paying for "unlimited
Re: (Score:2)
This is absolutely important.
There are a lot of PSTN providers using g711 or g723. Quite a number of them offer g729.
The big difference here is:
1) Not being limited to an 8 kHz sampling rate, which sucks
2) Not being limited by the phone manufacturers to g729 for the better bandwidth usage
3) It HANDLES MUSIC.
If you are using g729 in a system hold music is horrible especially when you are trying to use your own or a radio. This is because g729 works best on speech not music. Music sounds messed up when transcod
Re: (Score:2)
Re: (Score:2)
Is there any good and practical use for this new codec?
Yes. Live audio applications such as digital radio mics. Before, the only viable option was ADPCM, which slew-rate limits horribly and sounds awful. Either that or find enough RF bandwidth to send uncompressed PCM. For live applications a delay of about 3ms is needed, or drummers start playing out of time, etc. If it can be tweaked to less than 5ms then it's got a future in this application.
Re: (Score:2)
Good example.
And to think that for all this time, I've been giving guitarists two choices: Either plug in with a real wire, or prepare to be strangled with that cheap-shit wireless kit.
Sometimes they plug in, and other times the show gets delayed while we hunt around looking for enough air to blow up a backup guitar player. (It's their head that's the problem -- it takes forever to inflate it to the correct size.)
Perhaps this new codec will help save a guitarist.
Re: (Score:3)
Massive online game chat: WoW, Mumble/Murmur...
As if WoW is the most latency-sensitive thing in the world. (It's not.)
Free WiFi audio/video telephony: Ekiga...
Ok, sure. It doesn't improve my life at all (with the stated constraints about what I think I should care about), but why not. I guess it does this one thing that folks have already been doing, and has a chance at sounding better in the process.
(I can't be disagreeable all the time, and hey, at least I learned about new unpronounceable open-source
ok but how is dtmf detection? (Score:2)
Re:ok but how is dtmf detection? (Score:4, Interesting)
Re: (Score:2)
Dual-Tongued Female Mutants
Forget Wikipedia. I'm registering dtfm.xxx.
Re: (Score:2)
It's DTMF, Dual-tone multi-frequency signaling. It's the different sounds you hear if you hit the dial keys on your phone.
Technically... (Score:3)
...it can't have been "then known as CELT" since it is a merge of two codecs of which CELT is one and SILK is the other. It's good that it's an IETF standard as that will help some with adoption. It will also help some with getting other implementations. (Hell, Dirac is a great codec for video but because it's not a recognized standard for anything it's not getting used.)
Not so— (Score:3, Informative)
Skype will release their patents under a free software compatible license if the codec is standardized by the IETF: https://datatracker.ietf.org/ipr/1525/
Re: (Score:2)
Mod parent up please! Definitely noteworthy...
Re: (Score:2)
Re: (Score:2)
I would guess that this codec is very likely to gain hardware support, particularly as it's being standardized by the IETF *and* Broadcom seems to have been contributing to the development of it.
This would mean that during Skype calls, the decoding no longer has to be done in software, so battery time when using Skype on a mobile device could increase a lot.
I'm just guessing!
So? (Score:2)
Re: (Score:2)
No, it's "open codec designed for voip is slightly better for *music* than closed codec designed for music". In other words, this test showed Opus (slightly) beating HE-AAC at what HE-AAC does best. Testing on a voip scenario would have been pointless because of the huge delay HE-AAC has.
Re: (Score:2)
Am I misunderstanding, or is the headline "open codec designed for voip is slightly better for voip than closed codec designed for music"? How does it compare to the other voip codecs?
The test wasn't about VoIP but about music. The VoIP codec beat the others at their game.
That's all fine and dandy, but... (Score:2)
What about lower bitrates? HE-AAC is designed for low-bitrate audio, and 64 Kbps is right on the outside edge of where HE-AAC is useful. 24-32 Kbps is where HE-AAC really shines, and that's where stuff really gets impressive.
64kbps = standard (Score:2)
Cell phones, ISDN, and the like operate at 64kbps.
Most users' DSL lines have plenty more than 64kbps in both directions, so 64kbps is also a safe bet for VoIP applications.
If Hydrogenaudio wants to prove that this codec is a good replacement for the codecs currently used in phones, it has to be tested at the bandwidth usually associated with phones.
Re: (Score:2)
Umm, you're pulling that out of your ass. I don't think any modern cellphone system uses 64Kbps.
GSM uses 5.6Kbps for half-rate, 13 Kbps for full rate, and 12.2 Kbps for enhanced full rate. Most other voice codecs operate in that range, although they can usually do far better than the 3.1 KHz that you get from GSM.
ISDN at 64Kbps is irrelevant; that's the data rate, and if you're trying to run a VoIP system over there, unless you need fax support (there are better alternatives), then you're not dedicating you
Re: (Score:2)
I've done so, extensively. The results at 24Kbps and 32Kbps are very impressive. I'm not claiming that they're CD-quality by any stretch, only that it's a great leap forward over competing codecs. The idea that a dialup internet user can stream audio with that kind of quality is impressive, and for internet radio broadcasters in a bandwidth/budget crunch, it can be a lifesaver, especially if it's primarily talk radio.
Laterncy (Score:2)
I tried to comment on this a while ago but the latency messed me up.
Re: (Score:1)
...in your opinion.
I dunno (Score:1)
I think HE-MAN is better than HE-ACC.
Re: (Score:2)
Ah damnit, I was gonna say it was powered by The Power of Greyskull.
Re:HE-AAC is worse than LE-AAC in terms of quality (Score:5, Insightful)
While your rant appears informative if not insightful on its face, it is completely missing the point.
This is a test of audio codecs at low bitrates.
I don't know what this "LE-AAC" is you speak of (and rather suspect you don't either) but AAC-LC was actually in this test, as the low anchor.
At these bitrates (~64kbps) HE-AAC (despite its "low-accuracy" as you put it) is perceptually better sounding than AAC-LC. Lossy audio codecs (even the LE-AAC [sic] encoder in Apple's Core Audio framework you love) can only be judged by how they sound, not how they look. "Accuracy" is not a metric very worthy of discussion.
Re: (Score:3, Informative)
Re: (Score:2)
Re: (Score:2)
Is that you NBCA?
Re: (Score:2)
Re:HE-AAC is worse than LE-AAC in terms of quality (Score:4, Interesting)
Anyway, if this really IS an improvement over HE-AAC, which uses some very advanced techniques, I'll be extremely impressed, and quite pleased that it's patent free.
Re:HE-AAC is worse than LE-AAC in terms of quality (Score:4, Interesting)
The sad thing is it shouldn't be better than HE-AAC. Being low latency does tend to mean one is better at the kind of time-domain issues many find so objectionable, but outside that OPUS is really packing a MUCH smaller toolkit than HE-AAC.
This is really egg on AAC's face, IMHO, and quite the upset. OPUS is so immature the bitstream isn't even stable yet.
Re: (Score:2)
Re: (Score:2)
Re:HE-AAC is worse than LE-AAC in terms of quality (Score:5, Interesting)
If we were talking about a 96 kb/s test, I'd agree with you. But at 64 kb/s, HE-AAC sounds much better than AAC-LC. The guys who organized this test picked the best AAC implementation they could find at the rate the test was run at.
Re: (Score:3)
Yes, but for digitally re-un-non-illossless compression I would go with the Foobar Audio Framework.
Re: (Score:2)
Parent post is complete bullshit. HE-AAC greatly outperforms LC-AAC at 64kbps. This can be seen in several previous listening tests, including the ITU ones that standardized the format itself.
Re: (Score:2)
This is slashdot, where opinions bear more credibility than point-of-fact
Re: (Score:2)
Not sure what AC is talking about, seems to be what the site says it does. I downloaded a sound file. This is the first time I have heard the codec, and it does sound extremely good. I don't know much about these matters, but I liked that it was "open", and could be relevant to my interests somehow.
Re: (Score:2)
Re: (Score:3)
Even back when I used to play games online with voice chat
Imagine Rock Band with voice chat. Or imagine actually making real music with voice chat.
Re:sorry for being dense, but... (Score:5, Insightful)
Re: (Score:2)
The only practical difference between gaming VOIP and Skype is having to hit a push-to-talk button. Latency issues like people stepping on each other crop up in gaming VOIP in much the same way that they pop up in high-latency cell phone or Skype conversations.
Re: (Score:2)
For simple half-duplex systems like gaming, more lag is not really noticeable.
The only practical difference between gaming VOIP and Skype is having to hit a push-to-talk button. Latency issues like people stepping on each other crop up in gaming VOIP in much the same way that they pop up in high-latency cell phone or Skype conversations.
Not really. You're not (typically) having a back-and-forth conversation while gaming, just announcing your information and clearing the channel. So there is little difference, conversationally speaking, if your burst is delayed by half a second or so. It's not a conversation, it's a series of announcements. With noticeable lag in a phone call, however, you'll find yourself (and the caller/callee) tripping over each other's sentence beginnings as you both play the "no, after you" as the lag causes you (and y
Re: (Score:2)
Good to know, I use Mumble and was curious if it might plan to take advantage. My understanding is that Mumble is encrypted, but all the more reason for lower latency?
Re: (Score:2)
Re: (Score:2)
I have heard it both ways, and the documentation I have seen doesn't make it all that clear.
Re: (Score:2)
Right, that is the best I could find as well when I was convincing a friend who is how you say, "paranoy" it would work. (Thanks.) Surely there is something in the source that mentions it, but one shouldn't have to dig to discover a "feature", imho... I think more people would use it if it was more widely known as fact.
Re: (Score:2)
Re: (Score:2)
Ah yes, I remember seeing that before. I think I did the same as you. Good Job... *book marked*
Re: (Score:2)
Sorry about double posting, but having spoken to some people in Freenode's #mumble channel, I got some more info...
They are [ietf.org] looking at Opus.
Screenshot showing encryption [devs-on.net]. (Thanks dD0t)
Re: (Score:2)
And once more, they just updated this [sourceforge.net] page after I mentioned the confusion... so it's settled!
Re: (Score:2)
If you had RTFA, you would have seen that the actual p value was reported as 0.000 (99.99%), not merely smaller than 0.050 (95%). This even accounts for all the comparisons being performed, so the famous xkcd green jelly beans comic does not apply.
Opus is better than Vorbis (p=0.000)
Opus is better than Nero_HE-AAC (p=0.000)
Opus is better than Apple_HE-AAC (p=0.000)
You would also have seen that your "higher bitrate" comment