Next-Gen Low-Latency Open Codec Beats HE-AAC

Aldenissin writes "From the Xiph.org developers, Opus is a non-patent-encumbered codec designed for interactive uses, such as VoIP, telepresence, and remote jamming, that require very low latency. When they started working on Opus (then known as CELT), they used the slogan 'Why can't your telephone sound as good as your stereo?', and they weren't kidding. Now, test results show Opus outperforming HE-AAC, one of the strongest (but highest-latency) codecs at this bitrate: it bests two of the most popular and respected encoders for the format on the majority of individual audio samples and receives a higher average score overall. Hydrogenaudio conducted a 64kbit/sec multiformat listening test including Opus, aoTuV Vorbis, two HE-AAC encoders, and a 48kbit/sec AAC-LC low anchor, comparing 30 diverse samples using the highly sensitive ABC/HR methodology. In the test, Opus ran with 22.5ms of total latency, but the codec can go as low as 5ms."

  • by MadAhab ( 40080 ) <slasher@@@ahab...com> on Thursday April 14, 2011 @08:30PM (#35824028) Homepage Journal

    This will be perfect for my next level beats.

  • Patent free? Or royalty free?

    • by jmv ( 93421 ) on Thursday April 14, 2011 @09:01PM (#35824264) Homepage

      To be exact, there *are* patents, but they will be available without fee in a way that is compatible with FOSS licences such as the GPL. The main idea behind these patents is that your license terminates if you sue someone by claiming Opus infringes your patents. Almost like a copyleft, but for patents (of course the details are different because copyright != patent).

  • remote jamming? (Score:5, Informative)

    by mirix ( 1649853 ) on Thursday April 14, 2011 @08:37PM (#35824084)

    and remote jamming

    Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.

    • by qpqp ( 1969898 )
      Thanks, took me much less time, because of you! We're jammin'...
    • by gaelfx ( 1111115 )

      With raspberry jam of course, much to the lament of Dark Helmet.

    • by bosef1 ( 208943 )

      That would have been a pretty awesome demonstration of the codec, though.

    • by jd ( 1658 )

      I thought "remote jamming" meant jamming of remotes. Which would be awesome to do, next time the neighbors channel-surf at high volume.

    • by glwtta ( 532858 )
      I assumed they meant making preserves in an isolated or inaccessible location.
    • and remote jamming

      Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.

      Jamming a signal wouldn't be hard with any codec. You just have to broadcast the output on the right frequency with a sufficient power output. What you broadcast doesn't matter - whether it's the Linux Source code or the output of a codec. It's just the fact that you are actually broadcasting. ;-)

  • Is that 5~22.5ms of latency on top of network latency?

    • Re:Total Latency (Score:5, Informative)

      by jmv ( 93421 ) on Thursday April 14, 2011 @08:53PM (#35824198) Homepage

      Yes, 5 to 22.5 ms is the algorithmic delay of the codec. By comparison, codecs like AAC/MP3/Vorbis have more than 100 ms algorithmic delay (you need to give the encoder side more than 100 ms of audio before the decoder side gives you any audio back).

      • When you are dealing with audio signals in the home, low latency can be needed too. If you are doing something like playing prerecorded video then no, the system can find out the delays of the screen, audio, codecs, etc and insert delays as needed to sync it all up. However not if you are doing something live, like games. That's the reason for stuff like Dolby Digital Live and DTS Interactive. They are made so that you can get low latency encoding so the sound from a game console syncs up with the video.

        It

      • I'm curious what's the problem with Speex for voice transmission? (A non-rhetorical question.)
  • by adolf ( 21054 ) <flodadolf@gmail.com> on Thursday April 14, 2011 @08:52PM (#35824192) Journal

    Who cares what codec is being used for my VoIP phone at home or on my desk, when anyone I call is still most likely to be connected over the PSTN with g.711 or g.723, or (far worse) a cell phone?

    And don't get me wrong: I want to care; I really do. And maybe I did care, at one point. I was going to build an Asterisk system for home -- I even collected some of the hardware to make it work.

    But I stopped caring when the boy got old enough to properly want a cell phone, the wife got a cell phone, and I had a cell phone. After that, I dropped the home phone line altogether, since it was just a waste of money.

    I have no interest, at this moment, in having any sort of telephony tied to my premises.

    And while I could, I suppose, run some manner of VoIP client on my Droid over cellular, I think that's a complete non-starter at the moment: I had trouble earlier today getting a 64kbps MP3 to stream correctly over 3G Verizon (even though I controlled both ends of the stream), but that was just an inconvenience.

    It'd be a lot more than simply inconvenient if my phone calls were that spotty. I don't care how good it sounds if it doesn't work.

    Is there any good and practical use for this new codec?

    • by nog_lorp ( 896553 ) on Thursday April 14, 2011 @09:00PM (#35824254)
      Lol what? You're crazy. I suppose it is never worth inventing a new codec ever, since everyone uses old codecs! /fail argument
      • by adolf ( 21054 )

        I think my point was more that it is currently seldom worth using a new codec, since the folks in the middle are using old codecs. And when I say old, I mean it: Many decades old, in some cases.

        I can feed pristine 96KHz 24-bit audio into the PSTN, and still will never get anything better than g.711 out of the other end, because it gets ruined in the middle.
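To put rough numbers on that loss, here is a small sketch of the arithmetic. It assumes a single (mono) channel of uncompressed PCM on the input side and uses G.711's 8 kHz sampling with 8 bits per sample; the function name is made up for illustration.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Raw bitrate of a sampled audio stream, in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

# 96 kHz, 24-bit mono source (the mono part is an assumption).
studio_kbps = pcm_bitrate_kbps(96_000, 24)

# G.711 carries 8-bit companded samples at 8 kHz.
g711_kbps = pcm_bitrate_kbps(8_000, 8)

print(f"source: {studio_kbps:.0f} kbps, G.711: {g711_kbps:.0f} kbps, "
      f"ratio: {studio_kbps / g711_kbps:.0f}x")
```

Bitrate alone understates the damage, since the 8 kHz sampling rate also caps the audio bandwidth no matter what is fed in.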

        • by horza ( 87255 )

          You should try getting a mobile with wifi built in. Also very useful abroad to make VoIP calls from your mobile to avoid exorbitant roaming charges. Eventually you'll be able to get home femto cells so you can direct your normal cell calls via your asterisk box so don't throw away that hardware just yet.

          By the way, I think you are wrong. It gets ruined at the end link, not in the middle. In the middle there are just IP packets being passed along trunk lines; there is no codec.

          Phillip.

    • Disregarding the bandwidth your service provider may or may not provide you, VoIP clients on mobile devices are difficult or impossible to use due to the reliance of even modern VoIP protocols such as SIP on RTP which uses UDP for media transport, and every 3G provider I've ever seen deploys wide scale NATing to all their connected devices. They could make a legitimate argument for a lack of addresses, but there's kind of a conflict of interest there too.
      • But, to reply more to the parent than to you: it's not like you're just going to be using this over wireless internet; some people actually have DSL or better connections with less than 40ms of latency; at rates like that a codec latency of 4ms is still 20% of the total latency. At that kind of latency (45ms) you could play music with someone without driving yourself crazy because you both sound like you're lagging behind the beat (you WILL both appear to be lagging to each other, but 45ms is a small enough am
        • by adolf ( 21054 )

          So, it's something that might be useful for musicians. Maybe.

          100ms of total, round-trip, end-to-end-to-end latency (remember to count both hypothetical DSL connections) is the same as two musicians trying to play together when they are about 56 feet apart. It might be practical, but it doesn't sound very fun for many types of informal "jam"-oriented music: There's a reason the bass player often stands next to the drummer, and it's usually not because he wants more hearing damage.

          I just listened to some B
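The 56-foot comparison can be checked with a short sketch. It assumes sound travels at about 1125 feet per second in room-temperature air, and it treats the 100ms round trip as 50ms each way; the function name is made up for illustration.

```python
# Approximate speed of sound in air at room temperature (assumed value).
SPEED_OF_SOUND_FT_PER_S = 1125.0

def equivalent_distance_ft(one_way_latency_ms: float) -> float:
    """Distance sound covers through air during the given one-way delay."""
    return SPEED_OF_SOUND_FT_PER_S * one_way_latency_ms / 1000.0

round_trip_ms = 100.0
one_way_ms = round_trip_ms / 2  # each player hears the other 50 ms late

print(f"{one_way_ms:.0f} ms one way is like standing "
      f"{equivalent_distance_ft(one_way_ms):.0f} feet apart")
```

By the same conversion, a 5ms codec delay on its own corresponds to only a few feet of air.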

      • by tlhIngan ( 30335 )

        Disregarding the bandwidth your service provider may or may not provide you, VoIP clients on mobile devices are difficult or impossible to use due to the reliance of even modern VoIP protocols such as SIP on RTP which uses UDP for media transport, and every 3G provider I've ever seen deploys wide scale NATing to all their connected devices. They could make a legitimate argument for a lack of addresses, but there's kind of a conflict of interest there too.

        The solution to that is to quit paying for "unlimited

    • by EdIII ( 1114411 )

      This is absolutely important.

      There are a lot of PSTN providers using g711 or g723. Quite a number of them offer g729.

      The big difference here is:

      1) Not being limited to 8khz sampling rate which sucks
      2) Not being limited by the phone manufacturers to g729 for the better bandwidth usage
      3) It HANDLES MUSIC.

      If you are using g729 in a system, hold music is horrible, especially when you are trying to use your own or a radio. This is because g729 works best on speech, not music. Music sounds messed up when transcod

      • by afidel ( 530433 )
        UMTS uses a max of 12.2Kbps for voice channels, and since LTE allows seamless fallback to UMTS towers, I can't see how that changes until providers go pure LTE and remove the voice terminal class from phones (a decade from now, maybe?). Btw, the difference in bandwidth between a UMTS voice channel and the bandwidth this codec was tested at is the same as the difference between this codec and 320Kbps CBR, just to put into perspective how much bandwidth this thing is using compared to conventional cellphone calls, a
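That comparison holds as a ratio rather than as a subtraction. A quick check, using only the bitrates quoted in this thread (12.2 kbps UMTS voice, the 64 kbps test bitrate, and 320 kbps CBR); the constant names are made up for illustration.

```python
UMTS_VOICE_KBPS = 12.2      # max voice channel rate quoted above
OPUS_TEST_KBPS = 64.0       # bitrate used in the listening test
HIGH_RATE_CBR_KBPS = 320.0  # the 320 kbps CBR reference point

# How many times more bandwidth each step up uses.
test_vs_umts = OPUS_TEST_KBPS / UMTS_VOICE_KBPS
cbr_vs_test = HIGH_RATE_CBR_KBPS / OPUS_TEST_KBPS

print(f"test bitrate vs UMTS voice: {test_vs_umts:.1f}x")
print(f"320 kbps CBR vs test bitrate: {cbr_vs_test:.1f}x")
```

Both steps come out at roughly a factor of five.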
    • Is there any good and practical use for this new codec?

      Yes. Live audio applications such as digital radio mics. Before, the only viable option was ADPCM, which slew-rate limits horribly and sounds awful. Either that or find enough RF bandwidth to send uncompressed PCM. For live applications, a 3ms delay is needed or drummers start playing out of time, etc. If it can be tweaked to less than 5ms then it's got a future in this application.

      • by adolf ( 21054 )

        Good example.

        And to think that for all this time, I've been giving guitarists two choices: Either plug in with a real wire, or prepare to be strangled with that cheap-shit wireless kit.

        Sometimes they plug in, and other times the show gets delayed while we hunt around looking for enough air to blow up a backup guitar player. (It's their head that's the problem -- it takes forever to inflate it to the correct size.)

        Perhaps this new codec will help save a guitarist.

  • If the codec can't reliably do DTMF detection, then it's no good -- I'll stick with ulaw: disallow=all allow=ulaw
    • by parlancex ( 1322105 ) on Thursday April 14, 2011 @10:02PM (#35824712)
      You do realize that most modern VoIP hardware / software supports out of band DTMF? In fact, the most modern software demands it.
      • by afidel ( 530433 )
        I was about to say the same thing, out of band is the only way to make DTMF work reliably with VoIP IME.
        • Out of band doesn't work with analog lines.
          • Of course it doesn't, but then, neither does the codec. Whatever VoIP codec you are using is transcoded along with the out of band DTMF at your PSTN gateway into G.711 with the DTMF put inband, whether your connection to the PSTN happens to be via POTS or PRI.
      • by gr8_phk ( 621180 )
        Is that because the codecs can't reproduce it well enough?
        • No, it's mostly because when you're writing software to use DTMF interaction, you don't want to be bothered with the filters and algorithms to extract the data from an audio channel (which is unreliable with noise, echo, feedback, etc.) when it can be given accurately in a very easy to parse form out of band. The transition between inband DTMF and out of band DTMF usually occurs at the PSTN gateway which bridges your VoIP network which can support out of band DTMF to the regular telephone system where all you hav
      • Sure, for SIP to SIP, but not for SIP to PSTN, which is where most calls terminate. ( http://www.voip-info.org/wiki/view/Asterisk+DTMF [voip-info.org] )
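For anyone wondering what out-of-band DTMF looks like on the wire, here is a minimal sketch of the 4-byte telephone-event payload described in RFC 4733 (the successor to RFC 2833). The function names are made up for illustration, the RTP header itself is omitted, and the duration is assumed to be in timestamp units of an 8000 Hz clock.

```python
import struct

# Event codes for DTMF digits as assigned in RFC 4733.
DTMF_EVENT_CODES = {**{str(d): d for d in range(10)},
                    "*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15}
DTMF_DIGITS = {code: digit for digit, code in DTMF_EVENT_CODES.items()}

def pack_telephone_event(digit: str, duration: int,
                         volume: int = 10, end: bool = False) -> bytes:
    """Build the payload: event (8 bits), E and R bits, volume (6), duration (16)."""
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags = (0x80 if end else 0x00) | volume  # reserved R bit stays 0
    return struct.pack("!BBH", DTMF_EVENT_CODES[digit], flags, duration)

def unpack_telephone_event(payload: bytes) -> dict:
    """Parse a 4-byte telephone-event payload back into its fields."""
    event, flags, duration = struct.unpack("!BBH", payload)
    return {"digit": DTMF_DIGITS[event], "end": bool(flags & 0x80),
            "volume": flags & 0x3F, "duration": duration}

# Final packet for a "5" held for 800 timestamp units (100 ms at 8000 Hz).
payload = pack_telephone_event("5", duration=800, end=True)
print(payload.hex(), unpack_telephone_event(payload))
```

Parsing this takes one `struct.unpack` call, which is the "very easy to parse form" argument made above; a gateway still has to regenerate real tones when the call leaves for the PSTN.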
  • by jd ( 1658 ) <imipak@ y a hoo.com> on Thursday April 14, 2011 @09:28PM (#35824468) Homepage Journal

    ...it can't have been "then known as CELT" since it is a merge of two codecs of which CELT is one and SILK is the other. It's good that it's an IETF standard as that will help some with adoption. It will also help some with getting other implementations. (Hell, Dirac is a great codec for video but because it's not a recognized standard for anything it's not getting used.)

  • by shish ( 588640 )
    Am I misunderstanding, or is the headline "open codec designed for voip is slightly better for voip than closed codec designed for music"? How does it compare to the other voip codecs?
    • by jmv ( 93421 )

      No, it's "open codec designed for voip is slightly better for *music* than closed codec designed for music". In other words, this test showed Opus (slightly) beating HE-AAC at what HE-AAC does best. Testing on a voip scenario would have been pointless because of the huge delay HE-AAC has.

    • by Skuto ( 171945 )

      Am I misunderstanding, or is the headline "open codec designed for voip is slightly better for voip than closed codec designed for music"? How does it compare to the other voip codecs?

      The test wasn't about VoIP but about music. The VoIP codec beat the others at their game.

  • What about lower bitrates? HE-AAC is designed for low-bitrate audio, and 64 Kbps is right on the outside edge of where HE-AAC is useful. 24-32 Kbps is where HE-AAC really shines, and that's where stuff really gets impressive.

    • Cell phones, ISDN, and all the like operate at 64kbps.
      Most users' DSL lines have plenty more than 64kbps in both directions, so 64kbps is also a safe bet for VoIP applications.

      If Hydrogenaudio wants to prove that this codec is a good replacement for the codecs currently used in phones, it has to be tested at the bandwidth usually associated with phones.

      • by Guspaz ( 556486 )

        Umm, you're pulling that out of your ass. I don't think any modern cellphone system uses 64Kbps.

        GSM uses 5.6Kbps for half-rate, 13 Kbps for full rate, and 12.2 Kbps for enhanced full rate. Most other voice codecs operate in that range, although they can usually do far better than the 3.1 KHz that you get from GSM.

        ISDN at 64Kbps is irrelevant; that's the data rate, and if you're trying to run a VoIP system over there, unless you need fax support (there are better alternatives), then you're not dedicating you

  • I tried to comment on this a while ago but the latency messed me up.
