Audio Compression Primer (236 comments)

Posted by CmdrTaco on Thursday January 13, 2005 @03:31PM from the do-you-hear-what-i-hear dept.

Hack Jandy writes "For those of you with a little extra time this afternoon, check out Sudhian's primer to all things concerning audio compression. The article details everything from DRM to CRC matrixes (with a healthy dosage of Ogg)."

This discussion has been archived. No new comments can be posted.

Re:Virtually dismisses lossy compression (Score:2, Informative)

by tsanth ( 619234 ) writes: on Thursday January 13, 2005 @03:43PM (#11351616)

Given the topics in the audio section [sudhian.com] (it has an audio section!), the site seems to lean more towards audiophiles.

I don't agree with the dismissal of lossy algorithms either, but I think it makes sense given the context.

Re:Is FLAC worth it? (Score:2, Informative)

by jasoncc ( 754385 ) writes: on Thursday January 13, 2005 @03:49PM (#11351681)

I use FLAC because converting from a lossy format to another lossy format can produce crappy results. If I choose a lossy format for all my audio and then I need the audio to be in some other lossy format, I might be screwed.

You might choose Ogg for your audio, and then sometime in the future a new lossy format sweeps the industry. Your Ogg files might not convert well to the new format.

and besides...Disk is Cheap!

being pedantic, but... (Score:3, Informative)

by demonbug ( 309515 ) writes: on Thursday January 13, 2005 @03:49PM (#11351686) Journal

Trying to transmit audio data with uncompressed audio or video is not the easiest task. After all, even an audio CD contains data that transmits at 1400kb/s

Shouldn't that be 1200 kb/s? 150 KB/s * 8 = 1200 kb/s, right? Or is the 150 KB/s figure I'm using incorrect (I could have sworn that was the 1x CD speed)?

AAC (Score:4, Informative)

by sometwo ( 53041 ) writes: on Thursday January 13, 2005 @03:49PM (#11351688)

So what about AAC used by Apple in their music store?

I did a little googling and found this (http://www.teamcombooks.com/mp3handbook/13.htm [teamcombooks.com]):

AAC (Advanced Audio Coding) is not a MPEG layer, although it is based on a psycho-acoustic model. Sometimes referred to as MP4, AAC provides significantly better quality at lower bit-rates than MP3. AAC was developed under MPEG-2 and also exists under MPEG-4.

AAC supports a wider range of sampling rates (from 8 kHz to 96 kHz) and up to 48 audio channels, plus up to 15 auxiliary low frequency enhancement channels and up to 15 embedded data streams. AAC works at bit rates from 8 kbps for mono speech and up to in excess of 320 kbps for high-quality audio. Three profiles of AAC provide varying levels of complexity and scalability.

AAC software is much more expensive to license than MP3 because the companies that hold related patents decided to keep a tighter rein on it. Most AAC software is geared towards professional applications and secure music distribution systems, so it may be a while before you see AAC in consumer-oriented products.

FLAC will live forever (Score:3, Informative)

by parvenu74 ( 310712 ) writes: on Thursday January 13, 2005 @03:50PM (#11351696)

Because the code is open source, FLAC will be around forever and available on whatever OS/Platform you want to use it on if you feel like compiling the software.

Another reason it's going to be around and much more prevalent as time goes on is that the compression is so good and the speed/resource usage figures are so attractive. When I rip CDs to FLAC I am limited to 40x by my burner (CPU utilization is around 20-25%). When I rip the same CD to Ogg, I top out under 30x because the processor has reached 100% utilization.

Fast. Free. Efficient. Frugal with the CPU. What else do you need?

Re:being pedantic, but... (Score:2, Informative)

by stratjakt ( 596332 ) writes: on Thursday January 13, 2005 @03:53PM (#11351728) Journal

44100 Hz * 16 bits * 2 channels = 1411200 bits per second, or roughly 1400 kb/s.

The 150 KB/s number is for CD-ROM data storage; the gap between the two data rates is taken up by the extra error detection and correction.
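The arithmetic above can be checked in a few lines. This is only a sketch; the function name `pcm_bit_rate` is made up for illustration, and the inputs are the standard CD audio parameters quoted in the comment.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# CD audio: 44.1 kHz sampling, 16-bit samples, stereo.
cd_bits_per_second = pcm_bit_rate(44100, 16, 2)
print(cd_bits_per_second)         # 1411200
print(cd_bits_per_second / 1000)  # 1411.2 (kb/s with k = 1000)
```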

Re:128K should be enough for everyone (Score:3, Informative)

by wfberg ( 24378 ) writes: on Thursday January 13, 2005 @03:54PM (#11351739)

FM Radio is far from CD quality hence there isnt really a need to use very high bitrate MP3s or whatever

Or consider this: since FM radio has a limited range of frequencies that come across well, songs that are intended to be widely played on FM radio (e.g. Britney Spears's latest "hit" song) are actually engineered to sound best in those frequencies. The end result is that when you hear Britney Spears on the radio, the track sounds just like it does on the CD.

Meanwhile, quality music, lovingly mixed onto CD by people who actually give a damn, sounds like crap on the radio..

In other words; if you can't hear the difference between 128kbps and higher, it might just be that you're listening to mass produced music.

As for musicians preferring 128kbps? Well, sound engineers usually don't sit on stage with zillion Watt speakers right next to their fragile precious ears for a reason..

Me, I have crap taste in music AND I'm tonedeaf, so whatever, 128kbps all the way! ;-)

(MPEG artifacts in video drive me nuts, though)

Re:being pedantic, but... (Score:2, Informative)

by stratjakt ( 596332 ) writes: on Thursday January 13, 2005 @03:55PM (#11351751) Journal

Err, that would be error codes and positional information.

There's even a little more room, in the subcode channels where one can hide the data for CD+G (karaoke) or CD-TEXT.

more algorithms (Score:5, Informative)

by barik ( 160226 ) writes: on Thursday January 13, 2005 @03:59PM (#11351802) Homepage

While the article is a primer, I was a little disappointed in the algorithmic treatment given in the article itself. Right now I know of two excellent free publications: Introduction to Sound Processing [mondo-estremo.com] and The Sounding Object [mondo-estremo.com], which both treat the theoretical, DSP side of things. Any other resources that Slashdot readers can recommend for those who are interested in the subject of audio compression and representation?

Re:The actual meaning of lossless ?? Any clues? (Score:2, Informative)

by stratjakt ( 596332 ) writes: on Thursday January 13, 2005 @04:04PM (#11351867) Journal

If it's lossless, you should be able to take digital file A, compress it into compressed file B, and then if you uncompress B to get A', then A' = A.

That is, the checksums for A and A' should match, etc.

That's how I define mathematically lossless.
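A minimal sketch of that A -> B -> A' check, for anyone who wants to try it. Note the assumption: Python's `zlib` stands in for a lossless audio codec here (it is not FLAC), and the helper names are hypothetical. The point is only the checksum comparison.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Checksum used to compare the original with the round-tripped copy."""
    return hashlib.sha256(data).hexdigest()


def is_lossless_round_trip(original: bytes) -> bool:
    """Compress A into B, decompress B into A', and check that A' == A."""
    compressed = zlib.compress(original)    # B (zlib is a stand-in codec)
    restored = zlib.decompress(compressed)  # A'
    return sha256_hex(restored) == sha256_hex(original)


# Arbitrary bytes standing in for raw PCM data.
pcm_like = bytes(range(256)) * 64
print(is_lossless_round_trip(pcm_like))
```

A lossy codec would fail this test by design, which says nothing about whether the difference is audible.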

Whatever this asshat is on about double blind and testing and all that, has more to do with the ability of his FLAC playing equipment to sound the same as his CD player, which is a whole 'nother ball of wax altogether.

Re:being pedantic, but... (Score:3, Informative)

by Piquan ( 49943 ) writes: on Thursday January 13, 2005 @04:09PM (#11351939)

Shouldn't that be 1200 kb/s? 150 KB/s * 8 = 1200 kb/s, right? Or is the 150 KB/s figure I'm using incorrect (I could have sworn that was the 1x CD speed)?

Data CDs are 150 KB/s at 1x, but you're missing an important difference between data and audio CDs.

CD sectors are 2352 bytes (I'm ignoring subchannels here). Data CDs carry 2048 data bytes per sector, with the other 304 bytes going to sync, header, and error detection/correction, so every bit comes off perfectly. Audio CDs skip that extra sector-level layer and use all 2352 bytes for audio data (on the assumption that a few missed bits won't hurt). That means audio data moves 14.8% faster (in b/s) than ISO 9660 data: 1200 * 1.148 = 1378.

Another calculation you can use instead: 44100 samples/sec * 2 channels * 16 bits per sample = 1411200 bits/sec, or 1378 Kb/s (with K = 1024).
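The sector arithmetic can be sketched like this. One figure here is not in the comment and is my assumption: 75 sectors per second at 1x, which is the usual figure for CD playback speed. The constant names are made up for illustration.

```python
SECTORS_PER_SECOND = 75    # 1x speed; assumed, not stated in the comment
SECTOR_BYTES = 2352        # full sector, all of it audio on an audio CD
DATA_PAYLOAD_BYTES = 2048  # user data per sector on a data CD

audio_bytes_per_second = SECTORS_PER_SECOND * SECTOR_BYTES
data_bytes_per_second = SECTORS_PER_SECOND * DATA_PAYLOAD_BYTES

print(data_bytes_per_second / 1024)       # 150.0 KB/s, the familiar 1x figure
print(audio_bytes_per_second * 8)         # 1411200 bits/s
print(audio_bytes_per_second * 8 / 1024)  # 1378.125 Kb/s with K = 1024
print(SECTOR_BYTES / DATA_PAYLOAD_BYTES)  # 1.1484375, the "14.8% faster"
```

Both routes to 1411200 bits/s agree: 75 sectors of 2352 bytes, or 44100 stereo 16-bit samples.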

Re:Virtually dismisses lossy compression (Score:3, Informative)

by Sebastopol ( 189276 ) writes: on Thursday January 13, 2005 @04:14PM (#11352018) Homepage

Yes, I noticed the article is 3 PAGES LONG! It makes only passing reference to other codecs. Not much of a primer, and it didn't take the entire afternoon to read; it took 5 minutes.

Did I miss a crucial link or something?

Re:One sad bit.. (Score:1, Informative)

by Anonymous Coward writes: on Thursday January 13, 2005 @04:16PM (#11352041)

The Vorbis decoder is done, and has been for a long time. Like other codecs, tweaks can always be made to the encoder to produce better results by using different psychoacoustic models, etc. As long as the output still follows the spec, the decoder will still decode it just fine. This is why your crappy MP3s from 1997 still play today, and fancy MP3s from today will still play on those old players from 1997. As long as the encoder follows the spec, the decoder will always be able to decode it properly.

Re:128K should be enough for everyone (Score:4, Informative)

by pthisis ( 27352 ) writes: on Thursday January 13, 2005 @04:21PM (#11352119) Homepage Journal

especially when listening to music on hi-quality speakers a la Bose

Bose doesn't make high-quality speakers; they make expensive speakers that don't perform nearly as well as alternatives (for instance, the Acoustimass satellites use crappy paper cones that perform poorly in the upper frequencies). A $300 pair of B&W DM302's will soundly thrash anything Bose makes for sound quality. Also investigate Hale, Thiel, or Paradigm. If you really want to spend thousands, spend it on Magnepan (Magneplanar 1.6Q) or Vandersteen (2ce signature) or the higher end speakers from the companies I already mentioned. But those DM302's are good enough to be highly rated by places like Stereophile magazine and they're an incredible deal.

If you really want a bunch of little satellite speakers, Energy makes a much better sounding (and somewhat cheaper) system like that. I hear from people I trust that Tannoy makes an incredible one as well, but I haven't heard it.

Actually, you hear quantization distortion (Score:2, Informative)

by cogito ergo blog ( 830437 ) writes: on Thursday January 13, 2005 @04:27PM (#11352182)

(Mod to -3, nitpicking)

The MDCT in itself is actually lossless. Any distortion you notice is most likely introduced by the quantization applied post MDCT during compression.
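To illustrate the point, here is a rough pure-Python sketch (slow, textbook-style, not how any real codec implements it). It assumes a sine window, 50% overlap-add, and a hypothetical uniform quantizer with step `step`. Real encoders pick quantization steps from a psychoacoustic model, which is not modelled here. Without quantization the round trip matches the input to floating-point precision, and with it the error appears.

```python
import math


def mdct(block):
    """Forward MDCT: 2N windowed samples in, N coefficients out."""
    n_half = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients in, 2N time-aliased samples out."""
    n_half = len(coeffs)
    return [
        (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]


def round_trip(signal, n_half, step=None):
    """Window, MDCT, optionally quantize, IMDCT, window, overlap-add.

    len(signal) must be a multiple of n_half. `step` is a hypothetical
    uniform quantizer step; None means no quantization.
    """
    window = [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        block = [padded[start + i] * window[i] for i in range(2 * n_half)]
        coeffs = mdct(block)
        if step is not None:
            coeffs = [round(c / step) * step for c in coeffs]  # the lossy part
        rec = imdct(coeffs)
        for i in range(2 * n_half):
            out[start + i] += rec[i] * window[i]  # aliasing cancels in the overlap
    return out[n_half:n_half + len(signal)]


signal = [math.sin(0.3 * i) for i in range(32)]
exact = round_trip(signal, 8)
lossy = round_trip(signal, 8, step=0.5)
print(max(abs(a - b) for a, b in zip(signal, exact)))  # tiny: rounding error only
print(max(abs(a - b) for a, b in zip(signal, lossy)))  # visibly larger
```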

ARRRG! He gets Nyquist WRONG! (Score:4, Informative)

by wowbagger ( 69688 ) writes: on Thursday January 13, 2005 @04:36PM (#11352338) Homepage Journal

According to the "Nyquist Theorem," you need to have twice as many digital samples as the frequency of the analog signal you are trying to represent to have enough data to accurately build it.

WRONG!

Nyquist's criterion is "You must have at least twice as many samples as the largest BANDWIDTH of the signal in order to correctly reconstruct it."

You can take a 10.7 MHz signal, and sample it at 10000 samples per second, and correctly reconstruct it, so long as the signal is guaranteed to be bandwidth limited to 10.7 MHz +/- 2.5 kHz. This is often done in software defined radio to acquire the signal from the intermediate frequency (IF) of the analog front end.

You also have to have an appropriate reconstruction filter at the output of the system in order to correctly recover the signal - if you don't have the right reconstruction filter, you will NOT reconstruct the signal correctly.

You also have to take into account the effects of any signal modulation - take a 20 kHz sine wave, and burst it for 10 msec, and you widen the bandwidth of the signal by about 100 Hz (depending upon the exact shape of the burst - a perfect square burst will widen the signal as a sinc function and will, in effect, increase the bandwidth to infinity, which is why square bursts are generally Considered Harmful in communications work).

Also, you don't oversample a signal in time to account for "rounding errors" - you oversample in time because the frequency response of sampling a system in time introduces a sinc response in frequency - by moving the sampling rate up you reduce the impact of this response on the recovered signal's frequency response. You also greatly ease the requirements on the reconstruction filter - the filter can be wider (have fewer poles in the transfer function - thus fewer parts needed).
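A small sketch of where tones near the 10.7 MHz IF land after sampling at 10000 samples per second. The function name is hypothetical, and it models plain real-valued sampling only. One caveat that is mine, not the comment's: with real samples, tones mirrored about the 10.7 MHz centre fold onto the same alias, so practical designs tend to use I/Q sampling or a slightly offset rate to keep the two sidebands apart.

```python
def alias_frequency_hz(signal_hz: int, sample_rate_hz: int) -> int:
    """Apparent frequency of a real tone after sampling, folded into 0..fs/2."""
    folded = signal_hz % sample_rate_hz
    if 2 * folded > sample_rate_hz:
        folded = sample_rate_hz - folded
    return folded


SAMPLE_RATE_HZ = 10_000
for tone_hz in (10_700_000, 10_701_000, 10_699_000, 10_702_500):
    print(tone_hz, "->", alias_frequency_hz(tone_hz, SAMPLE_RATE_HZ))
# 10700000 -> 0
# 10701000 -> 1000
# 10699000 -> 1000   (mirror image of the tone above)
# 10702500 -> 2500
```

The whole +/- 2.5 kHz band ends up between 0 and 2.5 kHz, which is why a 10 kHz sample rate can be enough for a signal centred at 10.7 MHz.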

iPods don't play .ogg (Score:2, Informative)

by me at werk ( 836328 ) writes: on Thursday January 13, 2005 @05:41PM (#11352776) Homepage Journal
From Apple - iPod - Technical Specifications [apple.com]:
- Audio formats supported: AAC (16 to 320 Kbps), MP3 (32 to 320 Kbps), MP3 VBR, Audible, AIFF, Apple Lossless and WAV
- Upgradable firmware enables support for future audio formats
The second bullet leaves the possibility open, but the page does not list Ogg as currently supported (which is what matters for iPod users now, popularity, etc.).
"VBR" 320kbps (Score:3, Informative)

by silverfuck ( 743326 ) writes: <dan.farmer@gmail.c3.1415926om minus pi> on Thursday January 13, 2005 @05:42PM (#11352785) Homepage

I know that even large radio stations use 128Kbit sampling frequency.

Sampling frequency would typically be 44.1 kHz; bitrate would be 128 kbps. Also, FM radio quality (with good reception) compares to about a well-encoded 96 kbps MP3, so there's not much point in them recording higher except for archival purposes.

I have switched from 128K to VBR 320K

You should be using LAME to encode, and LAME only goes up to 320kbps (blade for instance goes up to 384kbps, but is much lower quality), ergo you can only have 320kbps CBR, not VBR.

And to everybody else out there who complains about background noise, you should be extracting digitally from the CD!

flac doesn't seem to have come far enough yet for me (500+ albums is a lot of diskspace if it's around 300MB/album), but to my ears on my equipment (Klipsch £250 (pound sterling if that doesn't come out) speakers, cheapo SB Audigy2 soundcard), lame --preset standard (around 200kbps VBR) sounds damn near perceptual transparency.

Re:more algorithms (Score:3, Informative)

by Hal-9001 ( 43188 ) writes: on Thursday January 13, 2005 @06:28PM (#11353242) Homepage Journal
Any other resources that Slashdot readers can recommend for those who are interested in the subject of audio compression and representation?
- An older but good technical survey of digital audio compression, including MP3, is Davis Yen Pan, "Digital Audio Compression," Digital Technical Journal (Spring 1993). (PDF [iocon.com])
- Some other technical reference material on MP3 is also available on the Digital Audio Systems website. [iocon.com]
- A more recent survey of perceptual coding of audio, which covers more recent formats like AAC, is Painter and Spanias, "Perceptual Coding of Digital Audio," Proc. IEEE (April 2000). (PDF [asu.edu])
- Ogg Vorbis is documented on the Xiph.org website, but I found the documentation [xiph.org] to be lacking when read from a signal processing perspective. Christopher Montgomery provides a better description from that perspective in a Slashdot interview from 2000. [slashdot.org] I found another good description in this thread [hydrogenaudio.org] in the hydrogenaudio forums--it hyperlinks a good block diagram [port5.com] of the encoding process.
Re:ARRRG! He gets Nyquist WRONG! (Score:3, Informative)

by Kiryat Malachi ( 177258 ) writes: on Thursday January 13, 2005 @08:25PM (#11354489) Journal

And you've just described "beating". Imagine that, instead of that 10 kHz sine at 20 kHz sampling, you have a 9.99 kHz sine at 20 kHz sampling. The point on the waveform that you're sampling is going to slowly change from cycle to cycle, and you're going to wind up with a 9.99 kHz sine wave amplitude modulating - "beating" - at 0.01 kHz.
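The beating is easy to see numerically. In this sketch (names are hypothetical) the raw samples of a 9.99 kHz sine at 20 kHz sampling are compared with an alternating-sign sequence whose envelope is a 10 Hz (0.01 kHz) sine. The two agree to rounding error, which follows from sin(pi*n - x) = -(-1)^n * sin(x).

```python
import math

SAMPLE_RATE_HZ = 20_000
TONE_HZ = 9_990


def sample_tone(n: int) -> float:
    """n-th sample of the 9.99 kHz sine taken at 20 kHz."""
    return math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)


def beat_model(n: int) -> float:
    """Alternating-sign samples under a slow 10 Hz sine envelope."""
    beat_hz = SAMPLE_RATE_HZ / 2 - TONE_HZ  # 10 Hz, i.e. 0.01 kHz
    sign = -1.0 if n % 2 == 0 else 1.0
    return sign * math.sin(2 * math.pi * beat_hz * n / SAMPLE_RATE_HZ)


worst = max(abs(sample_tone(n) - beat_model(n)) for n in range(4000))
print(worst)                   # rounding error only
print(abs(sample_tone(500)))   # envelope peak, 25 ms in
print(abs(sample_tone(1000)))  # envelope null, 50 ms in
```

The samples themselves are fine; the slow swell only looks like modulation until a proper reconstruction filter is applied.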
