The "Loudness War" and the Future of Music
An anonymous reader notes an article up at IEEE Spectrum outlining the history and dangers of the accelerating tendency of music producers to increase the loudness and reduce the dynamic range of CDs. "The loudness war, what many audiophiles refer to as an assault on music (and ears), has been an open secret of the recording industry for nearly the past two decades and has garnered more attention in recent years as CDs have pushed the limits of loudness thanks to advances in digital technology. The 'war' refers to the competition among record companies to make louder and louder albums by compressing the dynamic range. But the loudness war could be doing more than simply pumping up the volume and angering aficionados — it could be responsible for halting technological advances in sound quality for years to come... From the mid 1980s to now, the average loudness of CDs increased by a factor of 10, and the peaks of songs are now one-tenth of what they used to be."
Example... (Score:5, Informative)
More info (Score:5, Informative)
Re:The alternative? (Score:5, Informative)
Oh that's right, you can't. You're right, it's not a tough choice is it?
Re:I have the solution (Score:5, Informative)
Re:I have the solution (Score:3, Informative)
Re:More info (Score:5, Informative)
Re:Wow, very informative... (Score:4, Informative)
Of course that is a lower-fidelity signal, because high fidelity also means reconstructing the dynamics of the original sound, so to audiophiles a compressed signal sounds crappy.
I think the war started with sound engineers overcompressing stuff out of experimentation (in dance music compression is an important aspect, for instance). That made louder records stand out better in radio programming (even if radio stations have good compressors themselves nowadays) and casual listening, especially on crappy audio equipment.
Once the ear has adjusted itself to the loud recording, the less loud one sounds a little worse.
Re:I have the solution (Score:4, Informative)
Yes, many people and many systems can't tell the difference. A casual listener listening to terrestrial radio in a car hasn't a chance in h*** of noticing; the degradation of the signal from other means makes this just noise. If you have a nice home system and actually enjoy LISTENING to the music then you probably can tell the difference.
This irks me almost as much as the whole "sell music in MP3 format" talk. MP3 is a lossy format, by definition, so it is NOT the same music as recorded, and at 128k in particular the loss is very noticeable in any halfway decent environment. 256k is better, but I do NOT want a lossy format as my only choice for digital audio!
Re:What pisses me off (Score:5, Informative)
Re:The alternative? (Score:5, Informative)
Data compression should be clear - the raw audio data are processed so that they take less space on a storage medium or less time to push over the Intertube. This is done either losslessly, by purely mathematical means, or lossily, using so-called psychoacoustic models that try either to remove those parts of the sound that the human brain won't really notice (e.g. because they're "buried" below some other sound playing at the same time) or simply to store those parts with far less precision. Basically, lossy compression throws away some decimal places in the parts of the audio data you won't hear too well anyway.
Dynamic compression on the other hand simply reduces the dynamic range of the sound - it makes loud stuff quieter or, if you simultaneously push up the total volume, makes quiet stuff louder. This hasn't anything to do with digital audio data - it's a purely acoustic modification that's been in use in recording studios for decades now, sometimes reasonably, sometimes not.
Interestingly, dynamic compression for the sake of getting things louder and data compression are almost mutually exclusive - by increasing the average volume of the song and basically emphasizing every little detail you're making the music noisier and noisier - and white noise is the worst thing that can happen to data compression of any kind. And even psychoacoustic compression schemes are given a hard time when they've got to figure out which of all those things coming screaming at you are important and which aren't.
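To make the "loud stuff quieter, then push up the total volume" idea concrete, here is a minimal sketch in Python. It is a toy, not how any studio compressor is actually built: the function names, the threshold and the 4:1 ratio are all made up for illustration, and it works sample by sample, whereas real compressors follow a smoothed level envelope with attack and release times.

```python
def compress(samples, threshold=0.25, ratio=4.0):
    """Toy downward compression: the part of each sample's magnitude that
    exceeds `threshold` is divided by `ratio`; quieter samples pass unchanged.
    (Hypothetical helper, sample-wise only, no attack/release smoothing.)"""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out


def normalize(samples, peak=1.0):
    """Make-up gain: scale everything so the largest magnitude equals `peak`."""
    top = max(abs(s) for s in samples)
    return [s * peak / top for s in samples]


quiet, loud = 0.1, 1.0  # amplitudes, with 1.0 standing for full scale
squashed = normalize(compress([quiet, loud]))

before = loud / quiet              # loud-to-quiet ratio in the original
after = squashed[1] / squashed[0]  # the same ratio after compression
print(before, after)
```

The loud sample ends up back at full scale, but the quiet one has been pulled up toward it, so the gap between them shrinks from 10:1 to roughly 4.4:1. That shrinking gap is the lost dynamic range.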
Re:I have the solution (Score:5, Informative)
You don't seem to understand it, but that is the crux of the loudness war. The local stations do not in fact crank the volume on commercials. That would be illegal. In fact what they do is compress the dynamic range of the audio, so the "apparent loudness" is increased. The peaks (which is how the FCC defines volume) are the same, but the RMS volume (essentially the average sound level and what our ear perceives as volume) is increased. Think about it: a CD is 16-bit, so any particular data sample can take only 2^16=65536 values (peaking at 32767 once you split them into plus and minus). So they can't make the volume 2^17. What they can do, however, is compress the dynamic range, so instead of the average volume level sitting at 4096, say, it is now 16384.
Commercials on TV suck, don't they? The audio is compressed to hell and back.
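The peak-versus-RMS point can be checked with a few lines of Python. This is only a sketch under assumptions: the "drum hit" is a synthetic decaying sine, and `loud_master` is a hypothetical stand-in that boosts and then hard-limits at full scale, which is cruder than what a broadcast compressor does.

```python
import math

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root-mean-square level, a rough proxy for perceived loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def drum_hit(n=2000, decay=0.002, freq=0.05):
    """Synthetic test signal: a sine that starts at full scale and decays."""
    return [FULL_SCALE * math.exp(-decay * i) * math.cos(2 * math.pi * freq * i)
            for i in range(n)]


def loud_master(samples, gain=4.0):
    """Hypothetical 'loud' processing: boost everything, then limit at full scale."""
    return [max(-FULL_SCALE, min(FULL_SCALE, s * gain)) for s in samples]


original = drum_hit()
squashed = loud_master(original)
print("peak:", peak(original), "->", peak(squashed))
print("rms: ", round(rms(original)), "->", round(rms(squashed)))
```

Both versions peak at exactly the same value, so a peak meter sees no difference, while the RMS level of the processed version is clearly higher.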
You've never listened to modern turntables (Score:5, Informative)
I have a set of FLAC music files of the latest White Stripes album. The hiss is almost inaudible, there are no clicks, pops or any of the other crap you would hear on a mid-70's turntable.
Yes, the frequency range is nothing like a CD's, but the dynamic range is SO much better. Plus, the CD version of the same album is SO loud it actually clips (click sounds at loud points of the album).
It's a sad state of affairs when the Vinyl version of a record sounds better than the CD.
Re:The alternative? (Score:2, Informative)
Re:I have the solution (Score:4, Informative)
Is MP3 louder than uncompressed? (Score:5, Informative)
From your ear's point of view, the follicles and cells that are tuned to the retained frequencies experience more acoustic energy at a given sound level.
On top of that, I suspect there are other effects as well. I suspect that MP3s may compand and decompand the music. Any mismatch between the companding and decompanding codecs, or roundoff errors, might increase or decrease the dynamic range. Likewise the psychoacoustic model might tinker with this as well.
The reason I think this is the case is that I always notice that when I play highly clipped music (e.g. Green Day) through my iPod, the cymbals and snare drums are actually slightly painful to the ears even when the overall volume is at a low listening level.
Re:I have the solution (Score:3, Informative)
What do you think "volume" is? It's a perception of loudness, which is only roughly related to anything you can measure numerically. A 16-bit data sample on a CD tells you the sampled electrical voltage, which is definitely not loudness. The square of voltage is power, which is getting closer. So the 32,768 to 1 range of voltages (plus and minus) gives you a range of 1,073,741,824 to 1 in power, or as these things are normally measured, about 90 dB dynamic range. Perceived audio "loudness" is roughly logarithmic; that's why the dB number is useful.
A 1 dB change (26% in power ratio) is barely detectable. The threshold of pain [wikibooks.org] is up to 120 dB higher than the minimum detectable sound level. So even an uncompressed CD does not have enough dynamic range to capture what you might hear at a rock concert. (Just before you go deaf.)
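For anyone who wants to check the arithmetic above, a short Python sketch follows. The helper names are invented for illustration; the only formula used is the standard definition of a decibel as ten times the base-10 logarithm of a power ratio.

```python
import math


def power_ratio_to_db(ratio):
    """Decibels corresponding to a ratio of two powers."""
    return 10 * math.log10(ratio)


def db_to_power_ratio(db):
    """Power ratio corresponding to a decibel figure."""
    return 10 ** (db / 10)


voltage_steps = 2 ** 15            # 32,768 magnitude steps in a signed 16-bit sample
power_range = voltage_steps ** 2   # power goes as voltage squared: 1,073,741,824

print(round(power_ratio_to_db(power_range), 1))  # 90.3
print(round(db_to_power_ratio(1), 3))            # 1.259, i.e. about 26% more power
```

The same functions show why 120 dB is out of reach: it corresponds to a power ratio of 10^12, about a thousand times more than 16 bits can span.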
Re:The alternative? (Score:3, Informative)
Re:Volume Leveling (Score:2, Informative)
The tech details (Score:2, Informative)
http://www.tcelectronic.com/media/lund_2004_disto
Although it's a couple of years old it's still very valid.
Re:How about adding compressors into the amps? (Score:1, Informative)
This is probably what is happening: (Score:2, Informative)
These are usually not audible, but when the decoder reconstructs the waveform, their removal will change the shape of the waveform; the formerly-clipped flat edges will have had the edges rounded off and may bulge slightly higher as they more closely resemble sinusoids.
This can actually sound better than the original clipped signal (as clipping is highly audible in double-blind tests and strains the ear) - except that the new "bulge" may go over what was previously full-scale, and unfortunately many MP3 decoders, particularly embedded ones like the iPods, will simply clip it again if it does.
For this reason, the LAME MP3 encoder actually applies a 1% volume reduction before compression in all the preset profiles. This is not within audible limits, and can never restore already-clipped waveforms, but helps to prevent any further clipping during decoding. Some other encoders do similar things.
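The "bulge" can be illustrated without any MP3 code at all. The sketch below is an analogy under stated assumptions, not a model of what an encoder really does: a heavily clipped waveform is treated as an ideal square wave at full scale (+/-1.0), and "removing inaudible components" is approximated by keeping only the first few harmonics of its Fourier series. The function names are made up for the example.

```python
import math


def band_limited_square(t, harmonics):
    """Fourier series of a +/-1 square wave, truncated to `harmonics` odd terms."""
    return (4 / math.pi) * sum(
        math.sin(2 * math.pi * (2 * k + 1) * t) / (2 * k + 1)
        for k in range(harmonics)
    )


def peak_level(harmonics, steps=10000):
    """Largest magnitude reached over one period of the truncated series."""
    return max(abs(band_limited_square(i / steps, harmonics))
               for i in range(steps))


for n in (1, 5, 50):
    print(n, "harmonics -> peak", round(peak_level(n), 3))
```

However many harmonics are kept, the reconstructed peak lands above 1.0, that is, above what used to be full scale. In this idealized worst case the overshoot is well over 1%, so a small pre-scale reduces the problem rather than ruling it out.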
It is preferable if such signals are left unclipped and instead, the signal is passed through a limiter that helps to avoid the harsh clipping sound (yet again) and leaves the sound as intact as possible (sound below full-scale in regions that are not clipping will be unaffected by a properly implemented digital limiter). For example, an audio playback chain in foobar2000 will typically do this as the final step of DSP.
This effect may be audible, but is often preferred to clipping. ReplayGain helps further. If a track has ReplayGain information (the perceived "loudness" of the track and/or album relative to a reference level, expressed as the gain needed to reach that level; for modern recordings this is usually a considerable reduction, occasionally as much as -12dB), the highest peak level is recorded in the metadata as well, so the volume as a whole can be lowered in advance to try to preserve any high peaks.
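Lowering the volume in advance from the stored gain and peak comes down to a couple of lines of arithmetic. Here is a minimal sketch in Python; `playback_scale` and its parameters are hypothetical names, and it assumes the peak is stored as a fraction of digital full scale (1.0) and the gain in dB. Actual players may differ in the details.

```python
def playback_scale(gain_db, track_peak, prevent_clipping=True):
    """Linear factor to multiply every sample by before output.

    gain_db    -- ReplayGain-style adjustment in dB (negative for loud masters)
    track_peak -- highest stored peak, where 1.0 means digital full scale
    """
    scale = 10 ** (gain_db / 20)  # dB to amplitude (not power) ratio
    if prevent_clipping and track_peak > 0 and scale * track_peak > 1.0:
        # Applying the full gain would push the peak past full scale,
        # so back off just enough that the peak lands exactly on it.
        scale = 1.0 / track_peak
    return scale


print(round(playback_scale(-12.0, 1.0), 3))  # loud modern master: turned down
print(round(playback_scale(+6.0, 0.8), 3))   # quiet track: boost capped by its peak
```

A -12dB adjustment multiplies the samples by about 0.25. A +6dB boost on a track peaking at 0.8 would clip, so the factor is capped at 1.25 instead.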
Re:I have the solution (Score:3, Informative)
1. High Fidelity was never really that important to the enjoyment of music. When I listened to the Stooges on my parents' cheap Sylvania stereo, I wasn't really listening for the lovely interplay between the oboe and the English horn. I wanted volume and a big, big beat.
2. Popular music is much more rhythm and less melody than in the heyday of Hi-Fi, the 50's and 60's. When you've got producers who actually desire the lo-fi sound of some neolithic synth and then proceed to dirty it up with distortion and bit crushing, it's clearly not about getting a warm, natural sound.
3. Much music is listened to via headphones these days. If you're trying to get the purest recording and reproduction of acoustic instruments, a pair of earbuds isn't going to cut it. Not a whole lot of popular music today requires pure recording and reproduction of acoustic instruments anyway, so what's the difference?
4. Many of the great recordings of popular music were given a sort of distinction and personality by the type of production "mistakes" that are the bane of the hi-fi enthusiast. An example from the 2nd Rock Era is the cut Gimme Shelter by the Rolling Stones. There's a part on there where a tambourine will come in on the intro and it sent the VU meter way into the red, causing an ugly distortion that a "hi-fi" producer would have immediately thrown out and re-recorded. But the groove was there, a brilliant producer left it in, and now, whenever I hear that song it's that distorted tambourine that gives me the little shiver. Now, it's something that's sought out by producers. On the New Magnetic Wonder record by Apples in Stereo, there are cuts where some backing vocals are clearly recorded using a blown-out microphone. It sounds great to me, especially when I'm pedaling to work with my mp3 player cranked through my earbuds (yes, I know I'm taking a chance, using earphones when I'm riding in traffic, but it's such a joy that I accept the risk).
When I listen to Sir Georg Solti's recording of Parsifal, or Glenn Gould playing the Goldberg Variations, or Miles Davis In a Silent Way, I want a true and warm recording of the sound of the instruments. Air moving through a horn, or a string vibrating, or a piece of wood striking a skin. If Ceelo Green is a Soul Machine or The Books The Lemon of Pink is on my box, how would I know if the re-recording of the sample of Bernie Worrell's string-synth from P-Funk Connection is a "true and warm recording" or not? All I know is it makes the juice flow. That's good enough for me.
iZotope Ozone (Score:3, Informative)
The problem is not so much the use of such filters but the fact that they are used to optimize recordings for the very mediocre equipment most people use. Subtle bass sounds are simply lost; as are quiet high pitched sounds, because cheap equipment doesn't do anything with this information anyway. To counter this, the trick is to boost the volume of such sounds (relative to the rest) and to shift the spectrum away from very high or very low sounds. Like manipulating photos generally leads to loss of detail and undesired artifacts, manipulating sound results in similar loss of detail and distortion of what remains. Commercial records are edited to the limit of crappy mp3 players and radio. It's the equivalent of boosting a photo's contrast so much that most detail is drowned out to make it look good on a good old matrix printer. The psychological effect is similar as well: we humans appreciate contrast in all sorts of ways and the matrix printer doesn't do grays very well anyway. Unfortunately if you have a high end inkjet printer, such photos don't look much better than on the matrix printer because there is no extra detail anymore.
When used properly, however, manipulating sound can improve quality significantly. Many expensive high-end amplifiers basically contain lots of DSPs to 'improve' the sound and do some restoration work on the distorted signal on the CD (e.g. by interpolating and reinserting detail that was lost in the mastering process). Old-fashioned valve-based amplifiers are all about sound distortion (in a pleasing way). This is no different than what happens in the studios except that the result would be much better if the studios didn't throw out so much detail. This point can be demonstrated easily by playing back some sixties/seventies recordings which have much less aggressive audio manipulation.
Re:The alternative? (Score:2, Informative)
Oh that's right, you can't. You're right, it's not a tough choice is it?
I have my own Pro Tools-based home recording studio. I get to experiment first hand with this sort of heavy limiting. Using a good limiter plugin (in my case a Waves L2) it's easy to make anything sound many times as loud as the original recording without introducing artifacts, but in addition to permanently losing the dynamics, it becomes almost fatiguing to even listen to...and that's nothing compared to what mastering engineers are doing (against their own wishes by the way) at the request of their customers (the record companies). It really is criminal. The fact is that this sort of stupidity was impossible in the days of vinyl...the needle would have jumped out of the groove if anyone attempted it.
Re:It's more than just music (Score:1, Informative)
A good number to use for optimistic estimates of 35mm film/digital equivalence is 100 line pairs/mm, or 5080dpi. That's around 35M pixels.
You need some very very good glass to get that full-frame, and you need a tripod to get that even in the center. Going over 100 lp/mm is both expensive and rather technical.
Actual obtainable results given typical usage are going to be more like 50 lp/mm, which comes out to around 9M pixels.
The current crop of 15-20 megapixel DSLRs are about as good as the format is going to get, in terms of usable resolution.
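The numbers above can be reproduced with a few lines of Python. The sketch assumes a nominal 36 x 24 mm frame (not stated in the comment) and two pixels per line pair; the function names are made up for the example.

```python
FRAME_MM = (36, 24)   # assumed nominal 35mm frame size, width x height
MM_PER_INCH = 25.4


def dots_per_inch(lp_per_mm):
    """Sampling density needed to resolve `lp_per_mm` line pairs per mm."""
    return 2 * lp_per_mm * MM_PER_INCH  # one line pair needs two pixels


def equivalent_pixels(lp_per_mm, frame_mm=FRAME_MM):
    """Total pixel count for the whole frame at that resolution."""
    px_per_mm = 2 * lp_per_mm
    width_mm, height_mm = frame_mm
    return (width_mm * px_per_mm) * (height_mm * px_per_mm)


for lp in (100, 50):
    print(lp, "lp/mm:", round(dots_per_inch(lp)), "dpi,",
          round(equivalent_pixels(lp) / 1e6, 1), "megapixels")
```

At 100 lp/mm this works out to about 5080 dpi and 34.6 megapixels, and at 50 lp/mm to about 8.6 megapixels, matching the rounded figures quoted above.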
Re:The alternative? (Score:2, Informative)
I spent a year working for an absolute wizard at audio stuff; he worked at Bell Labs for 26 years and helped invent MP3. So I am not guessing at anything I say here.
Sound-level compression is not that hard to do in real time. There are several ways to do it. The best way is to do a pure digital EQ using a computer model of how the human ear perceives loudness, and that feature is shipping today as part of Windows Vista (look for "loudness equalization" or something like that; I don't know exactly what it's called because I don't run Vista at all). Doing loudness EQ this way is roughly as computationally expensive as decompressing MP3, i.e. not too expensive by modern standards.
Most sound-level compressors strictly use the power of the music to approximate the loudness of the music. This works perfectly when the music is sine tones, but doesn't work so well for real signals. Some parts of the music that hit your ear on a bunch of different frequencies will sound louder than their power would suggest; and these will be over-boosted by the sound-level compressor. (Most radio stations use a compressor on everything they broadcast, and you can hear "spitting" sounds when people say words with sibilants. Listen to a DJ saying "summer sales" and you will often hear spitting or hissing noises on the "s" sounds.) Some power-based compressors sound better than others (some audio engineers swear by really old-school equipment) but the digital loudness equalization really sounds the best.
I hope your idea comes to pass, and music gets encoded with a full dynamic range, and just has sound-level compression cues encoded as well.
But I also put hope in the Internet itself. With actual, physical media like CDs it would be too hard to sell multiple different versions, but with audio files sitting on a server for download, it would be very easy to sell the mass-market version and the "audiophile" version that has full dynamic range.
Re:I have the solution (Score:2, Informative)