Music Media Science

The "Loudness War" and the Future of Music 687

An anonymous reader notes an article up at IEEE Spectrum outlining the history and dangers of the accelerating tendency of music producers to increase the loudness and reduce the dynamic range of CDs. "The loudness war, what many audiophiles refer to as an assault on music (and ears), has been an open secret of the recording industry for nearly the past two decades and has garnered more attention in recent years as CDs have pushed the limits of loudness thanks to advances in digital technology. The 'war' refers to the competition among record companies to make louder and louder albums by compressing the dynamic range. But the loudness war could be doing more than simply pumping up the volume and angering aficionados — it could be responsible for halting technological advances in sound quality for years to come... From the mid 1980s to now, the average loudness of CDs increased by a factor of 10, and the peaks of songs are now one-tenth of what they used to be."
This discussion has been archived. No new comments can be posted.

  • Example... (Score:5, Informative)

    by Suicidal Gir ( 939232 ) on Thursday August 23, 2007 @08:49AM (#20328857)
    Here's [youtube.com] a good video outlining what the record companies have been doing.
  • More info (Score:5, Informative)

    by Rob T Firefly ( 844560 ) on Thursday August 23, 2007 @08:49AM (#20328859) Homepage Journal
    Wikipedia has a decent article on the Loudness War, [wikipedia.org] complete with interesting graphics of the same song from newer and older releases. [wikipedia.org]
  • Re:The alternative? (Score:5, Informative)

    by Anonymous Coward on Thursday August 23, 2007 @08:53AM (#20328899)
    Which knob do you adjust to increase the dynamic range and re-add the lost information?

    Oh that's right, you can't. You're right, it's not a tough choice is it?
  • by Maxx169 ( 920414 ) on Thursday August 23, 2007 @08:56AM (#20328933)
    Wrong kind of compression. http://en.wikipedia.org/wiki/Dynamic_range_compression [wikipedia.org]
  • by Andrewkov ( 140579 ) on Thursday August 23, 2007 @08:58AM (#20328943)
    Different kind of compression. This compression evens out the volume, so you can boost the overall volume level without clipping. Totally different thing than data compression.
  • Re:More info (Score:5, Informative)

    by olip ( 203119 ) on Thursday August 23, 2007 @09:02AM (#20328985)
    And Slashdot had a decent discussion on the Loudness War [slashdot.org] 3 months ago, complete with the YouTube demo [youtube.com].
  • by marcello_dl ( 667940 ) on Thursday August 23, 2007 @09:29AM (#20329321) Homepage Journal
    The benefit is that a louder signal is perceived as a better signal by the ear. Since our sensitivity is not equally distributed along all frequencies a louder signal "acquires" more frequency range.

    Of course that is a lower fidelity signal because high fidelity means reconstructing also the dynamics of the original sound, so to audiophiles a compressed signal sounds crappy.

    I think the war started with sound engineers overcompressing stuff out of experimentation (in dance music compression is an important aspect, for instance). That made louder records stand out better in radio programming (even if radio stations have good compressors themselves nowadays) and casual listening, especially on crappy audio equipment.

    Once the ear has adjusted itself to the loud recording, the less loud one sounds a little worse.
  • by jrsp ( 513795 ) on Thursday August 23, 2007 @09:30AM (#20329331)
    I wouldn't say it "evens out the volume". It makes the gap between quiet and loud much smaller (a "thinner" signal, if you will) then pumps amplitude into the whole thing (volume level) so that you don't get much/any clipping. The result is a louder signal that is NOT the same as what was recorded.

    Yes, many people and many systems can't tell the difference. A casual listener listening to terrestrial radio in a car hasn't a chance in h*** of noticing; the degradation of the signal from other means makes this just noise. If you have a nice home system and actually enjoy LISTENING to the music then you probably can tell the difference.

    This irks me almost as much as the whole "sell music in MP3 format" talk. MP3 is a lossy format, by definition, and is NOT the same music as recorded and particularly at 128k is very noticeable in any halfway decent environment. 256k is better, but I do NOT want a lossy format as my only choice for digital audio!
  • by Pope ( 17780 ) on Thursday August 23, 2007 @09:42AM (#20329489)
    The expression is "Hear hear [wikipedia.org]" you dumbass. Although in this case the expression is "HEAR! HEAR!"
  • Re:The alternative? (Score:5, Informative)

    by kb ( 43460 ) on Thursday August 23, 2007 @09:47AM (#20329551) Homepage Journal
    Not at all. Like many other people you're confusing dynamic compression (what the article is about) with data compression (what YouTube and generally MP3 does).

    Data compression should be clear - the raw audio data are processed in a way that they take less space on a storage medium or less time to push them over the Intertube. This is done either losslessly by purely mathematical means or lossy by using so-called psychoacoustic models that try either to remove those parts from the sound that the human brain won't really recognize (e.g. because they're "buried" below some other sound playing at the same time), or simply store those parts with way less precision. Basically lossy compression throws away some decimal places in the parts of the audio data you won't hear too well anyway.

    Dynamic compression on the other hand simply reduces the dynamic range of the sound - it makes loud stuff quieter or, if you simultaneously push up the total volume, makes quiet stuff louder. This hasn't anything to do with digital audio data - it's a purely acoustic modification that's been in use in recording studios for decades now, sometimes reasonably, sometimes not :)

    Interestingly dynamic compression for the sake of getting things louder and data compression are almost mutually exclusive - by increasing the average volume of the song and basically emphasizing every little detail you're making the music noisier and noisier - and white noise is the worst thing that can happen to data compression of any kind. And even psychoacoustic compression schemes are given a hard time when they've got to figure out which of all those things coming screaming at you are important and which aren't.
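To make the "loud stuff quieter, then push up the total volume" description concrete, here is a minimal sketch of a static compressor acting on single samples. The threshold, ratio, and make-up gain values are illustrative assumptions, not settings taken from any real studio unit, and a real compressor also has attack and release behaviour that this sketch leaves out.

```python
def compress(sample, threshold=0.25, ratio=4.0, makeup=1.0):
    """Static compressor on one sample in the range [-1.0, 1.0].

    Levels above `threshold` are pulled down by `ratio`; `makeup` is the
    gain applied afterwards. All three defaults are illustrative only.
    """
    level = abs(sample)
    if level > threshold:
        level = threshold + (level - threshold) / ratio
    out = level * makeup
    return out if sample >= 0 else -out


def makeup_for_full_scale(threshold=0.25, ratio=4.0):
    """Gain that brings a full-scale input (1.0) back to 1.0 after compression."""
    return 1.0 / (threshold + (1.0 - threshold) / ratio)


quiet, loud = 0.1, 1.0
gain = makeup_for_full_scale()
quiet_out = compress(quiet, makeup=gain)
loud_out = compress(loud, makeup=gain)

print(f"range before: {loud / quiet:.1f}:1")
print(f"range after:  {loud_out / quiet_out:.1f}:1")
```

The loud sample ends up where it started, while the quiet one is more than twice as loud, so the gap between them shrinks. That gap is the dynamic range the thread is arguing about.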
  • by crgrace ( 220738 ) on Thursday August 23, 2007 @09:48AM (#20329569)
    It's illegal to crank commercial volumes, but every local station does it anyway - advertisers love it. I have to turn down the volume every time a stupid loud commercial comes on.

    You don't seem to understand it, but that is the crux of the loudness war. The local stations do not in fact crank the volume on commercials. That would be illegal. In fact what they do is compress the dynamic range of the audio, so the "apparent loudness" is increased. The peaks (which is how the FCC defines volume) are the same, but the RMS volume (essentially the average sound level and what our ear perceives as volume) is increased. Think about it, a CD is 16 bit, so the max volume is obviously 2^16=65536 for any particular data sample. So, they can't make the volume 2^17. What they can do, however, is compress the dynamic range, so instead of the average volume level to be at 4096, say, it is now 16483.

    Commercials on TV suck, don't they. The audio is compressed to hell and back.
  • by Danathar ( 267989 ) on Thursday August 23, 2007 @10:21AM (#20330019) Journal
    Have you listened to a modern pressed record played on a modern (made this year) turntable?

    I have a set of FLAC music files of the latest White Stripes Album. The hiss is almost inaudible, there are no clicks, pops or any of the other crap you would hear on a mid 70's turntable.

    Yes, the frequency range is nothing like a CD, but the dynamic range is SO much better. Plus the CD version of the same album is SO loud it actually clips (click sounds on loud points of the album).

    It's a sad state of affairs when the Vinyl version of a record sounds better than the CD.
  • Re:The alternative? (Score:2, Informative)

    by bobschneider8 ( 878023 ) on Thursday August 23, 2007 @10:32AM (#20330175)
    Actually, the risk of hearing loss is proportional to both volume level and the time you're exposed. Louder but very short peaks but a lower average level (ie, like natural sound) is usually less risky than a higher average level but lower peaks.
  • by dkf ( 304284 ) <donal.k.fellows@manchester.ac.uk> on Thursday August 23, 2007 @10:42AM (#20330331) Homepage

    Think about it, a CD is 16 bit, so the max volume is obviously 2^16=65536 for any particular data sample.
    Actually that's the peak-to-peak height, but since you're actually storing a waveform you have to halve that value (to 2^15=32768 or thereabouts) to get the real maximum (digital) amplitude. Your other points are correct though.
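The peak-versus-RMS point in this sub-thread can be sketched in a few lines. The example assumes signed 16-bit samples limited to ±32767, and uses a deliberately crude hard limiter as a stand-in for a real compressor; the function names and the gain of 4 are made up for illustration.

```python
import math

INT16_MAX = 32767  # largest positive signed 16-bit sample


def sine_int16(amplitude, n=1000, cycles=5):
    """A buffer of a sine wave scaled to `amplitude`, rounded to integers."""
    return [round(amplitude * math.sin(2 * math.pi * cycles * i / n))
            for i in range(n)]


def limit_int16(samples, gain):
    """Apply `gain`, then hard-limit to +/-INT16_MAX (crude on purpose)."""
    return [max(-INT16_MAX, min(INT16_MAX, round(s * gain))) for s in samples]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root-mean-square level, a rough proxy for perceived loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


original = sine_int16(INT16_MAX)
squashed = limit_int16(original, gain=4.0)

print("peak:", peak(original), "->", peak(squashed))
print("rms: ", round(rms(original)), "->", round(rms(squashed)))
```

Both buffers hit the same peak, because the format cannot go any higher, but the squashed one has a clearly higher RMS level. That is the sense in which a record or commercial can be "louder" without exceeding the peak limit.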
  • by goombah99 ( 560566 ) on Thursday August 23, 2007 @10:51AM (#20330493)
    Perfect timing on this article. I was just wondering to myself if MP3s are actually louder than the original music. Now I have to explain what "louder" means here, it's effectively dynamic range, but not quite. The layman's description of how MP3s work is that they look for soft frequencies that will be psychoacoustically masked by the loud parts of other frequencies, and then the information to encode those is removed. Thus in effect one is filtering out some of the spectrum selectively. But that means two things 1) loss of signal energy and 2) loss of some noise at the deleted spectrum. The loss of energy could be compensated for by raising the volume. And that combined with the lower noise, means higher dynamic range at the retained frequencies.

    From your ear's point of view, then, the follicles and cells that are tuned to the retained frequencies experience more acoustic energy at a given sound level.

    On top of that, I suspect there are other effects as well. I suspect that MP3s may compand and decompand the music. Any mismatch between the companding and decompanding codecs, or roundoff errors, might increase or decrease the dynamic range. Likewise the psychoacoustic model might tinker with this as well.

    The reason I think this is the case is that I always notice that when I play highly clipped music (e.g. Green Day) through my iPod that the cymbals and snare drums are actually slightly painful to the ears even when the overall volume is at low listening level.

  • by bromoseltzer ( 23292 ) on Thursday August 23, 2007 @11:14AM (#20330775) Homepage Journal
    In fact what they do is compress the dynamic range of the audio, so the "apparent loudness" is increased. The peaks (which is how the FCC defines volume) are the same, but the RMS volume (essentially the average sound level and what our ear perceives as volume) is increased. Think about it, a CD is 16 bit, so the max volume is obviously 2^16=65536 for any particular data sample. So, they can't make the volume 2^17. What they can do, however, is compress the dynamic range, so instead of the average volume level to be at 4096, say, it is now 16483.

    What do you think "volume" is? It's a perception of loudness, which is only roughly related to anything you can measure numerically. A 16-bit data sample on a CD tells you the sampled electrical voltage, which is definitely not loudness. The square of voltage is power, which is getting closer. So the 32,768 to 1 range of voltages (plus and minus) gives you a range of 1,073,741,824 to 1 in power, or as these things are normally measured, about 90 dB dynamic range. Perceived audio "loudness" is roughly logarithmic; that's why the dB number is useful.

    A 1 dB change (26% in power ratio) is barely detectable. The threshold of pain [wikibooks.org] is up to 120 dB higher than the minimum detectable sound level. So even an uncompressed CD does not have enough dynamic range to capture what you might hear at a rock concert. (Just before you go deaf.)
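The numbers in this comment can be checked with a few lines of arithmetic. This is only a worked version of the figures quoted above (a 32,768:1 voltage range, power as voltage squared, and the power ratio for 1 dB); the variable names are made up for the example.

```python
import math

full_scale = 2 ** 15            # 32,768 voltage steps on each side of zero
power_ratio = full_scale ** 2   # power goes as the square of voltage

dynamic_range_db = 10 * math.log10(power_ratio)  # same as 20*log10(full_scale)
one_db_power_ratio = 10 ** (1 / 10)              # power ratio for a 1 dB step

print(power_ratio)                   # 1073741824
print(round(dynamic_range_db, 1))    # 90.3
print(round(one_db_power_ratio, 3))  # 1.259
```

So the "about 90 dB" and "26% in power ratio" figures both hold up, and 90 dB is well short of the roughly 120 dB span the comment gives between the quietest detectable sound and the threshold of pain.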

  • Re:The alternative? (Score:3, Informative)

    by I Like Pudding ( 323363 ) on Thursday August 23, 2007 @11:17AM (#20330817)
    Compression is one of the most important parts of audio engineering. Doing it dynamically with a shitty low-power digital algorithm results in a MUCH larger drop in audio quality than having the guy in the studio whip out his n thousand dollar vintage valve (vacuum tube) unit. The mastering engineers are also ninjas at squashing the dynamic range as much as possible while doing the smallest amount of damage.
  • Re:Volume Leveling (Score:2, Informative)

    by iainl ( 136759 ) on Thursday August 23, 2007 @11:29AM (#20330989)
    iTunes will normalise volume levels for you, and Audacity will actually renormalise the raw file. You could try one of those.
  • The tech details (Score:2, Informative)

    by __aaittv7720 ( 85420 ) on Thursday August 23, 2007 @11:46AM (#20331175)
    Here's a very good paper on the subject from TC Electronic's tech library:

    http://www.tcelectronic.com/media/lund_2004_distortion_tmt20.pdf [tcelectronic.com]

    Although it's a couple of years old it's still very valid.

  • by Anonymous Coward on Thursday August 23, 2007 @12:57PM (#20332183)
    Well it would probably sound like crap, depending on the price. The best compressors are VERY expensive, and the cheapos make your audio sound like tin-can music coming through a telephone. (Well, maybe not that bad, but you get my point.)
  • by Anonymous Coward on Thursday August 23, 2007 @01:16PM (#20332443)
    Having been processed through a lossy codec, it is possible that clipped waveforms may become further clipped. As the waveform is reconstructed by the decoder, not all of the original frequency coefficients are present; some, which the encoder did not deem audible, will have been discarded; especially high frequencies above the 19 kHz range, which MP3 in particular cannot encode well.

    These are usually not audible, but when the decoder reconstructs the waveform, their removal will change the shape of the waveform; the formerly-clipped flat edges will have had the edges rounded off and may bulge slightly higher as they more closely resemble sinusoids.

    This can actually sound better than the original clipped signal (as clipping is highly audible in double-blind tests and strains the ear) - except that the new "bulge" may go over what was previously full-scale, and unfortunately many MP3 decoders, particularly embedded ones like the iPods, will simply clip it again if it does.

    For this reason, the LAME MP3 encoder actually applies a 1% volume reduction before compression in all the preset profiles. This is not within audible limits, and can never restore already-clipped waveforms, but helps to prevent any further clipping during decoding. Some other encoders do similar things.

    It is preferable if such signals are left unclipped and instead, the signal is passed through a limiter that helps to avoid the harsh clipping sound (yet again) and leaves the sound as intact as possible (sound below full-scale in regions that are not clipping will be unaffected by a properly implemented digital limiter). For example, an audio playback chain in foobar2000 will typically do this as the final step of DSP.

    This effect may be audible, and is often preferred to clipping. Additionally, thanks to the advent of ReplayGain: if a track has ReplayGain information (information on the perceived "loudness" of the track and/or album relative to a reference level; represented as how much the volume needs to be increased to reach the reference level; although with all modern recordings there is a considerable reduction, occasionally as much as -12dB), the highest peak level is recorded in the metadata, so the volume as a whole can be lowered in advance to try to preserve any high peaks.
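The "bulge" described above can be reproduced with a toy signal. The sketch below builds a ±1.0 square wave (standing in for a heavily clipped waveform) from only its first few harmonics, which is a rough stand-in for discarding high-frequency content and is not how an MP3 encoder actually works. The helper names and the choice of 10 harmonics are assumptions for illustration. It then lowers the gain using the measured peak, in the spirit of the stored-peak approach described for ReplayGain.

```python
import math


def band_limited_square(n_harmonics, n=2000):
    """Partial Fourier sum of a +/-1.0 square wave (odd harmonics only)."""
    samples = []
    for i in range(n):
        t = i / n
        s = sum(math.sin(2 * math.pi * k * t) / k
                for k in range(1, 2 * n_harmonics, 2))
        samples.append(4 / math.pi * s)
    return samples


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def pre_gain_for(peak_level, full_scale=1.0):
    """Gain (never above 1.0) that brings a known peak back under full scale."""
    return min(1.0, full_scale / peak_level)


wave = band_limited_square(n_harmonics=10)
overshoot = peak(wave)            # ends up above the original 1.0 ceiling
gain = pre_gain_for(overshoot)
safe = [s * gain for s in wave]   # scaled copy that fits under full scale

print(f"peak after band-limiting: {overshoot:.3f}")
print(f"gain needed to avoid re-clipping: {gain:.3f}")
```

Even though the ideal square wave never exceeds 1.0, the band-limited version overshoots it, so a decoder that clamps at full scale would clip again. Lowering the level beforehand avoids that at the cost of a slightly quieter track.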
  • by PopeRatzo ( 965947 ) * on Thursday August 23, 2007 @01:37PM (#20332737) Journal
    This might possibly explain why music consumers today are willing to accept a highly compressed (dynamic, not data compression) product. There are several reasons as far as I can tell:

    1. High Fidelity was never really that important to the enjoyment of music. When I listened to the Stooges on my parents' cheap Sylvania stereo, I wasn't really listening for the lovely interplay between the oboe and the English horn. I wanted volume and a big, big beat.

    2. Popular music is much more rhythm and less melody than in the heyday of Hi-Fi, the 50's and 60's. When you've got producers who actually desire the lo-fi sound of some neolithic synth and then proceed to dirty it up with distortion and bit crushing, it's clearly not about getting a warm, natural sound.

    3. Much music is listened to via headphones these days. If you're trying to get the purest recording and reproduction of acoustic instruments, a pair of earbuds isn't going to cut it. Not a whole lot of popular music today requires pure recording and reproduction of acoustic instruments anyway, so what's the difference?

    4. Many of the great recordings of popular music were given a sort of distinction and personality by the type of production "mistakes" that are the bane of the hi-fi enthusiast. An example from the 2nd Rock Era is the cut Gimme Shelter by the Rolling Stones. There's a part on there where a tambourine will come in on the intro and it sent the VU meter way into the red, causing an ugly distortion that a "hi-fi" producer would have immediately thrown out and re-recorded. But the groove was there, a brilliant producer left it in, and now, whenever I hear that song it's that distorted tambourine that gives me the little shiver. Now, it's something that's sought out by producers. On the New Magnetic Wonder record by Apples in Stereo, there are cuts where some backing vocals are clearly recorded using a blown-out microphone. It sounds great to me, especially when I'm pedaling to work with my mp3 player cranked through my earbuds (yes, I know I'm taking a chance, using earphones when I'm riding in traffic, but it's such a joy that I accept the risk).

    When I listen to Sir Georg Solti's recording of Parsifal, or Glenn Gould playing the Goldberg Variations, or Miles Davis In a Silent Way, I want a true and warm recording of the sound of the instruments. Air moving through a horn, or a string vibrating, or a piece of wood striking a skin. If Ceelo Green is a Soul Machine or The Books The Lemon of Pink is on my box, how would I know if the re-recording of the sample of Bernie Worrell's string-synth from P-Funk Connection is a "true and warm recording" or not? All I know is it makes the juice flow. That's good enough for me.
  • iZotope Ozone (Score:3, Informative)

    by jilles ( 20976 ) on Thursday August 23, 2007 @01:56PM (#20333027) Homepage
    If you like to fiddle a bit with sound compression and other tools that are used in professional audio mastering, iZotope Ozone (a commercial product unfortunately) is quite nice to play with. A few basic edits can give flat sounding tunes nice warmth and depth. It's basically the audio equivalent of Photoshop, and the techniques have very similar intuition.

    The problem is not so much the use of such filters but the fact that they are used to optimize recordings for the very mediocre equipment most people use. Subtle bass sounds are simply lost; as are quiet high pitched sounds, because cheap equipment doesn't do anything with this information anyway. To counter this, the trick is to boost the volume of such sounds (relative to the rest) and to shift the spectrum away from very high or very low sounds. Like manipulating photos generally leads to loss of detail and undesired artifacts, manipulating sound results in similar loss of detail and distortion of what remains. Commercial records are edited to the limit of crappy mp3 players and radio. It's the equivalent of boosting a photo's contrast so much that most detail is drowned out to make it look good on a good old matrix printer. The psychological effect is similar as well: we humans appreciate contrast in all sorts of ways and the matrix printer doesn't do grays very well anyway. Unfortunately if you have a high end inkjet printer, such photos don't look much better than on the matrix printer because there is no extra detail anymore.

    When used properly however, manipulating sound can improve quality significantly. Many expensive high-end amplifiers basically contain lots of DSPs to 'improve' the sound and do some restoration work on the distorted signal on the CD (e.g. by interpolating and reinserting detail that was lost in the mastering process). Old fashioned valve based amplifiers are all about sound distortion (in a pleasing way). This is no different than what happens in the studios except that the result would be much better if the studios didn't throw out so much detail. This point can be demonstrated easily by playing back some sixties/seventies recordings which have much less aggressive audio manipulation.
  • Re:The alternative? (Score:2, Informative)

    by digitalaudiorock ( 1130835 ) on Thursday August 23, 2007 @01:57PM (#20333039)

    Which knob do you adjust to increase the dynamic range and re-add the lost information?

    Oh that's right, you can't. You're right, it's not a tough choice is it?
    Absolutely...once you've crushed that peak to average level there's no getting it back.

    I have my own Pro Tools based home recording studio. I get to experiment first hand with this sort of heavy limiting. Using a good limiter plugin (in my case a Waves L2) it's easy to make anything sound many times as loud as the original recording without introducing artifacts, but in addition to permanently losing the dynamics, it becomes almost fatiguing to even listen to...and that's nothing compared to what mastering engineers are doing (against their own wishes by the way) at the request of their customers (the record companies). It really is criminal. The fact is that this sort of stupidity was impossible in the days of vinyl...the needle would have jumped out of the groove if anyone attempted it.
  • by Anonymous Coward on Thursday August 23, 2007 @02:17PM (#20333403)
    You run out of lens before you run out of film.

    A good number to use for optimistic estimates of 35mm film/digital equivalence is 100 line pairs/mm, or 5080dpi. That's around 35M pixels.
    You need some very very good glass to get that full-frame, and you need a tripod to get that even in the center. Going over 100 lp/mm is both expensive and rather technical.

    Actual obtainable results given typical usage are going to be more like 50 lp/mm, which comes out to around 9M pixels.

    The current crop of 15-20 megapixel DSLRs are about as good as the format is going to get, in terms of usable resolution.
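The figures in this comment follow from simple arithmetic, sketched below. It assumes two pixels per line pair and a 36 mm x 24 mm frame; neither assumption is stated in the comment, but they reproduce its numbers. The function names are invented for the example.

```python
MM_PER_INCH = 25.4
FRAME_MM = (36, 24)  # assumed full 35mm frame, width x height in mm


def pixels_per_inch(lp_per_mm):
    """Convert line pairs per mm to pixels per inch (two pixels per pair)."""
    return lp_per_mm * 2 * MM_PER_INCH


def megapixels(lp_per_mm, frame_mm=FRAME_MM):
    """Pixel count across the whole frame, in millions."""
    width_px, height_px = (side * lp_per_mm * 2 for side in frame_mm)
    return width_px * height_px / 1e6


print(round(pixels_per_inch(100)))   # 5080
print(round(megapixels(100), 2))     # 34.56
print(round(megapixels(50), 2))      # 8.64
```

That gives 5080 dpi and about 35 megapixels at 100 lp/mm, and roughly 9 megapixels at 50 lp/mm, matching the estimates above.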
  • Re:The alternative? (Score:2, Informative)

    by Anonymous Coward on Thursday August 23, 2007 @03:05PM (#20334119)
    I'm posting as AC because I already moderated here.

    I spent a year working for an absolute wizard at audio stuff; he worked at Bell Labs for 26 years and helped invent MP3. So I am not guessing at anything I say here.

    Sound-level compression is not that hard to do in real time. There are several ways to do it. The best way is to do a pure digital EQ using a computer model of how the human ear perceives loudness, and that feature is shipping today as part of Windows Vista (look for "loudness equalization" or something like that, I don't know what it is because I don't run Vista at all). Doing loudness EQ this way is roughly as computationally expensive as decompressing MP3, i.e. not too expensive by modern standards.

    Most sound-level compressors strictly use the power of the music to approximate the loudness of the music. This works perfectly when the music is sine tones, but doesn't work so well for real signals. Some parts of the music that hit your ear on a bunch of different frequencies will sound louder than their power would suggest; and these will be over-boosted by the sound-level compressor. (Most radio stations use a compressor on everything they broadcast, and you can hear "spitting" sounds when people say words with sibilants. Listen to a DJ saying "summer sales" and you will often hear spitting or hissing noises on the "s" sounds.) Some power-based compressors sound better than others (some audio engineers swear by really old-school equipment) but the digital loudness equalization really sounds the best.

    I hope your idea comes to pass, and music gets encoded with a full dynamic range, and just has sound-level compression cues encoded as well.

    But I also put hope in the Internet itself. With actual, physical media like CDs it would be too hard to sell multiple different versions, but with audio files sitting on a server for download, it would be very easy to sell the mass-market version and the "audiophile" version that has full dynamic range.
  • by riker1384 ( 735780 ) on Thursday August 23, 2007 @03:18PM (#20334291)

    3. Much music is listened to via headphones these days. If you're trying to get the purest recording and reproduction of acoustic instruments, a pair of earbuds isn't going to cut it. Not a whole lot of popular music today requires pure recording and reproduction of acoustic instruments anyway, so what's the difference?
    I disagree on this point. With headphones you can get good sound for much less money than speakers. Not with cheap earbuds, but a good pair of open-backed headphones can give you the same clarity as speakers costing up to 10 times as much. High-quality in-ear monitors are also becoming more popular for use with mp3 players.
