Why Distributing Music As 24-bit/192kHz Downloads Is Pointless

An anonymous reader writes "A recent post at Xiph.org provides a long and incredibly detailed explanation of why 24-bit/192kHz music downloads — touted as being of 'uncompromised studio quality' — don't make any sense. The post walks us through some of the basics of ear anatomy, sampling rates, and listening tests, finally concluding that lossless formats and a decent pair of headphones will do a lot more for your audio enjoyment than 24/192 recordings. 'Why push back against 24/192? Because it's a solution to a problem that doesn't exist, a business model based on willful ignorance and scamming people. The more that pseudoscience goes unchecked in the world at large, the harder it is for truth to overcome truthiness... even if this is a small and relatively insignificant example.'"

  • by Aboroth ( 1841308 ) on Monday March 05, 2012 @11:22PM (#39257027)
    You are missing the point of the article. 192KHz is not 192kbps.
  • Re:Pro recording (Score:2, Informative)

    by Anonymous Coward on Monday March 05, 2012 @11:28PM (#39257075)

    I recently remixed a classic recording for Sony Records. The files were rolled off of tape at 24bit/96k. 48k I can understand, but 96k is pointless. WAAAAAAY beyond the range of human hearing. In the old days, things like cymbals and brass could really stick out because the encoders and decoders were just not where they are today.

    Anyone that tells you they can hear the difference between 48k and 96k is dreaming. It's the quality of the recording that counts more than anything these days.

  • Re:44KHz (Score:4, Informative)

    by belg4mit ( 152620 ) on Monday March 05, 2012 @11:32PM (#39257113) Homepage
  • by Sparohok ( 318277 ) on Monday March 05, 2012 @11:43PM (#39257197)

    A group of sixty audio professionals and audiophiles did a series of controlled double blind trials published in the Journal of the Audio Engineering Society. They found no perceptible degradation caused by a 16-bit/44.1kHz A/D/A.

    http://www.aes.org/e-lib/browse.cfm?elib=14195

  • Re:44KHz (Score:4, Informative)

    by GumphMaster ( 772693 ) on Monday March 05, 2012 @11:44PM (#39257203)

    The Nyquist-Shannon Sampling Theorem [wikipedia.org] basically shows that if an analogue signal contains no frequency higher than B Hz, then sampling at any rate greater than 2B Hz is adequate to reproduce the signal without aliasing. In the case of audio recording intended for the human ear, the highest audible frequency is about 20kHz, so the minimum sampling rate to cover that is 40kHz. This is (partly) where the 44,100 Hz sampling rate of CD audio comes from. In practice, sampling is usually performed faster than the theorem requires (though not four times faster). The theorem is not sufficient in itself to guarantee perfect reproduction, which is limited by the ability of real systems to match the mathematical ideals during sampling and reproduction. Reproduction is, however, typically very close.

    The 192kHz sampling that is the subject of this thread is capable of capturing frequencies well beyond the capability of a human ear to hear, or any typical speaker system to reproduce.
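
    As a rough illustration of the numbers above, here is a minimal sketch that computes the Nyquist frequency for a few common sample rates and how much room each leaves above the roughly 20 kHz hearing limit. The function names are made up for this example.

```python
# Highest frequency a given sample rate can represent (the Nyquist frequency),
# compared against the ~20 kHz hearing limit quoted in the comment above.
AUDIBLE_LIMIT_HZ = 20_000


def nyquist_hz(sample_rate_hz):
    """Half the sample rate: content must stay below this to avoid aliasing."""
    return sample_rate_hz / 2


def margin_above_hearing_hz(sample_rate_hz):
    """Band between the hearing limit and Nyquist (room left for the filter)."""
    return nyquist_hz(sample_rate_hz) - AUDIBLE_LIMIT_HZ


for rate in (44_100, 48_000, 96_000, 192_000):
    print(rate, nyquist_hz(rate), margin_above_hearing_hz(rate))
```

    At 44.1 kHz the margin is only about 2 kHz; at 192 kHz it is 76 kHz, all of it above what the ear can register.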

  • by xiphmont ( 80732 ) * on Monday March 05, 2012 @11:53PM (#39257261) Homepage

    Truthiness refers to a specific kind of lie-- a lie that sounds true, and that a large segment of people really want to be true. The kind of thing that's close enough to true for AM radio talk show hosts.

    And now... I'll get off your damned lawn. Don't forget to take your teeth out before falling asleep.

  • Re:Pro recording (Score:5, Informative)

    by king neckbeard ( 1801738 ) on Monday March 05, 2012 @11:54PM (#39257265)
    That doesn't make sense. 48k and 96k are sampling rates, so the problem wouldn't be in encoding and decoding. If there were a quality problem, it would lie in the analog-to-digital converters used by those transferring to digital formats and in the digital-to-analog converters a sound system has. You seem to be conflating sampling rate and bitrate. There have been dramatic improvements for the same bitrates in the last 20 years.
  • Re:Pro recording (Score:5, Informative)

    by Bassman59 ( 519820 ) <andy&latke,net> on Tuesday March 06, 2012 @12:26AM (#39257477) Homepage

    I recently remixed a classic recording for Sony Records. The files were rolled off of tape at 24bit/96k. 48k I can understand, but 96k is pointless. WAAAAAAY beyond the range of human hearing. In the old days, things like cymbals and brass could really stick out because the encoders and decoders were just not where they are today.

    Anyone that tells you they can hear the difference between 48k and 96k is dreaming. It's the quality of the recording that counts more than anything these days.

    The difference is that the antialiasing filters are much simpler and have a gentler roll-off when sampling at 96kHz. The high-order filters necessary to ensure adequate attenuation at Nyquist and above when sampling at the lower rates have this tendency to ring.

  • by DeathFromSomewhere ( 940915 ) on Tuesday March 06, 2012 @12:26AM (#39257481)
    I don't care how highly you think of yourself, until you show me some data you are a worthless troll.
  • by Anonymous Coward on Tuesday March 06, 2012 @01:13AM (#39257837)

    The correct word is "verisimilitude."

    English is already diverse enough that we don't need to invent stupid synonyms for useful words that already exist.

  • Re:Pro recording (Score:3, Informative)

    by Graff ( 532189 ) on Tuesday March 06, 2012 @01:30AM (#39257947)

    Oversampling (i.e. 192kHz) allows much more room to develop a good anti-aliasing filter.

    *whoosh*

    As the whole point of the article goes right over your head! You do not need any anti-aliasing. If you sample at 40 kHz with decent equipment and a good 20 kHz low-pass filter then you can completely and faithfully recover a signal of less than 20 kHz by applying the Whittaker-Shannon interpolation formula.

    Now we generally sample at 44.1 kHz in order to have some oversampling to take care of non-ideal filters and such. This is 10% oversampling and it's far more than you need with modern equipment and algorithms. By doing all this properly you will get the exact waveform back. There will be no aliasing to anti-alias.
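
    For anyone who wants to see the interpolation formula in action, here is a minimal sketch. It assumes a 1 kHz test tone and a finite record of 4,000 samples (both arbitrary choices for this example), so the sum is truncated and the result is only approximately exact.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, sample_rate_hz, t):
    """Whittaker-Shannon sum: x(t) = sum over n of x[n] * sinc(t * fs - n)."""
    return sum(x_n * sinc(t * sample_rate_hz - n) for n, x_n in enumerate(samples))


FS = 44_100   # sample rate, Hz
TONE = 1_000  # test tone, Hz (arbitrary choice for the example)
N = 4_000     # finite number of samples, so the infinite sum is truncated

samples = [math.sin(2 * math.pi * TONE * n / FS) for n in range(N)]

# Evaluate halfway between two samples near the middle of the record,
# where truncating the sum matters least.
t_mid = (N // 2 + 0.5) / FS
estimate = reconstruct(samples, FS, t_mid)
exact = math.sin(2 * math.pi * TONE * t_mid)
print(abs(estimate - exact))  # small residual caused by the truncated sum
```

    The value between the samples is not a straight-line guess: it comes out of the band-limited sum, and the residual shrinks as the record gets longer.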

  • Re:44KHz (Score:5, Informative)

    by tftp ( 111690 ) on Tuesday March 06, 2012 @01:47AM (#39258077) Homepage

    There may be no theoretical benefit, but since there's no such thing as an ideal sampler or filter or quantiser, it has many practical benefits.

    Here is a quick example. You sample at 44 kHz. The first Nyquist zone is from 0 to 22 kHz, the second one is from 22 to 44 kHz (with flipped spectrum.)

    Now, say that some [mechanical] harmonic from some instrument has frequency of 33 kHz. We don't hear those with our ears (parts of the ear are too massive to vibrate fast enough) so no harm done. The orchestra is playing as usual.

    But now record this orchestra with an imperfect antialiasing filter (there are reasons why a perfect one wouldn't do you much good anyway.) The 33 kHz harmonic falls into the 2nd Nyquist zone. It will be played back as if it were 11 kHz (it folds about Fs/2: 44 kHz - 33 kHz = 11 kHz.) Can you hear 11 kHz? Most people hear it just fine. Think about it for a moment. There was no 11 kHz signal in the original spectrum; there was 33 kHz, an inaudible one. The artifact showed up because a [lossy] mathematical operation was performed on the data that describes the signal. The resulting distortion produced an audible tone where none was present originally.

    However if you encode at, say, 128 kHz sampling rate, things change. First, the antialiasing filter - even if it is of the same architecture - will have its cutoff way below the Fs/2. This means that signals of the second Nyquist zone will be attenuated by many tens of dB - essentially they can be completely eliminated because nobody cares what you do to ripple and phase above 30 or 40 kHz. Second, for the alias to show up it has to be in LF radio band now, starting at 128 kHz. Microphones aren't even mechanically capable of picking up those frequencies. And finally, if that 33 kHz harmonic passes through the filter (with the same mediocre attenuation as in the first example) ... it will be played back as 33 kHz, and it won't go anywhere. The amplifier will filter it, and the speakers will attenuate it greatly. In other words, a serious distortion that was present when you are sampling at 44 kHz disappears when you are sampling at a much higher rate.
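
    The folding arithmetic in this example can be sketched in a few lines. The helper name is made up, and it assumes the tone reaches the sampler completely unfiltered, which is the worst case.

```python
def alias_hz(freq_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after sampling."""
    folded = freq_hz % sample_rate_hz            # the spectrum repeats every fs
    return min(folded, sample_rate_hz - folded)  # and mirrors about fs / 2


print(alias_hz(33_000, 44_000))   # 11000: folded down into the audible band
print(alias_hz(33_000, 128_000))  # 33000: stays ultrasonic, nothing to hear
```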

  • by DMUTPeregrine ( 612791 ) on Tuesday March 06, 2012 @02:18AM (#39258243) Journal
    My last hearing test showed that I can hear up to 21 kHz. I play Tin Whistle, Great Highland Bagpipe, Ceilidh Pipe, and Guitar. I have heard the rattle of a live sax. I have heard a delicate triangle ringing out over a live orchestra. I have heard live trumpet. I've spent quite a bit of time training my ears to hear those sounds.

    I have consistently failed to find a difference between the following in ABX tests I have run:
    192/24 and 44/16 .wav
    96/24 and 44/16 .wav
    44/16 .wav and FLAC, encoded with the FLAC reference encoder
    My reference tracks have been Pink Floyd's "Time", Sirenia's "Meridian", Bach's "Herz und Mund und Tat und Leben" part 7 conducted by Nikolaus Harnoncourt.
    The reference system was a PC with an Asus Xonar Essence sound card, a Rogue audio Perseus pre-amp, a pair of Rogue M-180 monoblock power amps, and Vandersteen Signature 2ce speakers. (My father's sound system and my PC).

    Of course, msobkow will claim that since I like Highland Bagpipes my hearing is inferior, and I can't hear the differences because he's better than me.

    That said, I do like having music in 192/24. Why? Because I can play with it. I can edit it, there's more headroom. If I feel that "Another Brick in the Wall" just needs a tin whistle part, well, I'll have an easier time editing it in without distortion. But for listening? Nope.
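
    To put a number on that extra headroom, here is a minimal sketch using the usual approximation of about 6 dB of dynamic range per bit (the ratio of full scale to one quantization step). It ignores dither and noise shaping, so treat the figures as ballpark.

```python
import math


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)


extra = dynamic_range_db(24) - dynamic_range_db(16)
# Roughly 96.3 dB, 144.5 dB, and 48.2 dB of extra room for editing.
print(round(dynamic_range_db(16), 1), round(dynamic_range_db(24), 1), round(extra, 1))
```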
  • Re:Pro recording (Score:4, Informative)

    by GrahamCox ( 741991 ) on Tuesday March 06, 2012 @03:17AM (#39258569) Homepage
    You're right but only in theory. You must have a low-pass filter to prevent aliasing - ANY signal beyond that will be aliased (and sound appalling). Thus the filter needs to have a brick-wall characteristic which is impossible. So by sampling at a much higher rate, the filter can be a lot more practical. The 10% "extra" you get with 44.1kHz sampling is insufficient space to implement a decent filter - that sampling rate is something of a historical accident anyway.
  • by tkrotchko ( 124118 ) on Tuesday March 06, 2012 @04:19AM (#39258913) Homepage

    Many people think a "factoid" is a small fact. Actually a factoid is something that sounds true, but is actually false.

  • by BlackPignouf ( 1017012 ) on Tuesday March 06, 2012 @05:47AM (#39259221)

    I insist on getting 24bit raw images

    Maybe you should insist more, because there's no such thing as a 24bit/channel camera.
    AFAIK, the highest bit depth you can get is 16bit/channel on high end medium-format sensors.

  • There was already a perfectly good word [wiktionary.org] for that.

  • No smooth (Score:5, Informative)

    by DrYak ( 748999 ) on Tuesday March 06, 2012 @07:00AM (#39259455) Homepage

    The higher the sampling rate, the smoother the signal.

    Well... no. There's enough information in a low sampled curve. As TFA explains, the output isn't "jagged" when played back in analog.

    Human perception wise an audio signal recorded at 96kHz sampling rate might well be indistinguishable from one sampled at 192kHz

    as explained in the article:
    - Yup, the human ear won't hear anything above 20kHz, because it doesn't have any receptors for that.
    But there are some real-world problems that come into the mix. No audio installation is perfect. You always get distortions.
    - Thus, a 192kHz sampled file could contain frequencies up to 96kHz. These are sounds which, in theory, can't be heard. In practice, if you throw 96kHz frequencies at a sub-optimal speaker, the speaker can barf out a lot of distortion, including distortion below 20kHz. So not only are you trying to output a sound that can't be heard, but you force the speaker to produce bad noise *which* is audible.
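
    A toy model makes the distortion point concrete. The sketch below assumes two ultrasonic tones (30 kHz and 33 kHz, picked arbitrarily) and a "speaker" with a small second-order nonlinearity, which is an illustrative assumption and not a model of any real driver. The nonlinearity creates a difference tone at 3 kHz, well inside the audible band.

```python
import cmath
import math

FS = 192_000             # sample rate, Hz
N = 1_920                # 10 ms of signal; every tone here fits a whole number of cycles
F1, F2 = 30_000, 33_000  # two ultrasonic tones (arbitrary choice for the example)


def tone_amplitude(signal, freq_hz):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)


clean = [0.5 * (math.sin(2 * math.pi * F1 * n / FS) + math.sin(2 * math.pi * F2 * n / FS))
         for n in range(N)]

# Toy nonlinear "speaker": the input plus a small squared term (assumption).
distorted = [x + 0.1 * x * x for x in clean]

print(tone_amplitude(clean, F2 - F1))      # essentially zero: no 3 kHz going in
print(tone_amplitude(distorted, F2 - F1))  # about 0.025: an audible 3 kHz tone coming out
```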

    But my thinking is that future technologies might let you do interesting things with the extra bit of data which is useless to us right now.

    Hard to do anything with those bits at all. We simply lack the anatomical features to do anything with them. Unless you do something like transpose everything to lower frequencies (slow down everything 2x = move everything 1 octave lower). At which point you aren't really outputting the original sound anymore. You're simply using the data to produce new sounds that weren't there to begin with.
    The only practical use-case for this would be zoologists studying animals whose sounds are beyond the human hearing range. In that case "moving everything a couple of octaves down" would give the scientist an approximation to work with (to find rhythms or other variations that are inaudible in the original frequency range). But that has nothing to do with hearing music made by humans, for humans, with instruments designed for human hearing ranges.

    Kind of like with digital pictures which are too noisy or blurred, but which might be cleaned up with future algorithms to give us a slightly more useful picture.

    The situation with pictures is slightly different. What you're speaking about is spacial frequency. I.e.: resolution.
    And human eyes can perceive far more than some blurry low-res pictures. In addition to that, there's this thing called zooming, which makes it perfectly sensible to record pictures at higher resolution. Because looking at details is simply looking at the same picture at another scale.

    The "visual equivalent" to 192kHz sounds would be recording colours outside the human range. Like recording also infra-reds, microwaves, ultraviolets, and X-Rays.
    Things that can never be seen, because humans lack the corresponding apparatus. The only way to get something out of this extra data would be to transpose it into the visible domain. Thus use pseudo-colours to display levels of low infrared (heat), etc.
    Just like the "zoologist" use-case above, there are a lot of scientific use-cases where that could actually make sense (as an example, think about all the data collected by astronomers).
    But in no way is it useful to record X-Rays to enjoy a painting by some known artist. The painting was done by a human painter, for a human public, using colours chosen for their effect on an un-aided human visual system, arranged on a canvas in a way which is pleasing to the eyes.
    (Well, okay. I know that some scientists use infra-red or X-ray images of paintings to analyse how they were done, what the layers underneath are, or if there's even another picture over which the current one was painted. But these are scientists analysing the paint, so we're again in the "scientific analysis" use-case).

    24/192 makes sense as an intermediate format to avoid rounding errors, aliasing during filtering, etc.
    There could be also some scientific value to keeping

  • Re:Pro recording (Score:4, Informative)

    by petermgreen ( 876956 ) <plugwash.p10link@net> on Tuesday March 06, 2012 @08:20AM (#39259721) Homepage

    Recording a signal with high fidelity is NOT a matter of just taking samples at defined intervals. If you do that you will get aliasing (higher frequencies getting converted to lower frequencies by the sampling process). So before you sample you need an "anti-aliasing filter" to remove signal components above the Nyquist point.

    However, filter design is a compromise: a filter with a steep response in the frequency domain will have a long impulse response in the time domain. A filter that doesn't cause phase distortion will cause pre-echo when fed with an impulse signal. Further, making high-order analog filters reliable and well behaved is difficult.

    Similarly, at output many digital-to-analog conversion methods will produce unwanted copies of the signal beyond the Nyquist point; again a filter (known as a reconstruction filter) is needed to remove these.

    96kHz gives you a much bigger "guard band" between the audio signal and the Nyquist frequency, so your anti-aliasing and reconstruction filters can be much less aggressive.

    Using oversampling (running your recording/playback devices at higher than the sample rate you are storing the music at) and doing most of the filtering digitally can remove the issues with high-order analog filters being unstable, but it can't change the fundamental issue that a filter with a sharp response in the frequency domain will have a long impulse response in the time domain, or that a filter with no phase distortion in the frequency domain will have pre-echo in the time domain.
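
    The steepness versus length trade-off can be roughed out with fred harris's rule of thumb for FIR filters: taps are approximately (fs / transition width) * (attenuation in dB / 22). The sketch below assumes a passband edge at 20 kHz, a stopband starting at Nyquist, and 100 dB of attenuation. Those targets are assumptions for the example, and the rule is only an estimate, not a design method.

```python
def estimated_taps(sample_rate_hz, passband_hz, stopband_hz, atten_db):
    """Rule-of-thumb FIR length: (fs / transition width) * (attenuation / 22)."""
    transition_hz = stopband_hz - passband_hz
    return (sample_rate_hz / transition_hz) * (atten_db / 22)


def impulse_length_ms(sample_rate_hz, taps):
    """Duration of the filter's impulse response in milliseconds."""
    return 1000 * taps / sample_rate_hz


for fs in (44_100, 96_000):
    taps = estimated_taps(fs, 20_000, fs / 2, 100)
    print(fs, round(taps), round(impulse_length_ms(fs, taps), 3))
```

    Under these assumptions the 44.1 kHz filter needs roughly 98 taps (about 2.2 ms of impulse response), while the 96 kHz filter needs roughly 16 taps (about 0.16 ms). The duration depends only on the width of the guard band, which is the point being made above.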

  • Re:No smooth (Score:2, Informative)

    by Anonymous Coward on Tuesday March 06, 2012 @09:05AM (#39259913)

    No, but it is *aliased*. The waveform between two samples is a simple interpolation. It is probably pretty close to the original sound, but there will always be some error too.

    You need to re-read Nyquist. The reason for the 2x minimum limit is to avoid aliasing.

    You don't need the "waveform between two samples" because you're reconstructing the sine wave at the highest frequency those samples represent. Any other waveform will contain harmonics above the limit, and should be filtered out before sampling.

  • Re:Pro recording (Score:5, Informative)

    by scary_jeff ( 538884 ) on Tuesday March 06, 2012 @09:07AM (#39259927)
    I also spent 4 years studying an EE degree, and although it was not especially focused on signal processing, I now work for a large pro audio company.

    Some of the issues pointed to in this and other posts regarding oversampling and AA filters are not really relevant to the subject at hand, given the technology currently in use. A statement like 'oversampling at 192 kHz' shows a lack of knowledge regarding the kinds of audio converters that have been in use for a good while now. A Delta Sigma ADC running with an Fs of 48 kHz might often be oversampling at 3.072 MHz or 6.144 MHz. Anti aliasing filters that many people have mentioned are implemented digitally inside the converter (no need for external analog filters, which may well exhibit many of the problems mentioned), and actually have extremely good pass band ripple.
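
    For scale, the oversampling figures quoted here work out as follows. This is a quick arithmetic sketch with made-up helper names, and it says nothing about the internals of any particular converter.

```python
def oversampling_ratio(modulator_rate_hz, output_rate_hz):
    """How many modulator samples are taken per output sample."""
    return modulator_rate_hz / output_rate_hz


def modulator_nyquist_hz(modulator_rate_hz):
    """Where aliasing would start at the modulator's own rate."""
    return modulator_rate_hz / 2


print(oversampling_ratio(3_072_000, 48_000))  # 64.0
print(oversampling_ratio(6_144_000, 48_000))  # 128.0
print(modulator_nyquist_hz(3_072_000))        # 1536000.0, far above the audio band
```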

    Look at datasheets for converters from manufacturers such as TI (Burr-Brown) [ti.com], Cirrus [cirrus.com] [page 36 here has detailed plots of 48, 96, and 192 kHz pass band characteristics for the device, highlighting the fact that increasing the sampling rate does not improve pass band ripple for this device (also note the scale is 0.02 dB/div)], AKM [asahi-kasei.co.jp], Wolfson micro [wolfsonmicro.com]. You will find pass band responses that are flat to within less than +/- 0.05 dB over the audible range, and stop band attenuation in excess of 100 dB, whether sampling at 48 kHz or 192 kHz. If you can find anything in actual converter datasheets that points to better converter performance from selecting a higher sampling rate, I would be interested to see it.

    All in all, the basics of sampling theory don't really help people to understand the real-world issues in designing a modern high-end audio device. And in the end, surely the proof of the pudding is in the blind tests, which never seem to show that anybody can tell any difference when moving to higher rates? Even if there were a few people who could hear this difference in some perfect listening environment, would it really make sense for everyone else to go out and buy 192 kHz equipment?
  • by mvdwege ( 243851 ) <mvdwege@mail.com> on Tuesday March 06, 2012 @09:44AM (#39260219) Homepage Journal

    BS. If the overtones of a flute high C and a piccolo high C are both under 22Khz, then sampling at twice that will catch all the overtones, and replaying the sample at the same rate will perfectly reproduce them.

    And if the overtones are over 22Khz, but their lower-order harmonics aren't, the sampling will pick up the harmonics and reproduce them perfectly, even without the existence of the original overtone.

    There is no subjectivity in that. An oscilloscope will show you that the overtones and/or their harmonics are all there.

    The only step that decides whether or not the overtones have any influence is the quality of the low-pass filter. At 44Khz that can be a bit iffy, so using 48Khz to get a little more headroom is nice, but in practice you won't be able to hear a difference with anything above that.

  • Re:No smooth (Score:4, Informative)

    by Dogtanian ( 588974 ) on Tuesday March 06, 2012 @10:10AM (#39260431) Homepage

    You may not be able to hear the higher frequencies, but when they're sampled with a too low sample rate, you'll be converting waveforms you can hear.

    Nyquist assumes that the signal to be sampled does not contain any frequencies higher than half the sampling rate. Any that exist thus *are* expected to be filtered out beforehand, otherwise aliasing will occur.

    Try it for yourself on paper.

    The "samples" do *not* represent the final "reconstructed" wave (are you suggesting the same "join the dots reconstruction" misconception that most people have about Nyquist?). My understanding of Nyquist (probably incomplete and far from perfect, but still miles better than most people's fundamental misunderstanding) is that this sample output has to be filtered so that all the harmonics above half the sampling rate are removed. Since Nyquist only says you get perfect reconstruction for frequencies up to that limit, there's no contradiction there.

    A "perfect" square wave (which can never actually be created in the real world) has harmonics of infinite frequency, and even a "real-world" as-near-square-as-makes-no-difference-wave will contain very high harmonics. If one was to do a Fourier transform on a square wave, filter out all the frequencies above the human range of hearing, then convert it back to the familiar (time domain) wave form, it wouldn't be square any more.

    Therefore, you can't sample a square wave using standard techniques anyway.
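
    Here is a minimal sketch of that thought experiment, using the Fourier series of a square wave instead of an actual transform. It assumes a 1 kHz square wave and keeps only the odd harmonics at or below 20 kHz. Both numbers are arbitrary choices for the example.

```python
import math


def bandlimited_square(t, fundamental_hz, cutoff_hz):
    """Sum of the odd harmonics of a square wave that lie at or below cutoff_hz."""
    total, k = 0.0, 1
    while k * fundamental_hz <= cutoff_hz:
        total += math.sin(2 * math.pi * k * fundamental_hz * t) / k
        k += 2
    return 4 / math.pi * total


F0 = 1_000       # 1 kHz square wave (arbitrary choice)
CUTOFF = 20_000  # hearing limit used throughout the thread

# Scan the first half cycle, where an ideal square wave would sit at exactly +1.
half_cycle = [bandlimited_square(i / (F0 * 2000), F0, CUTOFF) for i in range(1, 1000)]
print(max(half_cycle), min(half_cycle))
```

    The band-limited version overshoots above 1 near the edges and takes a finite time to rise, so it is visibly not square, even though nothing audible has been removed.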

  • by Anonymous Coward on Tuesday March 06, 2012 @11:07AM (#39261029)

    At that sample rate a 15kHz tone has only three samples. With only three samples there's no way to accurately draw the waveform. With three samples there's no way to discern between a sine wave, a square wave, or a sawtooth wave.

    I wish you guys would get this right. There is absolutely no way you can tell the difference between a 15kHz sine wave, square wave, or sawtooth wave (apart from amplitude, perhaps).

    Sawtooth waves have even and odd harmonics, and square waves only have odd ones. This means that the first harmonic above the fundamental of a 15kHz sawtooth wave would be at 30kHz (the 2nd), and the square's would be at 45kHz (the 3rd). As you pointed out, even if you could hear them, you'd have to have damn good speakers to reproduce them.

    Three samples is enough to reproduce the 15kHz fundamental per Nyquist.
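
    The harmonic bookkeeping can be checked in a few lines. The sketch assumes a 20 kHz hearing limit and only looks at which harmonic frequencies exist, not at their amplitudes.

```python
def audible_harmonics(fundamental_hz, harmonic_numbers, limit_hz=20_000):
    """Harmonic frequencies at or below limit_hz."""
    return [k * fundamental_hz for k in harmonic_numbers
            if k * fundamental_hz <= limit_hz]


F0 = 15_000
sine = audible_harmonics(F0, [1])
square = audible_harmonics(F0, range(1, 20, 2))  # odd harmonics only
sawtooth = audible_harmonics(F0, range(1, 20))   # all harmonics

print(sine, square, sawtooth)  # each list is [15000]
```

    Below the limit all three waveforms contain the same single component, so apart from amplitude there is nothing left to tell them apart.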
