Music Media

Sony Super CD: More Bits, More Bucks, Mo' Betta? 309

Reader dcigary pointed to this "nice writeup on the new Sony Super CD." Though the explanation of the difference between supposedly revolutionary "DSD" recording and conventional digital seems to get by with a knowing mumble, the piece does mention the price (high) and that competition from audio-only DVDs may cripple acceptance of the new format. Even if I like the idea of ultra-fidelity, my faith in the Nyquist theorem is too strong to spend a grand and a half on a CD player anytime soon ...
This discussion has been archived. No new comments can be posted.

  • by XNormal ( 8617 ) on Sunday October 15, 2000 @03:05AM (#705156) Homepage
    Virtually all audio A/D and D/A converters today use sigma-delta, also commonly referred to as "one-bit" conversion.

    In a sigma-delta A/D converter the audio signal is sampled with a high sampling frequency (typically a few MHz) and low sample resolution (1 bit). An error feedback mechanism is used to ensure that most of the energy of the quantization noise is "shaped" into high frequencies, giving excellent fidelity in the audio band. One bit is inherently linear - no need for carefully matched resistor networks such as those used on older A/D converters. This stream is then filtered and decimated using digital signal processing techniques to a lower sampling rate (e.g. 44100Hz) while gaining sample depth on the way (16 bits and higher).

    For D/A conversion the process is reversed: the 44100Hz signal is interpolated up to a high sampling rate and then the sample depth is reduced down to one bit. Again, error feedback is used to ensure that the quantization noise resulting from the low resolution is shaped to high frequencies. This bitstream is then low-pass filtered and used as the audio signal. Again, with much better linearity than D/A converters based on carefully adjusted resistor networks.
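The error-feedback loop described above can be sketched in a few lines. This is a minimal first-order illustration, not the modulator used in any real converter or in SACD; the function names, the 64x oversampling ratio, and the block-average "filter" are all assumptions made for the example.

```python
import math

def sigma_delta_encode(samples):
    """First-order sigma-delta modulator: floats in [-1, 1] -> list of +1.0/-1.0 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Error feedback: integrate the difference between input and last output bit.
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

def decimate(values, ratio):
    """Crude low-pass filter plus decimation: average each block of `ratio` values."""
    return [sum(values[i:i + ratio]) / ratio
            for i in range(0, len(values) - ratio + 1, ratio)]

OVERSAMPLE = 64   # assumed oversampling ratio for this sketch
N_OUT = 32        # number of output (decimated) samples

# One cycle of a half-amplitude sine, sampled at the oversampled rate.
signal = [0.5 * math.sin(2 * math.pi * n / (OVERSAMPLE * N_OUT))
          for n in range(OVERSAMPLE * N_OUT)]

bits = sigma_delta_encode(signal)
recovered = decimate(bits, OVERSAMPLE)       # multi-level samples from the 1-bit stream
reference = decimate(signal, OVERSAMPLE)     # same averaging applied to the input
max_error = max(abs(a - b) for a, b in zip(recovered, reference))
```

Each output sample is built from 64 one-bit values, which is the "gaining sample depth on the way" step; real converters use higher-order loops and much better filters than a block average.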

    The Sony SACD skips the decimation and interpolation stages. It stores the noise-shaped bitstream directly on the disc. The beauty of this idea is in its simplicity: it performs far fewer transformations on the sigma-delta signal and therefore should offer inherently higher fidelity and wider bandwidth.

    If sigma-delta converters had been available 20 years ago when the CD was invented, its designers would probably have chosen this method for its simplicity. But at that time the known analog conversion technique was resistor networks, so PCM was used.

    Remember that at the time the CD was really stretching the limits of consumer technology. No other consumer product prior to the CD player used so many new and advanced technologies: lasers, error correction, digital signal processing. If they could have used this technique it would have reduced the cost of CD players significantly. For example, this bitstream is much more tolerant to bit errors because unlike PCM there is no "most significant bit" that can cause a large error if corrupted.

    Using this technique today, though, is insane. There is no real savings in simplicity when a million digital transistors cost close to nothing. If you want higher fidelity, 96kHz and 24 bits is more than enough.

    Let's say you want something simple like a graphic equalizer on your SACD player. If it's analog, such a complex circuit will introduce lots of noise. If you implement it digitally, it would take insanely large amounts of CPU power to process a signal sampled at over 2 MHz. Manufacturers will probably end up downconverting it to PCM at 96kHz or lower, doing the signal processing and then converting back to sigma-delta for playback. This will lose all of DSD's alleged advantages.

    BTW, for the purpose of preserving analog masters, DSD is really a good idea because they contain useful information at very high frequencies such as the tape bias signal and the intermodulations it creates. Preserving this information will allow future signal processing techniques to create accurate models of the nonlinearities of the magnetic medium and use this high frequency information to reconstruct the original recording with better fidelity down in the audio band. For home use SACD is a very bad idea. Just about the only good thing I can see about it is that it can be marketed effectively because it's such a "radical new concept".

    DVD-Audio uses conventional, well-proven PCM with somewhat higher sampling frequency and bit depth than CD. Why use a higher sampling frequency when we can't hear over 20kHz? It turns out that while we can't hear a sine wave at frequencies higher than 20kHz, the high-frequency components of complex waveforms make a noticeable difference even up to 26kHz. To take a good safety margin and maintain an integer ratio, a 96kHz sampling rate was used. This does not significantly hurt the data rate required because lossless compression is used on DVD-Audio. A compressed 96kHz signal takes about 30% more space than a compressed 48kHz signal. 16 bits is, again, almost enough. In fact, with proper in-band noise shaping the noise floor is inaudible in all but very extreme circumstances. 24 bits is therefore a very good safety margin.

    Another reason why DVD-Audio is superior is that it supports Ambisonics [ambisonic.net]. Ambisonics is a surround sound system. It was not created for cinematic effects. Ambisonics was designed for music and for reconstructing the subtle spatial cues of the ambience of the recording venue. With a proper arrangement of speakers it can create true 3D sound - including the height dimension. Imagine listening to a recording and feeling the height of the concert hall!

    Please never ask "how many channels does Ambisonics use" because it's not a relevant question. Ambisonics deconstructs the 3D sound field mathematically using a four-component representation (XYZW). This representation can be processed with a simple linear matrix for playback on different speaker configurations and numbers of channels, with varying levels of fidelity of 3D soundfield reconstruction. This includes the popular 5.1 setup used on home theaters (it's probably going to be the default setting for DVD-Audio players).
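To make the "simple linear matrix" idea concrete, here is a rough sketch of encoding a mono source into the four components and decoding it for a square of four speakers. The W scaling of 1/sqrt(2) is one traditional convention, and the cardioid-style decode weights are an assumption picked for illustration; real decoders are tuned per speaker layout, and all names here are hypothetical.

```python
import math

def encode_b_format(sample, azimuth_deg):
    """Encode a mono sample arriving from a horizontal angle into (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    w = sample / math.sqrt(2.0)   # omnidirectional component (assumed scaling)
    x = sample * math.cos(az)     # front-back component
    y = sample * math.sin(az)     # left-right component
    z = 0.0                       # height component; source is in the horizontal plane
    return w, x, y, z

def decode_to_speakers(bformat, speaker_azimuths_deg):
    """Assumed cardioid-style decode: one linear combination of W, X, Y per speaker."""
    w, x, y, _z = bformat
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feeds.append(0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az)))
    return feeds

SQUARE = [45, 135, 225, 315]   # speaker angles in degrees, a hypothetical layout
feeds = decode_to_speakers(encode_b_format(1.0, 45), SQUARE)
```

The same four-component signal could be decoded for a different speaker list just by changing `SQUARE`, which is why the channel count is a property of the playback setup rather than of the recording.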

    DVD-Audio is also backward compatible with DVD players although a DVD-audio player will be required to take advantage of all the features and full quality.

    More information about DVD-Audio here [ambisonic.net]

    ----
  • by Andy Dodd ( 701 ) <atd7NO@SPAMcornell.edu> on Saturday October 14, 2000 @10:20PM (#705158) Homepage
    A few articles ago, someone suggested that Malda & company implement a "Just plain wrong" moderation option.

    It's mathematically provable (and the end result of said proof is the Nyquist theorem) that all you need to do is sample a signal at more than 2x its maximum frequency, and you can recover that signal EXACTLY.

    Sample the signal, later pass that sampled signal through a low-pass filter. Afterwards, the only difference between the final signal and the original is a constant multiple. (For example, if you sample with infinite impulses, which have infinite height and zero width but an area of one, the end signal is 1/T (where T is the period) times the original.) The aforementioned technique is "ideal sampling" - which never occurs in the real world. With practical sampling, you get a different multiple than 1/T. But the end result is the same.
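The reconstruction claim can be checked numerically with sinc interpolation, which is the ideal low-pass filter written as a sum over samples. This is only a sketch: the 1 kHz tone, 8 kHz rate, and 1001-sample window are arbitrary assumptions, the sinc kernel used here already absorbs the constant multiple mentioned above, and a finite window leaves a small truncation error that an infinite sum would not have.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, sample_rate, t):
    """Whittaker-Shannon interpolation: estimate the signal at time t (seconds)."""
    return sum(s * sinc(t * sample_rate - n) for n, s in enumerate(samples))

SAMPLE_RATE = 8000.0   # assumed sampling rate
FREQ = 1000.0          # assumed test tone, well below SAMPLE_RATE / 2

samples = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE) for n in range(1001)]

# Pick an instant halfway between two samples, in the middle of the window.
t = 500.5 / SAMPLE_RATE
estimate = reconstruct(samples, SAMPLE_RATE, t)
exact = math.sin(2 * math.pi * FREQ * t)
error = abs(estimate - exact)
```

The value between the samples comes back to within the truncation error of the finite window, even though no sample was taken at that instant.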
  • It's one bit. On or off. Here's how it works: imagine a waveform- at any point the angle of the wave might be up / or down \ or just flat -

    The way the Sony thing works is like this: at each point the encoder asks 'am I above or below the actual wave form?' If it's above, it reduces the angle of its recording by X degrees. If it's below, it increases it by that much. It is constantly overshooting and crossing over the actual waveform- at about 2 megahertz, not some mere 96K. At no point does it record the signal voltage itself- it only follows the changes in angle, at a rate so high that it's way beyond anything that will be recorded, and it's kind of 'lossy' as it'll almost never be _exactly_ on the target waveform- the oscillation of it tracking the waveform will be higher than 96K anyhow and it will be a _sine_ oscillation, not the square-stepped, nasty distortions of raw PCM sampling, so it won't even need to be filtered.

    I'd love to see this actually catch on- as far as sound is concerned, nothing else should be needed. The interesting thing is this- could this format be _synthesised_ digitally? I'm picturing some future sort of audio workstation where you have all the modern gimmicks like pitch correction, EQ plugins etc, but you never resample anything- just overlay all the different sample rates you end up with, seamlessly :)

    It would be interesting to see the venerable Yamaha DX-7 redone with this technology for its audio outputs! :)

  • On a somewhat-related note, it is remarkably interesting what effect a more accurate clock signal has on the quality of a 44.1KHz recording

    Thank you, thank you. This phenomenon (sometimes known as "clock jitter") also explains, in large part, the age-old argument as to why digital-to-digital copies are not always perfectly identical, despite the notion that "it's only copying numbers/bits, it has to be perfect". Any digital recording references a time base, and any variation in that time base skews the way the audio sounds when you play it back. Commercial recording studios pay very large sums for centralized, highly accurate clock sources to which each piece of equipment that handles a digital audio stream is synced.
  • by Kiwi ( 5214 ) on Saturday October 14, 2000 @10:23PM (#705162) Homepage Journal
    SACDs do not utilize PCM audio at a higher encoding rate, the way the upcoming DVD/A standard will. Instead, they use a special encoding which, basically, allows you to have higher frequencies at lower resolution or lower frequencies at higher resolution, where the base stream is a low bit depth sampled in the megahertz range.

    Arny Kruger has a lot of misgivings about this method. First of all, it is a lot harder to code DSP for this than for PCM. Second of all, there are apparently problems with high-frequency artifacts that this encoding technique introduces.

    I also have a lot of misgivings about this method. Personally, I think the average listener thinks 16/44.1 is good enough, and has no need to listen to something at an even higher bit rate. The popularity of MP3s indicates that 16/44.1 is better sounding than what the average consumer needs.

    - Sam

  • by The Mayor ( 6048 ) on Saturday October 14, 2000 @10:28PM (#705163)
    Even if I like the idea of ultra-fidelity, my faith in the Nyquist theorem is too strong to spend a grand and a half on a CD player anytime soon ...

    You've got it all wrong. You see, humans have an approximate range of hearing between 20Hz and 20KHz (assuming no hearing loss). Now, Nyquist theory says that you must sample at twice the frequency of the highest frequency you wish to preserve. We use 44.1 KHz for this. Sounds good so far, right?

    Well, what many neglect to mention about Nyquist theory is that you must run the resulting output through a filter. The filter, according to the theory, is a brick-wall filter. Of course, these things don't exist. Filters have a roll off. As a result, people invented the concept of oversampling. This way, you move the sampling frequency way above 44.1 KHz, and you can put a filter in at, say, 100 KHz. Nice, right? Wrong. Filters have audible effects *well* below their -6dB level.

    That said, there's still another problem. People have an approximate dynamic range for their hearing of 120dB. Using 16 bit samples (like CDs), you end up with only 96dB of theoretical dynamic range. So people invented the concept of running a low level noise input when digitizing. This ends up pushing the dynamic range above 96dB. This is how most modern CD players can claim a dynamic range of about 102dB. Sounds good, right? Wrong. You're increasing the dynamic range, but you're also increasing the noise. This is not good.
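The 96dB figure follows from the ratio between the largest and smallest step a fixed-point sample can represent. A quick check, assuming the usual 20*log10 rule for amplitude ratios (the function name is just for illustration):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given sample depth."""
    return 20.0 * math.log10(2 ** bits)

cd_range = dynamic_range_db(16)    # about 96.3 dB
dvd_range = dynamic_range_db(24)   # about 144.5 dB
```

Each extra bit adds roughly 6dB, so 24-bit samples sit comfortably above the 120dB hearing range quoted above.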

    Of course, for 1980, CDs were pushing the limits of technology. Now they're not. Now we have DVDs. With DVDs and compression, you can get 5 channels of sound digitized with 24-bit sampling at a sample rate of 96kHz. Now that kicks ass. Of course, all the different parties messed around with the standards committees long enough to pretty much kill DVD-Audio (it was finally released a few months ago, but there is way too little material released under the format).

    So, if you want a cheap CD player that truly sounds good, I recommend you get a DVD player, and listen to music on DVDs or DVD-As. Even a cheap one can have a pretty poorly built filter and sound OK. Of course, cheaper D2A converters have their own problems, like jitter. But that's a story for another time.

    Look, if you don't believe me about this quality issue, go to your local high-end (and I don't mean that they carry Denon and Yamaha...I'm talking about equipment like Wadia and Krell), and ask them to do some listening tests with a Krell compared to your $189 Technics (or whatever). If you don't hear a difference, then you either have hearing loss or you aren't used to paying attention to sound quality. Like many elements of perception (sight, hearing, etc), the more you work the sense, the more acute it becomes.
  • ANY signal, especially a periodic signal, can be represented as a sum of sine waves. (This is the basis of the Fourier transform.)

    What is a 15 kHz square wave? It's a 15 kHz sine wave plus sine waves at multiples of 15 kHz. The human ear can't hear those multiples, hence can't tell the difference.

    What's my point? My point is that if you sample at 44.1 kHz, you catch ALL of the portions of your signal that are human-hearable. Yes, a human can tell the difference between a 500 Hz sine and a 500 Hz square - that's because there are many harmonics with the human hearing range. But I can assure you, if you play a 15 kHz sine wave and a 15 kHz square/triangle/sawtooth wave, you won't be able to tell the difference.
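The harmonic argument is easy to tabulate. The sketch below uses the standard Fourier series of an ideal square wave (odd harmonics only, amplitude 4/(pi*k)) and lists the components at or below an assumed 20 kHz hearing limit; the function name and the limit are illustrative choices.

```python
import math

def audible_square_harmonics(fundamental_hz, limit_hz=20000.0):
    """Return (frequency, amplitude) pairs of square-wave harmonics up to limit_hz."""
    harmonics = []
    k = 1
    while k * fundamental_hz <= limit_hz:
        # An ideal square wave contains only odd harmonics k = 1, 3, 5, ...
        harmonics.append((k * fundamental_hz, 4.0 / (math.pi * k)))
        k += 2
    return harmonics

low = audible_square_harmonics(500.0)      # 500 Hz, 1500 Hz, ... 19500 Hz
high = audible_square_harmonics(15000.0)   # only the 15 kHz fundamental survives
```

A 500 Hz square wave keeps 20 components in the audible band, while a 15 kHz square wave keeps just its fundamental, since the next harmonic is at 45 kHz.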
  • A lot of what you (nathanh) say is true. But it's clear you don't have the best ears on the planet--do more music listening, even on mediocre equipment, and you'll hear the things you claim can't be heard.

    Quit getting so excited about what you know. Just because you've read about signals somewhere doesn't mean you understand human auditory function. And remember that we don't really understand any of this stuff--we're just constructing models. If I can hear what you say I can't, then your _model_ is wrong, not my ears. After all, the models are trying to conform to reality, not vice-versa as you seem to argue.

    I am a mathematics grad student, familiar with some signal processing, and critically inclined. I also am passionate about music. I find your comment "Most audiophiles are full of shit" offensive--I know several audiophiles and none of them are full of shit. Some are musicians, some are electrical engineers. All of them have highly trained ears.

    I don't care that "most people's speakers have trouble doing better than -3dB at anything over 18kHz anyway"--mine are spec'd -2dB at 22kHz, and they only cost $450 for the pair (Sound Dynamics 300ti). I take exception to your comment "So it's hardly any loss at all to throw away the sine waves over 20kHz"--hardly any loss to whom? I would benefit from better-sounding recordings. I do care about the high-range.

    Finally, not everything can be represented by sinusoids, as you claim. Fourier claimed, IIRC since I'm not going to check, that all L2(0,1) functions could be represented by an infinite sum of sinusoids. I don't understand why all changes in air pressure must necessarily be in L2(0,1)--and unless a patient physicist tried explaining it to me, I probably wouldn't believe it anyway.

    -Paul Komarek
  • Believe what you want, but there was an old quad format that operated by adding amplitude modulated 50K tones. I'm told it worked. There's not much room for theory here: it would either work or not, you're saying it couldn't, and from what I hear it did work in the real world. Not that quad ever became mainstream :)

    Console yourself with the fact that after relatively few playings the AM quad information would be scraped off the LPs anyhow ;)

  • If you do the math, you'll see that the maximum error you obtain with 16-bit sampling is all you need given that line-out signals are 3 volts peak-peak. I don't remember the exact voltage, and I'm too tired to re-work it now. (3 volts divided by 2^16...) But the maximum error of a 16-bit sampled signal is FAR lower than the minimum electrical noise added by even the best super-high quality amplifiers with gold-plated speaker wire contacts and the like.
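The arithmetic left undone in the quoted comment is short. Assuming the 3 volt peak-to-peak line level it mentions and an ideal quantizer that rounds to the nearest step, the step size and worst-case error come out as follows (names are illustrative):

```python
FULL_SCALE_VOLTS = 3.0   # peak-to-peak line level quoted in the comment

def lsb_volts(bits, full_scale=FULL_SCALE_VOLTS):
    """Voltage represented by one quantization step at the given sample depth."""
    return full_scale / 2 ** bits

step_16 = lsb_volts(16)        # about 45.8 microvolts
max_error_16 = step_16 / 2     # rounding error is at most half a step
step_24 = lsb_volts(24)        # about 0.18 microvolts
```

So the worst-case error at 16 bits is on the order of 23 microvolts for this signal level.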

    Now while this is entirely true it gets more interesting than this in real-life. There is a need for better than 16-bit resolution. Here's why.

    You know that you reconstruct your signal from the samples by using zero-order-hold on each sample, then applying an ideal low-pass filter at half the sampling rate.

    Sadly you can't make ideal low-pass filters for any money. In fact it's hard to make even good low-pass filters. The solution is to interpolate the signal first then use a much cheaper and less accurate low-pass filter.

    Linear interpolation (first-order-hold) is common but produces nasty results. Bandlimited interpolation (sinc pulses) is a much better method but difficult to implement. In practise you'd use something halfway between the two.

    However all interpolation methods rely on the sample being as close to perfect as possible. If you're using first-order-hold then you will only get the error from the nearest two samples. But if you're using bandlimited interpolation then the sinc function means ALL THE SAMPLES are used to create your interpolated sample.

    In practise several dozen samples will be combined to create your interpolated sample. The errors from all these samples combine to create one huge error on your interpolated sample.

    24-bit samples greatly reduce interpolation errors. 16-bit sampling is only good enough if we have perfect DACs and ideal low-pass filters. 24-bit samples will allow the use of practical DACs and cost-effective low-pass filters.

  • by Chmarr ( 18662 ) on Saturday October 14, 2000 @07:02PM (#705182)
    It always seems the way with Sony. They come up with a technology, some good, some bad, but they keep it all to themselves and the technology goes nowhere but for very small niche environments.

    This one sounds like it'll be no more popular than minidisks.

  • First of all, things above 22kHz aren't picked up by ordinary mics... Even the ultra-high-end Neumann U87Ai only claims 20-20kHz frequency response
    Then again, the U87 is a large-diaphragm condenser, and is more about giving a "warm" instead of an "accurate" sound.

    B&K makes mics with a response up to 40 kHz [dpamicrophones.com] (I know they are called DPA mics today, but I still call them B&Ks). B&Ks (and Genelecs) are what people use when they want a really accurate sound, as opposed to a "warm", "larger than life" sound.

    That said, I agree with what you are saying. People cannot hear above 20 kHz, and people that think they can will need to run some double-blind scientific tests to back up their claims before I will seriously listen to them.

    - Sam

  • It seems most new technologies now have a catch: DVDs and the mess over DeCSS and the MPAA's desire to sue, MP3 encoding and the RIAA drooling to sue anyone who even looks at an MP3, intellectual property rights, DMCA, who owns what, etc.

    I'm obviously a huge fan of new technology, but is there a catch somewhere in Super CD? The article was big on technical, not on much else. Anyone have any insight to this?
  • by darkwiz ( 114416 ) on Saturday October 14, 2000 @07:08PM (#705188)
    Nyquist's theorem states that the highest frequency that can be represented is one half the sampling rate. This is obvious because you must be able to detect at least a peak and a valley of the sound wave.

    Nyquist's theorem does not imply, however, that the representation of the maximum [or near maximum] frequencies will be highly accurate as far as the shape of the wave form is concerned. At and around 1/2 sampling frequency, the wave forms become basically nothing but square waves [alternating between a single high, and a single low point]. In order to deal with this, some sound decoders will attempt to interpolate the waves, but they cannot reproduce the original sound accurately. This is why higher sampling frequencies ARE relevant to higher audio fidelity. Higher bit resolutions are arguable though...
  • Is it physically possible to have a 5K pulse wave that's very thin? Not 22K, _5K_. But so thin that the pulse is 1/4 of the duration of the wave.

    If you try to sample this at 44.1 and reconstruct it, you're hosed- you have substantial problems based on the fact that the pulse would be quite accurately tracked by a 20K rate but you're trying to force a 22.05K rate to track it. It doesn't help to throw away all the very obvious and unavoidable over-20K components (come on, _5K_ and you get only a couple of harmonics? That's nuts): you're still hosed by the interference pattern that is produced, because these are not simply a pile of sine-tones: they are sine-tones (lots of them) AT SPECIFIC TIMES. Just because 20K is lower than 22.05K doesn't mean 22.05K is going to do even a remotely acceptable job of sampling it.

    Remember that we are talking about a _5K_ wave here, not something absurdly high. Do you not think you can hear the difference between a really thin pulse wave and a sine at _5K_? Yet the pulse only needs to be about 1/4 width to cause really severe problems with sampling- specifically, a rapidly cycling distortion component that will go from 0% distortion to severely, severely distorted at a rate fast enough to make up an entirely new, harmonically unrelated tone! This is only lessened by throwing away frequencies over 20K, not entirely fixed. You can lessen it a lot more effectively by throwing away all frequencies over _10K_ but come on- whoever heard of a pulse wave with only two harmonics? Our 5K pulse wave would be unrecognizable- this has to be considered distortion.

    I'm afraid Nyquist is mostly crap- it would be fine if all musical frequencies were exact subdivisions of 44.1K, but when you start dealing with real-world frequencies, you start having these intermodulation problems, and the problem is not that you're not having the supersonic frequencies, the problem is that the resulting intermodulation effects are cyclical degradations of continuous tones and produce _inharmonic_ tones. It's very hard to hear slight differences in harmonic tones but it's not hard to hear problems when there are inharmonic tones being synthesized.

    Extra credit- what inharmonic tone is generated by the distortion on a 5K 1/4 width pulse wave sampled at 44.1K? Use as much treble-rolloff as you want, it'll always be the same intermodulation frequency. Anybody good enough at math to say what frequency is generated as intermodulation distortion? :)

  • Sony has had SACD for quite a while now. I remember picking up some audiophile magazines over a year ago and reading about it. They've released a lot of Sony Classical stuff on it.
    However, a lot of the systems to play SACD cost over $1K for the cd player. Not to mention you're not going to get the nice dolby surround of DVD at the same price....
  • No, what the MP3 revolution shows is that almost all music fans value the content more than the technical reproduction.

    I think you're mistaken. Content doesn't play a role here. Essentially everything available in MP3 format is also available in an uncompressed digital format. But the vast majority of MP3 listeners can't hear or don't care about the extremely large quality loss that is inherent in the MP3 compression algorithm.

    This is the same group of people that use the term "high end car audio" without a hint of irony.
  • by hatless ( 8275 ) on Sunday October 15, 2000 @04:43AM (#705196)
    So here's a digital format that should please nearly all the classical music aficionados out there who spend tens of thousands of dollars constructing acoustically-perfect "listening rooms". Nothing bad about that. At the very least, it finally creates a reasonably lossless way to digitize analog material for archival and preservation purposes--although any archivist will tell you that the real archives themselves for long-term preservation should be old-fashioned stamped analog discs.

    These two markets--archivists and money-is-no-object audiophiles--should be covered with about 20,000 of these devices. So what about the rest of us? I have serious doubts that the difference between this and DVD-Audio can be heard on even a $3,000 home theater system.

    Sony (and presumably Philips/Magnavox) intend to build support for this into all of their players starting sometime soon, maybe a year from now. The thing is, nearly all the DVD players being sold today can play the competing DVD-Audio discs. None, not even Sony's, and not any of those millions of Playstation2s shipping in the next year, can play SACDs.

    Ultimately, this is about patent royalties. Sony and Philips have been collecting royalties on every CD player and CD drive sold for over a decade now, and SACD is about trying to do it again for another decade. DVD-A is the format endorsed by everyone in the industry except Sony and Philips. Is it a good professional archival format? Nah. Is it both better and more flexible than CD? Yep.

    So here's the ugly truth. The MP3 revolution seems to have proven that most people have tin ears. Ask a hundred people. 98 of them will tell you that 128Kbps MP3 is "CD quality". Fact is, it's inferior to Minidisc, to FM radio and--in many respects--analog cassettes. But it doesn't have hisses and pops, and that's all most folks really notice. Heck, 320Kbps MP3 sounds crappy next to a CD, even on a $400 stereo.

    If people think MP3 is "good enough"--when it can't even hold a candle to CD--why is the mass market going to embrace SACD over DVD-A? Especially when they'll have DVD-A players available from dozens of manufacturers and SACD players most likely available from... three?

    CD will be superseded, not because most people want higher-resolution sound quality they can't hear on Britney Spears remixes, but (1) because DVD-A and SACD players will offer things like 6-channel sound and bundled-in DVD video clips, and (2) because the record industry will stop making CDs, just like they stopped making LPs, in order to force everyone to buy the new players and buy yet another copy of Billy Joel's Greatest Hits to go with the LP, cassette and CD they already have.

    The best format won't win. The more ubiquitous one will. The question remains which coalition will blink first. Will the Sony-Philips side break down and allow their record companies to start making DVD-As once they see SACD players aren't selling well, or will companies like Matsushita start paying royalties and buying chips from Sony because the Sony/Philips DVD-A embargo has made it impossible to get record stores to carry DVD-As?
  • I don't know about these minidisks. I live in the Netherlands but I have no use for them. Nor do I know anybody who owns one. Apparently if you buy a Sony stereo set it has a player. So I think it's more or less a failure everywhere.

    Of course there are people who buy this shit, but I think the Philips CD recorder (for in your stereo set) has been a bigger hit.

    In any case, I don't care for Sony's new Super Audio CD. I won't consider buying one until there is a massive amount of content for it. I think it's rather naive to tie a streaming format to hardware these days. What about playing these things on my PC? Can I create MP3s from it? Can I convert my existing music collection?

    Answer: probably not. So, no thanks. Both Philips and Sony have made some mistakes in the hardware/content area. Philips had the CDI (do you Americans even know what that is?) which was soon nicknamed the poor man's CD-ROM. I think a lot of people regret buying one. Both Philips and Sony had their video tape formats, and Sony currently has their minidisk, which was obsolete at introduction time.

    This thing has failure written all over it. The media of the future is DVD, nobody wants yet another completely incompatible cd standard.
  • In the tradition of geekdom and Usenet, I will point out a minor problem with your numbers: (24 bits)*(96kHz)*(5 channels) = 11.5 Mbits / sec. 1x DVD speed delivers only 10 Mbits / sec. Therefore, a 24 bit, 96kHz, 5.1 channel disc wouldn't be legal DVD-Video.

    Yes, but 2:1 or 2.5:1 lossless compression is easy, and 2:1 halves the data rate. I'm not sure what compression ratio Meridian Lossless Packing achieves, but I'm pretty sure it exceeds 2.5:1.
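The numbers in this exchange can be reproduced directly. The 10 Mbit/s limit is the figure quoted in the comment, not one checked against the DVD specification, and the 2:1 ratio is the assumption made in the reply; variable names are illustrative.

```python
def pcm_bit_rate(bits, sample_rate_hz, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return bits * sample_rate_hz * channels

DVD_1X_BITS_PER_SEC = 10_000_000   # figure quoted in the comment above

raw = pcm_bit_rate(24, 96000, 5)   # 11,520,000 bits per second
compressed = raw / 2.0             # assuming 2:1 lossless compression

raw_fits = raw <= DVD_1X_BITS_PER_SEC
compressed_fits = compressed <= DVD_1X_BITS_PER_SEC
```

The raw stream exceeds the quoted limit, while the 2:1 compressed stream comes in at 5.76 Mbit/s.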

  • Sony has had SACD for quite a while now. I remember picking up some audiophile magazines over a year ago and reading about it. They've released a lot of Sony Classical stuff on it.

    This is very old news indeed. An audio-video store I consult with has had demo units of both the Sony SuperAudio CD and DVD-Audio for months.

    The ironic part of course is, the only people who listen closely enough to music to hear the difference between these enhanced formats and regular CDs are the same folks who will be able to hear the effect of the audio watermarking. The watermarking must be audible, as they wish it to be detectable even when all the inaudible portions of the music are thrown away by a process like MPEG encoding. They want it audible when the music is recorded to cassette. And, possibly most important from their perspective, they want it audible when the music is played through the compressor at the radio station - so they can automatically log all the plays and get their ASCAP and BMI money from the stations.

    I've heard A/B comparisons at the Consumer Electronics Show and the Custom Electronic Design & Installation Association show. They both do sound better than standard CDs - though not that much better than the standard-CD compatible HDCD process. And the most annoying part is that they are both entirely unnecessary. A standard DVD can provide 5.1 channels of uncompressed audio at up to 24 bits at 96 kHz. As a Dolby engineer explained to me, this is enough dynamic range to reproduce the sound of a jet engine starting in a totally silent environment.

    Both SuperAudio CD and DVD Audio are basically a rip off. To date, the most impressive 5.1 channel demos I've heard of better-than-CD-quality systems have been DTS - which works with most existing DVD players and requires only an additional decoder. Virtually all Dolby Digital decoder equipped audio components also feature DTS.

  • No- actually it cannot get the _corners_ of the square wave but it can get the vertical part substantially more vertical than DVD audio. The rate of voltage change will keep on accelerating right until it has to reverse the rate of change and stop. DVD audio will stop at 48K. Frequency-wise this is not much of an issue but if you think about the amount of transient voltage involved with the sides of the squarewave (which demands, in theory, INFINITE voltage), there's a huge difference here. The Sony system will pack a huge amount more voltage into the sides of the squarewave- how much more I'm not sure but it could be several orders of magnitude more voltage. There's some risk of ringing that follows the edges- but when was the last time you looked at squarewaves produced by CD players? >:)
  • Sorry, nathanh. Almost all of what you have said has been very accurate. But your aliasing calculation isn't: aliased freqs appear at the _sampling_ rate minus the freq in question, not at the frequency minus the Nyquist as you claim.

    Therefore the original poster was correct about his 24kHz signal. It will appear at 44.1k-24k=20.1kHz.
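
    For anyone who wants to check the arithmetic, here is a minimal Python sketch of the folding rule described above. The helper name `alias_frequency` is my own invention, and it assumes ideal sampling of a pure tone with no anti-aliasing filter in front of the converter.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a pure tone of f_hz shows up after ideal
    sampling at fs_hz with no anti-aliasing filter (hypothetical helper)."""
    # Reduce the tone modulo the sampling rate, then fold anything above
    # the Nyquist frequency (fs/2) back down: alias = fs - f.
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

# The 24kHz tone discussed above, sampled at 44.1kHz.
print(alias_frequency(24000, 44100))
# A tone already below Nyquist is left where it is.
print(alias_frequency(1000, 44100))
```

    Tones below half the sampling rate come back unchanged, which is why a real recorder filters everything above that point before sampling.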
  • The 5k pulse wave you are talking about contains many many high-frequency sine waves. That is what makes it sound "clicky". Because you're a computer guy, you are used to thinking about square waves. But square waves are not a good way of thinking about these problems.

    All I can say is, signal processing research has been going on for a very long time (over 100 years) and has been participated in by a lot of very smart people. So before you start calling its foundations "mostly crap" I'd recommend you do a bit of reading.

    It's clear from your talk of "real-world frequencies" that you haven't even the slightest notion of signal processing, and furthermore you're not even reading the replies in this thread (there are at least two messages by other people explaining what's wrong with the "square wave" idea more thoroughly than I did).

    If you want a simple illustration of why your basis of thought is faulty, here's a question for you: What is the frequency content of a single pulse in the midst of silence? That is, take your pulse wave of whatever period you want, and stretch it out and out until you just have one pulse. What is the frequency content there? You might say "zero" but this is not even close to correct. There is sound after all.
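
    If you would rather compute the answer than argue about it, the following Python sketch takes the discrete Fourier transform of a single pulse surrounded by silence. The 64-sample length and the plain O(n^2) DFT are assumptions chosen for readability, not anything from the article.

```python
import cmath

def dft_magnitudes(x):
    """Magnitude of each DFT bin of the sequence x (plain O(n^2) DFT)."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * cmath.pi * b * k / n)
                    for k in range(n)))
            for b in range(n)]

# Silence with one unit pulse in the middle (an assumed test signal).
n = 64
pulse = [0.0] * n
pulse[n // 2] = 1.0

mags = dft_magnitudes(pulse)
# Every bin has the same magnitude: the pulse contains all frequencies equally.
print(min(mags), max(mags))
```

    Far from "zero", the single pulse has energy spread evenly across every frequency the transform can represent.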

    -Jonathan.
  • The actual criterion to measure music reproduction against is not the theoretical best reproduction of the recording. What does that mean, anyway? You want to hear what was going on in a modern recording studio? No, you want to hear music. Music is dynamics, both large-scale, as in how loud is loud, and how soft is soft, and small-scale, as in reproducing the "attack" in a single note. All this talk of frequencies beyond the range of normally measurable human hearing has a slight effect on this at best.

    CDs are great because they removed a transducer from one end of the playback process: no need for a needle or a tape head. Now you get the signal, and, most importantly, the dynamics of the signal nearly perfectly. Now you need speakers that can take this dynamic range to your ears.

    Most speakers, especially most small speakers, fail to do this. They may be "accurate" in terms of response to the range of frequencies put through them, but that won't reproduce the performance, which is in the dynamics. You can A/B test them endlessly against each other, and find interesting and subtle differences, but here is the real test: Take a pianist and a real concert grand piano, and some recordings of a piano, and see if the speakers you think are perfect can fool you into thinking they are the piano. Usually, there is no contest. Any idiot half deaf from a thrash metal concert could tell the difference, because the piano puts so much energy into the air that very few loudspeakers can come close. The piano shakes the floor, makes the windows rattle, and you can feel it in your bones. By contrast, even really good speakers make it sound as if the lid is shut and there is a pile of coats on top of the piano. There is no way a small speaker can do what a piano, cello, bass fiddle, tympani, baritone horn, etc. can do. Put them in a concert hall together, and you have a real challenge.

    I would much rather have a pair of Klipschorns (if I only had a room with corners) than a pair of similarly priced near-audiophile conventional speakers even though the K-horns would no doubt be hideously less linear. They would be efficient enough to come close to reproducing the dynamics of real musical instruments. The fact is, no two pianos have the same response up and down the scale, nor the same resonances either, nor do two rooms, so why worry about getting close to absolute linearity? The same argument holds even more as music gets more complex: An orchestra can be close miced, or not, recorded in long takes, or cut and pasted from small snippets, multitracked or not, etc. All those engineering decisions make absolute reproduction a joke. Reproduction of what? I'd rather hear how hard those bows are coming down on those strings. That is where the information is.

  • Sony will introduce one format, and Panasonic will introduce another. Sony has their "Memory Stick", so Panasonic introduced "SD Memory". DVD-RAM, DVD-RW, DVD+RW, etc.

    They are not doing it to actually innovate, but to make money from licencing the technology. CD is an old enough technology that the patents are either due to expire, or have already expired. So they have to introduce some new patented technology so they can keep that revenue stream going. Remember that they have introduced several stupid formats (anyone remember the El-cassette? Philips' DCC?) for every one that succeeds.

    If their goal was simply to make better audio available, they would be releasing regular DVDs without video tracks. 5.1 24-bit 96kHz. No, instead they want you to buy a whole new machine that essentially does the same thing, except is broken by disabling the digital output! Seriously, both DVD-Audio and SuperAudio CD do not have any way to output anything other than multiple analog audio channels. Mega-stupid. Their fear of people copying their tracks has rendered both formats worthless. The worst thing to happen to Sony was when they purchased Columbia.

  • Sorry, yes. The previous poster wrote "22050 + 150 = 24000". I was pointing out that "22050 + 1950 = 24000". I should have completed the entire line of reasoning rather than just point out the mistake.

  • What is a 15K square wave?

    Pulses of infinitely intense sound pressure changes occurring 30,000 times a second.

    What is a 15K sine wave?

    The gentlest possible oscillation of sound pressure, 30,000 times a second.

    Now, maybe your tweeters aren't up to putting out infinite sound pressure levels- technically the nature of a square wave is that no matter how faint it is you need to put infinite force into the transients in order to keep it square. Most people's speakers are not capable of putting out such intense pulses, cleanly or not. Many might be so bad that they can't produce the pulses at all- in which case the speaker is producing a sine wave, and no wonder you can't hear a difference.

    There are, however, plenty of speakers out there that will try a lot harder to render the edges of that 15K square wave. Electrostatics. Fancy tweeters. Horn-loaded designs such as Klipsch uses. I'm currently using very fragile inverted domes with a variation on horn loading; I suggest that my speakers can produce louder sounds in the supersonic range than yours can. Since square waves are made up of transients of theoretically infinite force, it matters whether the speaker is able to produce loud sounds that high up- it translates to ability to make the square wave _be_ a square wave acoustically.

    I've tried this sort of thing. A sine wave up in that range is like a clear blue light (pardon the synesthetic imagery but I'm not sure how to explain this) where a square of the same frequency is a lot whiter and sounds _discrete_ somehow- it's sort of like a super-high-frequency-sound equivalent of a strobe light and doesn't sound _smooth_. Instead it makes you squint and kind of hurts, the edge goes right through your skull. Neither note is very musical, or can be stood for long :)

    Amusingly, sampling at 44.1K cannot properly capture _either_ because there's an intermodulation distortion that is waaaay beyond what people consider acceptable for total harmonic distortion. Now if you were talking about a 14700 hz note, you'd get a lot closer as that gives you three samples per cycle- either way you are producing a pulse wave at 1/3 width but at least it doesn't warble :)

  • by Chris Johnson ( 580 ) on Saturday October 14, 2000 @11:54PM (#705227) Homepage Journal
    Intermodulation distortion.

    Even a signal as solidly within the passband as a 14.7K _sine_ wave will completely fail to be recovered exactly. This wave gets three samples per cycle to capture a symmetrical wave. The only possible result is a pulse wave of 1/3 pulse width- that's what's in the data, there is no other possible result. When you apply a theoretically perfect brick-wall filter to that and perfectly get rid of the sampling artifacts YOU STILL HAVE IRREGULARITIES. Substantial ones, by audio measuring standards- many percent.

    If you do a 14.6 sine wave, not only do you get basically a 1/3 width pulse wave, but you get a subcarrier.

    Can't you _see_ this?? Doesn't your math acknowledge this? These are not only measurable distortions but the problem is still present even _completely_ in the theoretical realm.

    Are you arguing that a 14.7K sine sampled at 44.1K is a symmetrical waveform? Or that a 1/3 width pulse wave at 14.7K with a 22K filter is EXACTLY a sine wave? I would suggest that it is not...

  • You're not wrong here: there are no cheap and commercially available 24-bit DACs, and no one's hi-fi comes even close to 24-bit fidelity from source to speaker, but you're missing my point.

    Just as in playback where the 44.1kHz signal is oversampled, during recording the signal is often recorded at much higher frequencies. The extra data samples allow you to reduce error and this lets you increase the sample size. It isn't a direct measurement that gives them 24 bits per sample. It's an indirect calculation based on nearby samples with a realistic 20-bit ADC (or whatever studios use these days).

    Now it doesn't matter that at no stage will a 24-bit DAC or ADC be used. You get 24-bits of information per sample into a 44.1kHz recording by sampling at a higher frequency. You can use the extra information during playback because your sample interpolation will be more accurate.

    It's a matter of information. Increasing the stored sample size increases the amount of info on the disc. More information lets you reproduce the original signal more accurately. Ignore the practicality of the stored info: 24-bit isn't a very practical sample size, but it is more info, and the recording studio will have used clever techniques to ensure that they are using all 24 bits even if they don't have 24 bit ADCs.
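
    To make the "more samples buy you more bits" idea concrete, here is a toy Python sketch. It assumes a deliberately coarse converter that can only output whole numbers, adds one step of uniform dither before each conversion, and averages many conversions. Real studio converters use noise shaping and proper decimation filters rather than a plain average, so treat this only as an illustration of the principle; all names are made up.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (a deliberately coarse converter)."""
    return math.floor(x + 0.5)

def oversampled_estimate(x, n, rng):
    """Average n coarse conversions of x, each with uniform dither of one step."""
    total = 0
    for _ in range(n):
        total += quantize(x + rng.uniform(-0.5, 0.5))
    return total / n

rng = random.Random(0)   # fixed seed so the run is repeatable
true_value = 0.3         # sits between two quantizer steps
single = quantize(true_value)
averaged = oversampled_estimate(true_value, 10000, rng)
print("one conversion error:", abs(single - true_value))
print("averaged error:      ", abs(averaged - true_value))
```

    A single conversion can only say 0 or 1, but the average of many dithered conversions lands close to 0.3, which is resolution the converter does not have on its own.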

  • Um- to be specific, it should approach perfect accuracy for all audio frequencies- and as it starts to pass a megahertz (I'm guessing as it starts to pass 500 kilohertz) you gradually go beyond 100% distortion, until you have thousands of percent distortion at a couple megahertz :)

    It's a lossy format in a peculiar way- this particular method is wildly more accurate at low frequencies than high ones. The thing is, 'high' is relative >;) to this format, 100K is 'low'. The really low frequencies are almost arbitrarily accurate- and again, the ability to delineate high-energy transients is many orders of magnitude better than bandlimited formats. I must admit I am not terribly worried about 10,000% harmonic distortion... at two megahertz. I don't believe I can hear even the most outrageously powerful sound waves at 2 megahertz. Would those be microwaves? Maybe these new players will cook people in front of them if you turn them up loud enough (and have tweeters that can put out loud 2 million hertz signals ;) )
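
    The "accurate at low frequencies, noisy at high ones" behaviour can be seen in the simplest possible one-bit modulator. The Python sketch below is a first-order sigma-delta loop fed with a constant input; this is an assumption made for illustration, since the modulator actually used for SACD is not described here and is certainly more elaborate.

```python
def one_bit_stream(samples):
    """First-order sigma-delta modulator: returns a list of +1.0 / -1.0."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the error so far
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

level = 0.25                                # a constant (0 Hz) input
bits = one_bit_stream([level] * 1000)
print(sum(bits) / len(bits))                # long-run average tracks the input
```

    Each individual bit is wildly wrong (it is always +1 or -1), yet the average over many bits follows the input closely. The error lives in the fast bit-to-bit flipping, which is the very high frequency range.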

  • The problem with all digital recording methods is the DAC - the digital to analog converter. The real virtue of a higher sampling rate is that it makes the DAC job much easier to do accurately.

    CD's do a great job on s/n ratio - they don't do such a great job of accurately reproducing a wave form. There is a clearly audible difference between CD's and Analog records on top quality audio equipment.

    Even the best audio equipment does a really poor job of reproducing sounds. The proof is to make a recording in an anechoic chamber of a speaker's output and then compare that signal with a synched up input signal on an oscilloscope set up in differential mode. If the two were identical the output would look like a straight line, instead it looks like a bowl of spaghetti; the phase and amplitude distortions are terrible.

    I know that there are tests that 'show' that the ear is not sensitive to phase differences, but those tests were badly flawed; the phase distortion of the reproducing equipment was so bad that of course no one could hear differences. One phase distortion sounded just as bad as another. Most of the differences in speaker sounds have to do with the phase response of the speakers.

    The ear is basically a Fourier analyzer, and phase does matter when you do Fourier transforms.

  • You're 100% right. 16 bits is not enough for mastering and mixing. The intermediate values of the mixing must be preserved as accurately as possible to ensure the best possible final result. When the final output is obtained, 16 bits may be enough for listening to. I don't know where your statement about an undigitized output and the 16 bit quantized output is coming from though. I don't think anyone has ever said that the result of 16 bit sampling can exactly reproduce the undigitized output. But are you asserting that the difference is a really big audible difference?

    As for the 44.1kHz? If the audio is already sampled at 44.1kHz, you'd better have a serious cutoff at 22kHz or else you're getting a lot more distortion at higher frequencies. What exactly are you talking about when you say that amp designers go for pass band into the MHz? If there is any power above 22kHz it is purely noise (that is if the input is a CD, LP is arguable at this point). Now is the higher sampling frequency better? Of course! Just like the 24bit sampling, it makes it easier to reproduce high quality recordings.

    You seem to be quite an audiophile. I am not. But I wonder, have you ever really tested the validity of your gut feeling about these technologies. I feel that your statements are an amalgam of audiophile "common knowledge" but don't represent the physics and mathematics of sounds and frequency analysis.
  • High fidelity audio _sells_ better. From Dark Side Of The Moon to Led Zeppelin 4, the audio that's been best balanced, most extended in frequency (note: doesn't mean just 'boosted frequency extremes' but genuine extension for a broader passband) and cleanest in the time domain (no ringy EQ crud) has _sold_ better than its competition.

    Ever heard of a book called 'The Hidden Persuaders' by Vance Packard? (expose of the advertising industry) This book exposed how advertising people were able to persuade consumers to buy one thing rather than another. With music, the audio quality is the 'hidden persuader'.

  • Question: What would make all the audiophiles happy for frequency ranges of a cd?

    Answer: quintuple the price, trace the edges with a green magic marker [snopes.com], slap a bunch of pseudoscientific gibberish on the front and then congratulate them on having a more "well-trained ear" than the rest of those damn hoi-polloi.

    The best way to deal with "audiophiles" is to consider them a form of free entertainment, and proof that cocaine isn't god's only way of telling you that you have too much money.

  • Vinyl does sound better than a CD. I'm not talking about the S/N ratio, either. Sure, CDs have better silence than vinyl and they generally don't warp or pop. But because records are higher-resolution than a CD, you not only get the very high frequencies that you wouldn't think matter. You also get better reproduction of complex harmonics. This is what all that headroom in a 100KHz frequency response gets you. And the sampling rate that leads to it allows for a higher S/N ratio, too.

    It must be nice to be unable to hear the difference between a 320k MP3 and a CD on your "$3000 sound system", because I can hear the difference on the $80 speakers connected to my PC and on my $800 "sound system". I'm not exactly an audiophile. CDs are good enough that I almost never buy vinyl anymore, but part of that comes from the added convenience of the CD. It's playable on portables and in a car, it's easier to carry, and so forth.

    But every time I put on some vinyl, even if it's technopop or garage rock, the room warms up.
  • No, a square wave is a square wave is a square wave. Sine waves operate on entirely different principles. A true sine wave is the result of a trigonometric identity - sine. They are not an "infinite number of sine waves" rolled into one. All you've done here is a nice graphing trick - just like a good way of representing pi is 22/7.

    Try this: Take a 22kHz square wave, and run it through a low-pass filter with a cutoff frequency that's slightly higher -- say, 25kHz. Look at the output on an oscilloscope.

    What will you see?

    You'll see a somewhat distorted sine wave, that's what you'll see.

    If a square wave is in fact made up of a bunch of sine waves, it is easy to explain why this happens. The filter has allowed the fundamental frequency to pass, and has attenuated the higher harmonics. The distortion results from the fact that the filter is not perfect, and will allow some of the first few harmonics to pass.

    On the other hand, if indeed a squarewave is something entirely different, how do you explain this result?

    In other words, you have just denied a theorem which underlies much of modern signal processing and communications. You better have damn good proof of your position!

    You are confusing bits/second with sampling rate. To capture and reproduce a sine wave, you need only sample at double the highest frequency. i.e., to capture 22kHz or lower, you sample at 44kHz. This is the Nyquist criterion, which I believe may have been mentioned earlier in this thread. The formula is somewhat complex to write out here, so please visit these guys for the formula

    No, he's right. If you want to get a reasonably accurate digital representation of a squarewave, you need to sample at a higher rate. Why? In order to capture the higher harmonics present in a squarewave, that's why. If you sample at a high enough rate, you will capture several higher harmonics, which will combine in order to approximate a square wave.
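
    You can watch the harmonics "combine in order to approximate a square wave" with a few lines of Python. This is only a sketch: it uses the textbook Fourier series of a unit square wave with period 1, and the function names and the choice of 1000 evaluation points are my own assumptions.

```python
import math

def square_partial_sum(t, n_harmonics):
    """Sum of the first n_harmonics odd harmonics of a unit square wave
    with period 1, evaluated at time t."""
    return (4 / math.pi) * sum(
        math.sin(2 * math.pi * k * t) / k
        for k in range(1, 2 * n_harmonics, 2))

def rms_error(n_harmonics, points=1000):
    """RMS difference between the partial sum and an ideal square wave."""
    total = 0.0
    for i in range(points):
        t = (i + 0.5) / points              # avoid landing exactly on an edge
        ideal = 1.0 if t < 0.5 else -1.0
        total += (square_partial_sum(t, n_harmonics) - ideal) ** 2
    return math.sqrt(total / points)

for n in (1, 3, 10):
    print(n, "harmonics -> RMS error", round(rms_error(n), 3))
```

    With one harmonic you have a plain sine wave; each additional odd harmonic pulls the shape closer to square, which is why capturing a squarer wave requires a higher sampling rate.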

    Oh, but you don't seem to accept Fourier's theorem. Well, never mind then.
  • It's been demonstrated again and again that for everything other than absolute audiophile gear (>$5000US systems, >$500 headphones) the limiting factor isn't the signal coming out of the CD player (unless it's a $50 special), it's the quality of the amplifiers, cables, speakers, and particularly the acoustics of the listening environment.

    Get yourself a pair of really good speakers and amplifier and *then* worry about the quality of your CD player.

  • What is a 15K square wave? Pulses of infinitely intense sound pressure changes occurring 30,000 times a second.

    It's a sum of multiple sine waves. The 15kHz square wave has a fundamental at 15kHz, a third harmonic at 45kHz at 1/3 the amplitude, a fifth harmonic at 75kHz at 1/5 the amplitude, and so on.
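
    As a rough Python sketch of that series (the function name and the 20kHz hearing limit are assumptions picked for illustration), here are the first few components of an ideal 15kHz square wave and where they sit relative to the audible band:

```python
def square_wave_harmonics(fundamental_hz, count):
    """First `count` components of an ideal square wave as
    (frequency in Hz, amplitude relative to the fundamental)."""
    return [(fundamental_hz * k, 1.0 / k) for k in range(1, 2 * count, 2)]

AUDIBLE_LIMIT_HZ = 20000   # assumed upper limit of hearing for this sketch

for freq, amp in square_wave_harmonics(15000, 3):
    audible = "audible" if freq <= AUDIBLE_LIMIT_HZ else "above the audible band"
    print(freq, "Hz, relative amplitude", round(amp, 3), "-", audible)
```

    Only the 15kHz component falls inside the assumed audible band; everything else starts at 45kHz.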

    You can't hear any of the harmonics except the natural harmonic at 15kHz. Don't pretend you can. Don't pretend anybody else can. No human has proven in an A/B test that they can hear any frequencies equal to or above 45kHz.

    What is a 15K sine wave? The gentlest possible oscillation of sound pressure, 30,000 times a second.

    A single sine wave at 15kHz. You'll hear this.

    Now, maybe your tweeters aren't up to putting out infinite sound pressure levels- technically the nature of a square wave is that no matter how faint it is you need to put infinite force into the transients in order to keep it square.

    No amplifier would ever pass frequencies above 22kHz to the speakers. High frequencies like this could easily damage crossovers, speakers, etc. It would certainly damage the output stage of the amplifier.

    There are, however, plenty of speakers out there that will try a lot harder to render the edges of that 15K square wave. Electrostatics. Fancy tweeters. Horn-loaded designs such as Klipsch uses. I'm currently using very fragile inverted domes with a variation on horn loading; I suggest that my speakers can produce louder sounds in the supersonic range than yours can. Since square waves are made up of transients of theoretically infinite force, it matters whether the speaker is able to produce loud sounds that high up- it translates to ability to make the square wave _be_ a square wave acoustically.

    The thing is, we people who know mathematics know that the only frequency in a square wave below 20kHz is the 15kHz sine wave. The nearest frequency to that 15kHz sine wave is at 45kHz.

    You claim you can hear this 45kHz harmonic. Either you're hearing distortion (which I'm sure an audiophile would never admit to) or you're full of shit. I'm willing to bet on the latter.

    Amusingly, sampling at 44.1K cannot properly capture _either_ because there's an intermodulation distortion that is waaaay beyond what people consider acceptable for total harmonic distortion. Now if you were talking about a 14700 hz note, you'd get a lot closer as that gives you three samples per cycle- either way you are producing a pulse wave at 1/3 width but at least it doesn't warble :)

    What a load of techno-babble! You don't know what you're talking about.

  • What quantisation? It's an analog media!

    Irrelevant. Quantisation is the accuracy to which you can represent an intensity level. Vinyl has a limitation just like any recording medium.

    You measure it indirectly with vinyl, using SNR readings from a single sinusoid.


  • I only hope that this provides a clear, easy way to prove to "the masses" how much better things can sound at high bitrate.

    Just like with HDTV, where the networks want to use high bandwidth channels to broadcast 4+ SDTV signals, I constantly fear that something like 128k MP3 will become "a standard" because it is "good enough".
  • Yeah.. that works too. Try combining the two! (I mean, a good stereo, AND good dope, not pink floyd & the dead.)

    After posting last night, I dug into the subject a bit.

    It turns out that many modern CDs are *very* badly mastered, in that they do not use the dynamic range available to them. They maximize everything to the loudest volume in order to get the loudest radio play. So... as a result, many modern stereos are configured (by the users) to listen to everyday pop music... and unfortunately, if something that has real dynamic range to it is used, you'll just miss most of it.
  • Most people won't be able to hear a big difference right off the bat, and in fact, humans do not HEAR much out of the range of current CD-audio, but people can feel the difference without registering a sound. That ultra-clear, ultra-high note won't "sound" but it will make other high notes sound better. Just like 3D visuals, you don't see everything they produce, but what you don't see helps to make what you do see all that much better.

    It's just a feeling, and I support this new format. It may be a little too high-end for most, but costs will fall as the format matures.
  • Maybe he just has sand in his ears. Somehow, I think this post sounds more coherent on a first read than it actually is on close analysis.
  • The encouraging thing about this article is that Sony is actually spending money to preserve their audio tape library, as opposed to the motion picture industry, which deliberately destroyed their silent movie archives, allowed their nitrate archives to molder to dust, and sat by as their Eastman color libraries faded away.

  • by n xnezn juber ( 243178 ) on Sunday October 15, 2000 @01:06AM (#705280)
    Ok, Chris... I replied to one of your posts and I thought you at least had a clue of what you were talking about. And then I see this post. I'm sorry but you have no understanding of digital signal processing and Fourier series. I mean this whole-heartedly: you are missing the fundamental mathematical concepts needed to understand why a 14.7kHz sine wave can be perfectly reproduced with 44.1kHz sampling and the appropriate filter.

    For your education I pulled a couple links from the web:

    Site A [euphoria.org] has two pictures of a sine wave being sampled. This web page is totally wrong. They do not understand aliasing... that picture they are showing with the straight lines shows the reason that you need to have a low-pass filter. With the appropriate low-pass filter, the sine wave in the above picture will be reproduced exactly.

    Site B [aston.ac.uk] shows the frequency domain. You've probably seen a similar plot, with the horizontal axis being frequency and the vertical axis being magnitude. Don't worry about the math if you don't understand it. Just look at the pictures. The top picture shows the sampling period being less than half the period of the highest frequency in the original signal (the bell shaped thing centered at frequency 0). This is like your 14.7kHz sine wave sampled at 44.1kHz. The second is when the period of sampling T equals half the period of the highest frequency (see how the edges of that waveform exactly touch each other?). The bottom picture shows what happens when the sampling period is greater than half the period of the highest frequency. The portions of the bell shaped things that overlap one another are sampling noise. In other words everything that is overlapping is lost.

    Site C [aol.com] is another site that does not understand Nyquist's theorem. They are completely thinking in terms of the time domain instead of the frequency domain. Not to mention that they don't realize you always have to low-pass filter a sampled signal.

    Site D [earlevel.com] actually is correct and should be understandable to even the least mathematically inclined.
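
    If links are not convincing, the claim about the 14.7kHz sine can be checked numerically. The Python sketch below samples the tone at 44.1kHz (exactly three samples per cycle) and rebuilds the values between the samples with ideal sinc interpolation, which is what a perfect low-pass filter does. The starting phase, the number of samples kept, and the test points are arbitrary assumptions. Because only a finite number of samples is used, a small truncation error remains.

```python
import math

FS = 44100.0          # sampling rate in Hz
F = 14700.0           # tone frequency: exactly three samples per cycle
PHASE = 0.4           # arbitrary starting phase (an assumption of this sketch)
N = 2000              # samples kept on each side of the test region

def tone(t):
    """The original continuous-time sine wave."""
    return math.sin(2 * math.pi * F * t + PHASE)

def sinc(x):
    """Normalized sinc, the impulse response of an ideal low-pass filter."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

samples = {n: tone(n / FS) for n in range(-N, N + 1)}

def reconstruct(t):
    """Ideal low-pass (sinc) interpolation, truncated to the stored samples."""
    return sum(v * sinc(t * FS - n) for n, v in samples.items())

# Compare original and reconstruction at points between the sample instants.
worst = max(abs(reconstruct((i + 0.37) / FS) - tone((i + 0.37) / FS))
            for i in range(-5, 6))
print("worst-case error near the middle:", worst)
```

    The three samples per cycle look like a lopsided pulse train on paper, but after the low-pass filter the smooth sine comes back, and the leftover error shrinks further as more samples are included in the sum.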

  • Ok, so, Betamax, Minidiscs, Memory Sticks, SACD. What's the deal with striking out on their own? How many times will they come up with a good format, only to have it ignored? Why do they continue to bother?
  • I agree with you in principle, but I also argue that reproducing sounds beyond human hearing is a valid pursuit for the following reasons (some more serious than others, but all "valid"):

    • Harmonics above human hearing (let's just say above 44kHz for argument's sake) still contribute to the human-hearable sound. A multitude of out-of-range harmonics still interfere with the in-range sounds, altering them (cancellation, multiplication) in ways that humans can hear. One might argue that this can occur virtually before encoding at human-hearable levels. This is true if you're encoding with all 3D environmental effects included, designed to be listened to on headphones. If you're playing stereo sound out of speakers, bouncing off of walls, it still matters to reproduce what you can and let it bounce around and interfere.
    • Not all listeners are human. Some people have fun getting their dogs to sing along to music, for example. The reasons certain dogs seem to repetitively pick certain songs might just be because of certain frequencies that humans can't hear on the recording.
    • Historical Preservation. Someone in the year 2200 might be interested in Musical History, and may use these recordings not only for listening pleasure, but also for scientific study into our recording methods. He might even be trying to construct a then-extinct physical music instrument (say a saxophone) from listening to its harmonic characteristics in a recording. The beyond-hearing harmonics would be excessively valuable to him.
    • We may soon (some would say 10 years, some would say 100... who knows) have biologically or electronically altered hearing available to the masses as an "upgrade", allowing us to perceive previously out-of-range frequencies.
    Don't forget that most artists today record at CD rates (16/44.1) on DAT because they know they're shooting for a CD. Therefore even the master has no better fidelity. If we raise the bar for the CD, we raise the bar for the original as well.

  • Sorry, but this whole post is like complete BS.

    - The Fourier theorem says (and imho proves) that every periodic signal consists of an arbitrary or even infinite number of sine waves. So a square wave actually IS a composition of the base tone and all odd harmonics.

    - Typically, when humans grow older, the upper end of their listening range goes from about 20KHz to 12KHz at worst. I know enough 20-year-olds who aren't able to hear a PAL TV's beeping (at 15625Hz) anymore.

    - Harmonics aren't sine waves BELOW the base tone, but actually ABOVE them, at 2x, 3x, 4x etc the frequency.

    - The Nyquist theorem is correct in a certain way, but it's neglecting the fact that from frequencies at samplerate/10 up, there is enough aliasing to severely disturb the signal. Go try sampling a 22051Hz sine wave at 44100Hz.

    - What's that complete nonsense with 32bit/sec? A CD player reads at 176400bytes/sec (2x44100x16bit). With 16bit resolution, the quantisation error and thus the SNR is at about -96dB. Go look up what "dB" means next time, please.
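
    The numbers in the point above are easy to check. This Python sketch works from the usual CD parameters (two channels, 44100 samples per second, 16 bits per sample) and uses the standard rule that the ratio between full scale and one quantisation step is 20*log10(2^bits); the variable names are mine.

```python
import math

CHANNELS = 2
SAMPLE_RATE_HZ = 44100
BITS_PER_SAMPLE = 16

bytes_per_second = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE // 8
# Ratio between full scale and one quantisation step, expressed in decibels.
dynamic_range_db = 20 * math.log10(2 ** BITS_PER_SAMPLE)

print(bytes_per_second, "bytes/sec")
print(round(dynamic_range_db, 1), "dB")
```

    Each extra bit adds about 6dB, so the same formula gives roughly 144dB for 24-bit samples.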

    - it's interesting that you first claim everybody knows what "digital" means and then compare digital signal transmission with your oh-so-cool car stereo. Digital audio signals are transmitted with about 2-3 MBit and can even pass some 10 metres of simple cinch audio cable without any loss. Flipped bits are very improbable and timing jitters are completely negligible, as they get flattened out by the receiver's internal clock (if the equipment is good, that is). The only situation in which jitters are important is mixing of arbitrary digital audio signals which all have their own clock source - and normally, this does not happen outside studios.

    So, better do some research next time or ask the people you refer to all the time before throwing around words and expressions you obviously do not understand.

    Thanks,
    kb

  • I don't follow your reasoning. By your definition, pretty much every thing every corporation does is consistent with Open Source philosophy.

    Observe: Bill Gates had an itch (getting lots and lots of money), scratched it and produced many valuable products (as well as many other pieces of crap products) as a side-effect.

  • Half the point of SACDs is that they're backwards-compatible, right? Wrong. According to my boss at the record label where I worked over the summer, some of the SACDs we distributed were not backwards-compatible. As far as I know, there's no way to know before you plop the disc in your player whether it'll work on your old-school CD player, either.
  • even if a mic would pick up sound above 22kHz - a special one - there is distortion. the higher the frequencies the more distortion.

    but now you can record it :-)

    higher bit depth is a positive thing, but then again *who* really hears the difference, even the equipment which plays that is too expensive ...
  • This is all BS. The most important reason why a lot of professional audio gear uses higher bit depths than 16 bits and higher sampling rates than 44.1/48 kHz is so that the equipment makers can convince people to spend a lot of money upgrading their old equipment to newer and ostensibly better stuff. It may come in useful if you're going to be recording something with a lot of dynamic range, as it will leave you a lot of headroom to handle transients in the music so you can record low without being afraid that the input is going to clip (very ugly in the digital domain), but for the final mix, even 16 bits is overkill.

    The whole scam about getting people to buy better equipment applies not just to the makers of pro audio gear, but to folks like Sony trying to convince suckers to shell out big bucks for a "better" CD player. If you're listening to music on a $65,000 pair of Wilson Grand SLAMMs, driven by some Martin-Logan class A amps, all in a specially-designed acoustically perfect listening room, and you're listening to chamber music recorded in an anechoic chamber with Schoeps mics, godlike preamps, and godlike A/D, and your ears are 30 years old or younger and you live and die by what your stereo sounds like, then maybe you can talk to me about 24 bits or 96 kHz sampling rates, but if not all of the above are true, forget it. The existing CD standard is way better than everything else in the audio chain, from performer to listener, for 99.9% of the audio equipment, performers, recording engineers, listeners, and types of music (Britney Spears in 24/96?) in existence.

    I'm not just blowin' smoke here. I do mobile audio recording professionally (Earthworks mics, Earthworks pre, Apogee A/D, record straight to stereo DAT -- see my URL), and it's standard practice for me to record things 3-6 dB lower than peak to leave breathing room, and in post-processing, design the sound to use no more than 30-40 dB of dynamic range nearly all the time (as making a recording with more dynamic range than that makes it very difficult to listen to unless you're in a very quiet room over high-end headphones). I'm only using a fraction of what lil' ol' CD is capable of, but anything more would just capture room rumble and people's breathing in that much more detail.

    If you believe that current audio is somehow shortchanging you from hearing that ineffable "something" that will make your music sound better, then give up the fake science in Sony's product descriptions, save your money on fancy CD players, and go out and buy some better speakers than those Radio Shacks that your brother Jim gave you when he went off to college.

    -----

  • Eh, there's lots of other companies doing that. Wadia was certainly not the only one. You just won't find the products at Radio Shack ;) and I daresay you won't find them cheap even now- I just looked on eBay and the prices I was seeing were $200, $500 and $1500 in US dollars.
  • Leaving aside the bit depth (I keep trying to explain that it's not about maximum contrast but the quantization inherent in _linear_ reduction of a continuous range to discrete increments)...

    What the _hell_ do you think I have for mains, Yamaha NS-10s? The _only_ way to get remotely serious performance is either to spend loads of money or educate yourself extensively and rebuild/custom build _everything_.

    There is a hard limit to the low frequency extension available from ported enclosures: that's why my speakers have variovents. There's another hard limit on how much acoustic power you can produce with a given cone size within the linear excursion limits of the voice coil (_not_ the suspension physical limit, but keeping the voice coil within the magnet gap). That's why my speakers run 12/10/8/6.5 drivers in a series/parallel configuration as if they were a guitar speaker cab- all the drivers contribute to the piston area for subsonic content. There's a softer limit to enclosure size- I don't feel like running an active EQ system like the very clever Bag End 'ELF' system, so that's why my cabs are four feet tall, over three feet deep and as much as 15 inches wide. They'd make damn good stage bass cabinets, though they're not really designed for those wattages. That fixes the lowest frequency at... let me put it this way. I own synthesizers. I can _put_ subsonic tones through my full system, and stuff will be falling over. There's a limit to how much acoustic pressure even that much cone area can produce, but it's not giving up at 20hz, that's for damn sure.

    The other extreme? Composite drivers using piezo elements to handle the extreme highs. (no, not a whizzer cone with a clunky piezo disc glued onto the back and flopping around in a plastic housing). This type of element is easily capable of extreme frequency content- this is what they use to make ultrasonic devices, piezos. I have inverted aluminum domes on these which are so light and delicate you could damage one by blowing hard at it. To top all this off these are effectively HORN LOADED- not in an extreme fashion but the waveguide is definitely reminiscent of an exponential horn, and that means LOUD. The supersonic acoustic power these can produce is not subtle- it will give you a nasty headache very quickly if there's a lot of supersonic ringiness or grunge in the signal.

    I don't know _where_ these people are coming from, but there seems to be an infinite supply. All I can say is- I hope you folks are my competition for sound engineering work, because the more fervently you insist that none of that high endy stuff even matters, the less use you will be at a task like mastering. This stuff is not hypothetical: there is practical application provided you work in the business. Ability to monitor a signal over an audio range substantially beyond 20-20K means better ability to control the sound balance and integrate the various instruments into a coherent whole, never mind that in certain areas like bass there are whole subcultures (car audio!) dedicated to showing off the ability to push this limit (by the same token, electrostatic speaker fans are declaring their interest in pushing the opposite limit).

    Indeed I think about bandwidth and- S/N ratio is a rotten way to express this, let's call it _linearity_. However, the system _will_ happily go above 20K- and the linearity is considerably in excess of 16-bit quantization. And so is my digital source- I swear by Alesis 20-bit 48K ADAT. The highs could be better but the linearity is really quite good and I'm not dead certain a full 24 bits is necessary. However, the difference between 20 bit and 16 bit is major- and the difference between 44.1 and 48K is minor but every little bit helps. I was testing some new tweeter design changes on Frank Zappa's "One Size Fits All" album- the Kerry McNab-engineered Zappa albums push the high end _hard_ and the mikes turn a really large amount of the 40% supersonic content of cymbals into voltage. Playing this back with the new tweeter configuration really drove home how much we've lost: that album is very good at reproducing the natural supersonic balance of the sounds it contains, and a CD cannot contain even distorted versions of this acoustic content- the LP tends to distort it but the CD throws it away completely. So, again- 44.1K (really significantly below 22K) isn't enough. 16 bits aren't enough either.

  • In one sense you got me there: a wave at 14.7K with a 22K filter is exactly a sine wave. I'll retract anything I said about it not being one- I spoke too quick, I was mistaken.

    I _will_ note that the wave is phase shifted compared to the original source- in fact when I talk about intermodulation distortion what's happening is that the wave gets phase shifted forward and back very rapidly.

    I would _suggest_ that a wave which is phase shifted against other sounds is not the _same_ wave that is sampled. It can be _a_ sine wave but if it's not in the same phase as the source (compared to other music data) is it the same wave?

    It's possible that given 4X oversampling Nyquist is as near to correct as necessary. Certainly if you're not concerned with phase but only the presence of frequency components 2X Nyquist is basically right. I draw the line at saying it is theoretically perfect- I'd say it could be a very good approximation. I don't understand why so many people arguing for Nyquist take on a positively religious tone, demanding the acceptance of articles of faith. 'Theoretically perfect' is a very strong statement, and the context (such as 'bandlimited, and ignoring phase and time information completely') needs to be clarified or people take it to mean 'therefore CDs are perfect reproducers of sound': which is not accepted by any competent sound engineer today, since again 'perfect' is a very strong term.

  • First off MP3's sound like crap...

    As far as how to sell this to the mass market. I think the DVD spec provides a means to really add value to the music disc.

    The disc could contain pictures, live video, it could even be recorded in 5 channel.

    That's what I suspect we'll see.

    It's basically like the CD-Extra format, only better.
  • Actually, one of the neat things about CDs and digital sound in general is that it allows for totally-flat-to-0-hz low frequencies. This is actually a very big deal- stereo equipment tends to develop a sort of thick boomy quality around the low cutoff frequency (not just bass reflex speakers, even amplifiers show this effect a little bit). One of the primary achilles' heels of the LP playback format is that you're at the mercy of the tonearm resonances and cannot even attempt to track anything like fractional hz- if you do, the needle just skips ;) it takes quite unusual design to build tonearms that will be compliant at fractional hz but not flabby wobbly things at say 20-30 hz.

    It is _very_ nice that digital allows low frequencies down to 0 hz. I will never knock even CD sound quality for the ability to convey extremely low sounds- the only limitation is the analog stages that drive your stereo. I produce sonograms of my recordings sometimes, and they easily produce frequency information at 9 hz and lower (I refuse to use bass drum sounds that can't be made to have substantial subsonic content :) ). CD can easily encode _substantially_ below 20 hz. There's basically no limit. The format is flat to 0 hz.

  • Sounds of up to half the sampling rate can be reproduced *exactly* by (ideal) audio equipment.
    This is what the Nyquist theorem says. The thing you are saying about square waves is a misunderstanding on your part.

    The frequencies we are talking about are all sine waves. So the shape to be reconstructed is predetermined and known clearly. There is no "interpolation". You may be familiar with the Fourier transform, which takes advantage of the fact that all sounds can be represented as compositions of sine waves.
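A minimal sketch of the reconstruction claim, assuming the ideal (Whittaker-Shannon) sinc interpolation that the sampling theorem describes. The tone frequency, phase offset and window length below are arbitrary illustration values, not anything from the posts. Because only a finite window of samples is summed, the result is close to the original rather than exact.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t_over_T):
    """Sinc interpolation at time t, with t measured in sample periods."""
    return sum(s * sinc(t_over_T - (first_index + k))
               for k, s in enumerate(samples))

FS = 44100.0   # CD sampling rate in Hz
F = 14700.0    # test tone below FS / 2 (only 3 samples per cycle)
PHASE = 0.7    # arbitrary phase offset in radians (illustration value)
HALF = 2000    # samples kept on each side of t = 0 (finite window)

def original(t_over_T):
    return math.sin(2 * math.pi * F * t_over_T / FS + PHASE)

samples = [original(n) for n in range(-HALF, HALF + 1)]

# Compare at instants that fall between the sampling points.
max_error = max(abs(reconstruct(samples, -HALF, t) - original(t))
                for t in (0.25, 0.5, 1.3, 2.75))
print(f"max reconstruction error: {max_error:.2e}")
```

The remaining error comes from truncating the infinite sum to a finite window, and it shrinks as the window grows.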

    All that said, I think that the idea of higher fidelity is a good one. Because there are two things that happen on CDs as they stand now:

    (1) Companding
    (2) Quantization

    A brief and kinda-misleading explanation of "companding" is that the bottom hundred or so Hertz of music on a CD is chopped out. This is under the theory that the human ear can't hear this stuff, but lots of people say they can. And even if you can't consciously notice these tones in isolation, it's not much of a step to believe that they contribute to the "richness" of a sound when mixed in.

    Quantization is another matter. Say you have 44,100 samples per second stored on a CD. Each of those samples represents the intensity of the sound at each point in time. That intensity is a real number (an analog value) and it's being stored as an integer (a digital value). In mapping from the analog to the digital value, you lose precision (think of rounding a float to an int). The more bits you have to represent each value, the less distorted the sound is. Think of the difference between 16-bit truecolor and 24-bit truecolor when viewing photographs and whatnot.
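A rough sketch of the float-to-int rounding described above, assuming a full-scale sine test tone and plain rounding with no dither. The function names and the test frequency are made up for illustration. It compares the measured signal-to-noise ratio against the usual rule of thumb of about 6.02 dB per bit plus 1.76 dB.

```python
import math

def quantize(x, bits):
    """Round x in [-1, 1] to the nearest multiple of the step for this bit depth."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step

def snr_db(bits, n=50_000):
    """Measured signal-to-quantization-noise ratio for a full-scale sine."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        # An awkward frequency ratio spreads the samples over many phases.
        x = math.sin(2 * math.pi * 0.1234567 * i)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 16, 24):
    print(bits, "bits:", round(snr_db(bits), 1), "dB measured,",
          round(6.02 * bits + 1.76, 1), "dB rule of thumb")
```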

    I think added resolution on disc-based music would be really nice. Whether it comes through audio DVD, or a different CD format, though, I don't really care. I think that DVD would be nicer for obvious reasons of storage capacity.

    -J.
  • Wow. Where to start...

    16 bits isn't enough. That's _really_ obvious at this point- no professional works in 16 bits except for the final CD output.

    And the reason is... Because real-world signals have a lot more dynamic range than we can fit on the final output. Stick your ear into the bell of a trumpet or 2 inches under a snare drum and tell me how much hearing you have left after an hour. Yet you find microphones there all the time. When recording digitally, it's critical not to exceed the peak amplitude -- digital clipping is obvious and nasty. So you leave plenty of room at the top for extra-loud sounds, and adjust the volume when mixing it down. That means you want more than 16 bits to get 16 bits of resolution in the final product.
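The headroom argument can be put into numbers with a short sketch. The 20 dB headroom figure is a hypothetical value chosen for illustration, and the helper names are made up here; each bit is worth about 6.02 dB of dynamic range.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB of dynamic range per bit

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM at the given bit depth."""
    return bits * DB_PER_BIT

def bits_left_after_headroom(bits, headroom_db):
    """Effective bits left for the signal when peaks sit headroom_db below full scale."""
    return bits - headroom_db / DB_PER_BIT

print(round(dynamic_range_db(16), 1))              # 96.3
print(round(bits_left_after_headroom(16, 20), 1))  # 12.7
print(round(bits_left_after_headroom(24, 20), 1))  # 20.7
```

Under that assumption, a 16-bit recording with 20 dB of headroom has fewer than 13 bits left for the signal, while a 24-bit recording still has more than 16.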

    even if you mix with ideal noiseless coloration-less electronics

    ARGH!! Digital mixing is addition. There's no noise or coloration to be added. If you want to add some, you have to add it on purpose.

    44.1K isn't enough either.

    Actually, this may be true. Some studies have shown pretty convincingly that a few people can detect the presence of tones close to 30 kHz. Whether such tones exist at detectable levels in actual music isn't clear. But the science does allow the possibility that a 60 or 70 kHz sample rate may be necessary. However...

    High end amplifier designers go to great lengths to get their pass-bands up into the megahertz

    Utter nonsense. Deliberately extending your passband that high would not only seriously compromise performance at audio frequencies, but it would make your amp incredibly sensitive to RF interference.

    The neat thing about the bit rate is that it's effectively infinite bit rate

    Not at all. What they describe is delta-sigma modulation, the same thing used in nearly all CD players today. From what I can tell, their scheme is simply a higher bit-rate delta-sigma system with the storage medium containing the 1-bit stream. There is absolutely nothing new about any of this.

    There doesn't need to be _any_ brickwall filter on the output

    Precisely. And this is in fact the very reason that delta-sigma modulation is used on commercial CD players (and most of the A/D converters used in the industry, as well). Again, the difference here (AFAICT) is that the 1-bit stream is what gets stored, rather than resampling to the 44.1/16 CD standard.
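For readers who have not seen one, here is a minimal sketch of a first-order delta-sigma modulator. It is a textbook toy, not Sony's DSD encoder (real converters use higher-order loops and proper decimation filters), and the block-averaging "filter" is an assumption made to keep the example short.

```python
def delta_sigma(samples):
    """First-order delta-sigma modulator: inputs in (-1, 1) -> stream of +1/-1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback   # accumulate the error against the last output
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

def decimate(bits, factor):
    """Crude low-pass and downsample: average each block of `factor` one-bit samples."""
    return [sum(bits[i:i + factor]) / factor
            for i in range(0, len(bits) - factor + 1, factor)]

stream = delta_sigma([0.25] * 10_000)   # constant input of 0.25
print(stream[:8])
print(decimate(stream, 1000)[:3])
```

Each output sample carries only one bit, yet the average of the stream tracks the input level; the quantization error ends up in the fast bit-to-bit pattern, which the averaging removes.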

    the potential slew rate of this technology is just astronomical

    This has absolutely nothing to do with slew rate. Nothing at all. And slew rate has little to do with how loud something sounds. It is true that absurdly powerful amplifiers can sound better than smaller amps with otherwise equal specs. And you almost have it right; the reason is that the peaks require a lot more power than the average, and if you turn up the volume, the smaller amp can't produce the peak power needed for the transients. The result is clipping, which is audible.

    However, again this has nothing to do with the storage medium. Remember, modern CDs are almost all recorded and played back using delta-sigma modulation. Sony's plan is to simply skip the current step where the bitstream is reorganized to 16 bits at 44.1 kHz. This allows them to choose a bit rate with a higher dynamic range and cutoff frequency. I leave the arguments about the need for those to other postings.

  • If you put the output of the DAC through a 20 kHz low pass filter, the odd-order harmonics are removed and you have sine waves again.
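A small sketch of the filtering argument, assuming the unfiltered signal is an ideal square wave (whose Fourier series contains only odd harmonics with amplitudes 4/(pi k)) and an ideal brick-wall low-pass filter. The function name and the example frequencies are illustrative.

```python
import math

def square_wave_harmonics(fundamental_hz, cutoff_hz):
    """Odd harmonics (frequency, relative amplitude) of an ideal square wave below the cutoff."""
    harmonics = []
    k = 1
    while k * fundamental_hz <= cutoff_hz:
        harmonics.append((k * fundamental_hz, 4 / (math.pi * k)))
        k += 2
    return harmonics

low = square_wave_harmonics(1_000, 20_000)    # partials at 1, 3, ..., 19 kHz survive
high = square_wave_harmonics(10_000, 20_000)  # only the 10 kHz fundamental survives
print(len(low), [f for f, _ in low])
print(len(high), [f for f, _ in high])
```

With a 10 kHz fundamental the next harmonic sits at 30 kHz, above the cutoff, so what comes out of the filter is a single sine wave.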
  • by DeeKayWon ( 155842 ) on Saturday October 14, 2000 @07:32PM (#705324)
    http://www.mpeglabs.com/dvd/dvdaudio/sacd.htm [mpeglabs.com]

    http://www.sonymusic.com/sacd/ [sonymusic.com]

    http://www.superaudio-cd.com/ [superaudio-cd.com]

    Something that doesn't gush like a press release [time.com]

    Another more objective link [ambisonic.net]

    It appears they're using a dual layer method for backwards compatibility. The details about copy protection methods are vague, but they do mention visible and invisible watermarks aimed against both pirates and counterfeiters. But I can't seem to find a decent explanation of how the DSD encoding scheme works.

  • Actually, a square wave at X Hz consists of an infinite number of sine waves, one of which has the frequency X, and the others having frequencies 3X, 5X, 7X and so on

    No, a square wave is a square wave is a square wave. Sine waves operate on entirely different principles. A true sine wave is the result of a trigonometric function - sine. They are not an "infinite number of sine waves" rolled into one. All you've done here is a nice graphing trick - just like a good way of representing pi is 22/7.

    People can hear frequencies up to something in the 20Khz range...

    No, typically only young children can hear into the 20khz range. Most adults aged 21 or older are only capable of hearing to about 18.5 - 19.5Khz.

    ...that means they can hear the components of a signal whose frequencies are lower than that.

    Yes, they're called harmonics, and since the human ear cannot hear beyond about 20khz, it doesn't matter if an instrument has a fundamental frequency higher than that, as the ear is picking up the harmonics at 1/2, 1/4, 1/8th, etc of that original frequency. That is what your equipment needs to be able to reproduce, not the fundamental.

    So yes, to represent a square wave at 22Khz fairly well you'd need to sample at at least 100Khz, preferably 200Khz.

    You are confusing bits/second with sampling rate. To capture and reproduce a sine wave, you need only sample at double the highest frequency, i.e., to capture 22khz or lower, you sample at 44khz. This is the Nyquist criterion, which I believe may have been mentioned earlier in this thread. The formula is somewhat complex to write out here, so please visit these guys [digital-recordings.com] for the formula.

    Now, I suspect you were talking about the bits/second, so I'd like to get into that for a minute, since this is likely where the large numbers came from. A CD-ROM typically encodes at 32 bits/second, per channel. The thing is, whenever you convert from analog to digital, you do lose a finite amount of information. This is expressed as a "quantization error". It can be up to +/- 0.5 dB between the original signal, and the real signal. The smaller the quantization error, the better the digital signal represents the original (analog) signal. This isn't important to go into, beyond to understand that too low of a bitrate means a greater error rate - the signal will be skewed, even if the sampling rate is sufficiently high.

    Most "audiophiles" are idiots.

    Most "audiophiles" tend to educate themselves on what everything I just said means. They know about signal degradation, they know about harmonic distortion, intermodulation distortion, etc. Why? Because that is their hobby - and like any good hobbyist they're going to read up on the issue. This is in sharp contrast to people who merely want something that goes thump-thump to impress their friends. Those people are merely interested in music - but hardly enthusiasts. The number of parallels between car audiophiles and computer geeks is uncanny; having been pigeonholed by others as both, I can safely say this. They take their hobby/craft/profession just as seriously as the /.'ers who looked for a 300A from the Brazil fab.

    For instance most audiophiles do not understand what the term "digital" means.

    Excuse me? Everybody who doesn't live under a rock knows what digital means - little ones and zeros, little bits of data. With all the hype about the internet and e-commerce, the idea that someone might NOT have heard the word "digital" in today's world is preposterous.

    It's entirely pointless spending money on more expensive and "better" cables and pickup-assemblies to read and transmit the ones and zeroes.

    So can I get away with using CAT3 wiring when I need to use CAT5 wiring in my network? This aside, had you any experience in the installation of audio systems, you would know that as soon as it leaves the head (the thing you put the CD in), it is an analog signal. At this point, Shannon's Law takes over and signal strength and attenuation become supremely important. Also, because it is an electrical signal, digital or analog, it is susceptible to interference. I take it you haven't ever competed in IASCA or IDBL then. Let me make a case in point about my own experience with high end car audio.

    I have a setup that uses 1000 watts of power, drawing 83.3 amps on 4 ga wire capable of handling 85 amps. First point, cabling is critical to be able to handle your load, from one end to another.

    My head unit then takes the signal and puts it out through several pre-outs. This signal is then passed from the head unit to the amplifiers. The amplifiers do their bit and then send the signal to the speakers. At the power levels I am running at (1000 watts), any weak point in the system would invariably become an immediate issue. I had, at one point, cheap $45 cables running from the deck to the amplifiers, and the amount of /noise/ in the system was absolutely unacceptable. I /fixed/ the problem by replacing the cabling with much higher quality cabling.

    1st, a heavy-duty cable makes a world of difference, and can be detected in a heartbeat. Do you think a low-end cable can ever take the place of a quality component? Think about it, would you use a cheap cable for your brand new ultra 160 raid 5 array? Are you going to run generic cat 5 for your network backbone? Why on earth do you think that audio issues are going to be fundamentally different from computer issues? A cable is a cable, and quality makes a difference regardless of the application it is used for! A cheap cable is more prone to attenuation and distortion at higher frequencies, regardless of what it is used for.

    A better cable provides additional ground, which means much better shielding and allows a signal to go through with less distortion.

    Purchasing high quality speakers and not using high quality speaker wire nullifies your investment in high quality speakers.

    While I certainly don't dispute there are people who just want something loud that goes thump, these are the people that tend to get put in their place if they actually enter a competition.

    Now if you really want to press your point that cabling does not make a difference, let me suggest another slashdot like forum for you. The friendly people over at SoundDomain.com [sounddomain.com] will be more than happy to discuss all those little details with you.

  • What about phase? How do you differentiate a wave that is out of phase with the samples (in the extreme looking like a zero-amplitude wave) from a low-amplitude in-phase wave?

    ascii-art (ignore ".", used 'cause crappy slashdot doesn't have pre tag or nbsp):

    |...---...............
    |.//...\\.............
    |/.......\ ...........
    +----A--B-\----A--B--.
    |..........\......./..
    |...........\\...//...
    |.............---.....


    If I sample at A I get back the original waveform, but if I sample at B, I get back one of only half the amplitude, when I really wanted a full waveform that was phase shifted -pi/4.
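The ambiguity described above can be shown in a few lines, assuming the tone sits at exactly half the sampling rate (two samples per cycle). That is the edge case the sampling theorem excludes, since it requires the signal to be strictly below half the sampling rate. The amplitudes and phases are illustrative values.

```python
import math

def sample_at_nyquist(amplitude, phase, count=6):
    """Samples of amplitude * sin(pi * n + phase): a tone at exactly half the sampling rate."""
    return [amplitude * math.sin(math.pi * n + phase) for n in range(count)]

full_shifted = sample_at_nyquist(1.0, math.pi / 6)   # full amplitude, phase-shifted
half_in_phase = sample_at_nyquist(0.5, math.pi / 2)  # half amplitude, sampled at its peaks

for a, b in zip(full_shifted, half_in_phase):
    print(round(a, 6), round(b, 6))
```

The two sample sequences match, so at exactly half the sampling rate amplitude and phase cannot be told apart. For any tone strictly below that frequency the samples drift across the waveform over time and the ambiguity goes away.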
  • Good recommendation :)

    There's a good analogy for the dynamic range of CDs. It is computer monitor displays. People who are content with 16 bits in their audio because it covers a range of silent to really loud should _also_ be content to always use 'thousands' of colors on their monitors- because that covers the same range of black to white as 'millions' does.

    Of course, people will not be content with this, because 'thousands' produces subtle banding and a perceptible degradation of the picture in ways that are clearly understood. In the same way, 16 bit digital audio produces a thinning and drying of the sonic texture that is the audio equivalent of banding- for the same reason, which is linear dynamic encoding of an analog source that will virtually never be _exactly_ equal to the 16 bit truncation of its resolution.

    20 or 24 bit is near as dammit: it's way harder to notice any adverse effects. The Sony format is particularly interesting as it has the potential, sometime in the future, of being handled without recourse to PCM encoding (though, distressingly, it seems that the current incarnations do make use of such a stage- hopefully a really impressive one like 128K cutoff and 32 bit or something equally flash). What that would mean is that there would be no specific bit depth it would be comparable to- saying '8 bit' or '16 bit' or '24 bit' w.r.t PCM is saying 'Here is the maximum signal displacement, from -Xv to Xv. I will now chop the area between into equal parts and quantize whatever the _real_ voltage is to the nearest level I can encode.' In this sense, PCM is never accurate at all- it's very unlikely that at a given moment, the voltage really exactly matches the encoding, because the encoding might be 1.0256 volts and the real voltage was 1.02562854647823862349823474634672348 volts.

    Again, it's the same issue as monitor color trueness at 'thousands' of colors. The ironic thing is, with well recorded music the hottest peaks are _really_ hot: 99.9% of a piece of music might be less than half the available dynamic area used for encoding. The encoding is linear, so that is _wasting_ half the bits. A sort of Gaussian distribution might have been a better idea, but it's too late to worry about that now :)

    Of course if you make music horrible enough sonically with brutal overcompression, most of it will take up the full dynamic range of the format. It's a pity that this sounds atrocious, as it's the only way to really make use of linear encoding :)

  • A good example of this at the other end of the audio frequency spectrum is that you really can't hear notes much lower than 45Hz...however you sure can feel them if they are loud enough; anybody who has woken up to a "ghetto-blaster" boom-b-boom-b-boom driving through the night knows what I mean. It sort of works in the same way for ultra-high frequencies: the various frequencies bounce around in your ear-canal and create far-out physical harmonies.

    -AP
  • Read some stereo review magazines- in particular, the sort that throw a lot of test equipment at the problem. They _routinely_ graph the frequency response of amplifiers from fractional Hz to well beyond a megahertz.

    Among other random terms of abuse I spotted the interesting claim that no power amplifier ever passes higher than 20K as it would hurt the speakers and damage the output stage of the amplifier. Apart from directing attention to even hoary old warhorses of the audio mag trade like Stereo Review, I think I had better let this claim pass without further comment :)

  • You are both correct.
  • 16 bits isn't enough.
    Bullshit!
    44.1K isn't enough either.
    Bullshite!

    If I were to sneak into a recording studio and insert a 20 kHz low-pass filter and inject white noise at -90dB, would anyone notice?

    No.

    And that's the sort of thing you should think about. Not mythical triangle and square waves (which instruments produce those?) but what's the bandwidth and the S/N ratio of the whole system? From mikes to master tapes to mixers to your home amps and speakers. All this put end-to-end, what is the bandwidth and S/N ratio? That's all. Once you have that, you know what is the lowest sampling frequency you can use, and the lowest resolution you can use.

    Hint: the system has a bandwidth of less than 20 kHz and an S/N ratio of less than 70 dB.

  • It's CD AUDIO. You don't get to pick the lowpass frequency. It has to be 22.05K.

    I don't understand how you manage to turn the idea of a 14700 sine into a 918.75Hz tone, but so much the worse for you: the 14700 (ironically) does have the second harmonic obliterated by the 22.05K lowpass. When you start talking about 918 hz tones you shoot yourself in the foot- you don't _get_ a 919 hz filter in CD audio, and the modulations are going to still be there. They won't be as nasty as a stepped sampled wave but you've got quite a few harmonics that will remain after lowpass filtering at 22.05K (even the compromised realworld version). Some of the modulation you see _will_ survive this.

    It's flatly incredible how many armchair audio theoreticians won't even go as far as you have, insightfully, gone, in seeing that the results of the sampling process are modulated. Phase differences do matter, but even if they did not, the degree of modulation of the sampled wave is mathematically provable unless you get to do a _specific_ lowpass filter on it. If you count obvious changes in phase, the amount of change is provable even when you do ideal lowpass filtering on it.

    Yes, math is grand ;)

  • I thought DVD Audio had been put on hold while they worked on an "improved" copy protection scheme.
  • Well, MP3's sound fine to me. I listen to variable bitrate ones that I rip myself, usually, with max bitrate at around 320.

    And I can't tell the difference between that and the CD on my $3000 sound system...

    On 128kbps ones, sure. But 192 and higher, most people really _can't_ tell.
    The only people who say they can tell are, as far as I see it, aloof audiophiles who are scared shitless that they aren't really getting their money's worth. These are the same people that cling to vinyl and say it's better than a CD.

    Sorry, no. Sure, it might theoretically have a higher frequency response, but the human ear can only hear from 20 to 20000 Hz, which requires precisely CD quality sound. CD also allows for a little lower than that for big ol' subs.

    So there's no point in going up to 100kHz audible range, because NOBODY EXCEPT MY CAT AND THE NEIGHBOUR'S YAPPING LITTLE DOG are going to get any advantage from it.

    And my cat doesn't like Paul van Dyk anyways.
  • The reason that people tend to think that mp3 is 'cd quality' is due to two things.

    1) The majority of mp3 aficionados are listening to pop music (or former pop music). To get the effect of this music, a really high quality setup is not required (many times because they have only listened to the CDs on crappy headphones or a mediocre stereo). So to them, the mp3 sounds perhaps slightly different, but not really any worse, and completely listenable. I know the first time I heard Pink Floyd: The Wall on a *nice* stereo (hand built & tuned amp, nice speakers..).. I was amazed. I could not *BELIEVE* what I heard.

    2) People use 'cd quality' to mean 'acceptable quality for me to listen to without pops and hisses'. You are entirely correct. They don't mean it's the same quality as a CD recording; they mean they couldn't care less whether it is or not.

    And if any of them really loved symphony... you'd see that difference right away.
  • Regarding 'hardly any loss to whom'- it would be more accurate to ask, throwing away >20K is hardly any loss to _what_?
    • Human voice sibilance is maybe 6% over 20K.
    • A cymbal crash is as much as _40%_ over 20K.
    • Keys jingling is _60%_ over 20K.
    I (big surprise ;) ) agree with Paul here: I am surprised that people will claim that throwing away more than half the sound information from a sound will be hardly any loss! This is not some piddling little 1% frequency content we're talking about here- plenty of very common sounds both in real life and in music have as much over-20K content as under, and some are _mostly_ over 20K. That's scientifically proven, and an interesting study it was too- thanks to the slashdotter who posted it :)
  • Note: the harmonic distortion content to the 918.75hz sine wave is not going to be continuous. It is going to be rapidly cyclical as it's the product of interaction with the sampling rate- each harmonic's strength will oscillate between zero and whatever the maximum is (a fraction of a percent?) depending on the phase of the tone relative to the sampling rate. This effect will be considerably more obvious at higher frequencies but will still be present at 918hz. The harmonic distortion is never a continuous amount, but is invariably a rapid cycling between zero and the maximum amount.
  • The point was that as long as ones end up as ones and zeros end up as zeros on the other side of the cable, there is no need for a better cable. The example of CD pickups and cabling is excellent. There is absolutely no point in putting a better cable or a better pickup mechanism into a CD player if the ordinary one achieves bit-error rates that the error correction can handle. C't, a German computer and technology magazine, tested how well audio CDs are read by various CD-ROM drives. The drives which were not susceptible to frame jitter read the discs flawlessly, meaning 0 errors. That would not be improved by throwing more money in the form of more expensive cables at it. A perfect result cannot be improved. It is this fact which some audiophiles fail to acknowledge because they are so used to their world, where there is no such thing as a perfect result.
  • For reference:

    Site A [euphoria.org]
    Site B: http://www.eeap.aston.ac.uk/teltec/tutorials/Digital%20Baseband%20Transmission/Slides/nyquist%20theorem.htm
    Site C [aol.com]

    Ok, I thought I had already explained the problem with Site A but here it goes...

    First see where he says "the sampling rate here is below the Nyquist frequency" with respect to the first picture? He is wrong. Count the samples from left to right. There are seven samples in the first three periods of the sine wave.

    Let's say that each vertical line is 1us. The sampling frequency is 1/(10^-6 s) = 1MHz. The frequency of the sine wave is 3/(7*10^-6 s) = 428.571kHz, which is below half the sampling frequency (500kHz), so the sampling rate is not too low.
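
    As a quick check of that arithmetic, here is a small sketch. It assumes, as in the paragraph above, that each grid line is 1 microsecond and that seven samples cover three periods; the variable names are only illustrative.

```python
# Assumptions taken from the post: one sample per 1 microsecond grid line,
# and seven samples covering three periods of the sine wave.
sample_period_s = 1e-6
samples_per_three_periods = 7

sampling_rate_hz = 1 / sample_period_s
sine_freq_hz = 3 / (samples_per_three_periods * sample_period_s)
half_sampling_rate_hz = sampling_rate_hz / 2

print(f"sampling rate:  {sampling_rate_hz / 1e3:.3f} kHz")
print(f"sine frequency: {sine_freq_hz / 1e3:.3f} kHz")
print(f"sine is below half the sampling rate: {sine_freq_hz < half_sampling_rate_hz}")
```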

    Ok now that you realize he has no idea what he's talking about, look at his bottom picture. He actually thinks that the waveform produced after reconstruction is that jagged edged horrid looking thing. That waveform has tons of high frequency components. He needs to understand that #1, those samples are not connect-the-dots, and #2, whatever waveform you reproduce must be low-pass filtered, so those jagged edges should never be there. You'll find that the result is exactly the same sine wave you sampled.

    Next he explains the waveform he produced as "aliasing", about which he is again completely wrong. Aliasing is the result of a signal being sampled at too low a frequency but it has NOTHING to do with his web page. If you look at the corrected URL for Site B (I added a reply to my post because /. added an extra space because the URL was too long) you can see where the bell-shaped things are overlapping. That is aliasing. Did you actually believe what you read about Site A after my first comment?

    Now to Site C! First of all, Nyquist's theorem states that the sampling frequency must be greater than twice the highest frequency in the signal (look it up!). So your argument about the samples landing exactly on all the zero-crossings is irrelevant and incorrect.

    Now in the actual site, that first picture is misleading because the waveform on the left is not a sinusoid. It is just something he hacked together in a paint program. It doesn't have the right slopes. The waveform on the right does look like a sinusoid. Now where I said this guy was thinking in the time-domain instead of the frequency-domain... when he said "sampling long enough doesn't work for real live video" he is misunderstanding that even the most complex waveform can be made periodic by simply stating that the period is infinitely long. It's called the Fourier Transform (as opposed to the Fourier Series). You consequently get infinitely many frequencies, but if you cut off the highest ones you are getting virtually the same waveform, since the high frequency components are so small.

    His "good news" section is even worse. He draws pictures of the sampled waveform in two ways. In the middle he connects the dots and makes the same error as Site A. He doesn't low-pass filter the waveform. Oh, maybe I should explain what low-pass filtering is... if you go to Site B, low-pass filtering throws away all those bell shaped things except the one centered at 0, the middle one. If he low-pass filtered the middle or right waveform he would get EXACTLY the same waveform as the one on the left. There was another poster somewhere in this thread that actually did exactly what we are talking about by sampling a waveform, reproducing it with "sample-and-hold" and then low-pass filtering. You'll be surprised at the results he got (I was not).

    More samples per cycle makes it easier to reproduce the waveform originally sampled, because it is easier to design a low-pass filter when the space between those bell-shaped things (again, go to Site B) is larger.

  • Here's an additional point on your side- cables have a quality known as capacitance (also inductance but let's focus on the capacitance). Using a bad cable with your high powered amplifiers and speakers could lead to a situation where the cable is interacting with the tweeters and producing a dip in the impedance in the high frequencies. The result is acutely unpleasant grating highs as the amplifier's presented with a nasty impedance dip, possibly in a range where it's at risk of oscillating.

    Cables are _hugely_ important. It's not because of magic- it's because of capacitance, resistance, inductance and the way amplifiers interact with these qualities. You could easily have an amplifier and speaker combination where, with one set of cables, all was well, but with another set the amplifier would oscillate to feedback blowing up the amp! That is about as real as you could ask for. Normal cables are less prone to be _that_ bad, but there's still a great deal of difference in different cables. Again, it's not primarily the cable itself but the way it interacts with the transducer and especially the amplifier.

    I think in order to understand this properly, you'd kind of have to have enough technical background to know that speakers' impedance drops sharply at the driver's resonance... without granting some facts like that, it's impossible to even begin to explain to otherwise smart people what's happening. 'But I put a voltage through and it was there on the other side of the wire!' is not evidence that a wire will perform under demanding dynamic conditions- or hooked up to a borderline-unstable amplifier. And competitive high end car audio is very much about borderline-unstable amplifiers due to the very low impedances :) interestingly, stadium sound reinforcement uses similar principles. You might have a stadium-sized PA with an amplifier that puts out one watt... and God knows how many amps, through a speaker network that is 0.00000000001 ohms. The rules change when you start to deal with jobs that big...

  • by anonymous cowerd ( 73221 ) on Sunday October 15, 2000 @08:54AM (#705354) Homepage

    So here's the ugly truth. The MP3 revolution seems to have proven that most people have tin ears...

    No, what the MP3 revolution shows is that almost all music fans value the content more than the technical reproduction. In the same way I'd prefer a worn paperback Nabokov to a shiny-new hardbound Stephen King, I'd rather, by a factor of a thousand at least, listen to a scratchy cassette copy of "Tim" on my Walkman than the latest Brittney Spears (sp.?) blasting out through the best stereo system you've ever seen or heard in your life. If you're not a Replacements fan, or if you are, God help you, a B.S. fan, go ahead and substitute in the above sentence the names of your faves ad lib.

    Yours WDK - WKiernan@concentric.net

    You're my favorite thing!

  • A square wave of 22050 can NOT be reproduced by even the most ideal audio equipment.

    In fact, all the energy in the cosmos can't reproduce any square wave perfectly, because the leading edge of an ideal square wave is perfectly vertical, requiring an infinite acceleration at the speaker cone.

    Fussily yours WDK - WKiernan@concentric.net

  • Let me ask a question from a coder's standpoint: how do I write code to perform operations on a high-frequency 1-bit sample? I can write code to lowpass, highpass, bass boost, mix, compress... 16-bit audio easily. But what happens when I have to write my .S3M player to do these effects on a 1-bit sample? Even if the format is technically superior, I think it may be too hard for the amateur to work with for it to be useful.
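
    One possible answer, offered only as a sketch: turn the 1-bit stream back into ordinary multi-bit samples by low-pass filtering and decimating, then apply the usual effects to those samples. The block average below is a deliberately crude low-pass filter, and the function names and the test bitstream are hypothetical, not part of any real player or of Sony's format.

```python
def decimate_one_bit(bits, factor):
    """Average each block of `factor` one-bit samples into one value in [-1, 1].

    A block average is a very crude low-pass filter; real converters use far
    better filters, so treat this purely as an illustration.
    """
    pcm = []
    for start in range(0, len(bits) - factor + 1, factor):
        ones = sum(bits[start:start + factor])
        pcm.append(2.0 * ones / factor - 1.0)
    return pcm


def apply_gain(pcm, amount):
    """An ordinary multi-bit effect: scale and clip to [-1, 1]."""
    return [max(-1.0, min(1.0, s * amount)) for s in pcm]


# Hypothetical bitstream: 64 bits that are mostly ones, then 64 mostly zeros.
bits = [1, 1, 1, 0] * 16 + [0, 0, 0, 1] * 16
pcm = decimate_one_bit(bits, 64)
print(pcm)                   # two multi-bit samples
print(apply_gain(pcm, 0.5))  # the same samples at half volume
```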
  • Maybe you should read that DSP book (by the way Oppenheim & Willsky is not a DSP book, it is a simple signals/systems book... a very common undergraduate electrical engineering textbook). Ok, this may be counter-intuitive because you obviously have had no formal education in signal processing but that triangle wave you drew actually has many, many frequencies higher than the frequency that you are talking about. How can that be? If you low-pass filter the triangle wave (basically it means smoothing out all the edges) what do you get? You get a sine wave exactly like the one you first drew. I suggest you learn a little more about Fourier analysis before posting about things like this.
  • by jbf ( 30261 ) on Saturday October 14, 2000 @07:48PM (#705359)
    More frequency range isn't going to be recorded, played, or heard by anyone.

    First of all, things above 22kHz aren't picked up by ordinary mics... Even the ultra-high-end Neumann U87Ai only claims 20-20kHz frequency response (http://www.neumann.com/mics/u87ai.htm)

    Secondly, most speakers won't crank out those high frequencies without a severe falloff in response: the high-end Genelec 1038A triamped monitor gets you 33-20k Hz (-3dB). (http://www.genelec.com/products/1038a/1038a.htm)

    Finally, most people can't hear above 20kHz, especially those people who are incessantly blasting their ears out with loud music.

    The best reason for Super CD (or DVD or whatever) is higher bit depth, NOT higher sampling rate; going from 16/44.1 (CD quality) to 24/44.1 takes just 50% more space, for nontrivially better quality, while going from 16/44.1 to 16/88.2 brings minimal benefit at a 100% space penalty.
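
    The space figures in that last paragraph can be checked with a few lines. This is only a sketch of uncompressed stereo PCM data rates; the function name is illustrative.

```python
def bytes_per_second(bits_per_sample, sample_rate_hz, channels=2):
    """Uncompressed PCM data rate in bytes per second."""
    return bits_per_sample * sample_rate_hz * channels / 8


cd = bytes_per_second(16, 44_100)
deeper = bytes_per_second(24, 44_100)   # more bits per sample
faster = bytes_per_second(16, 88_200)   # more samples per second

print(f"16/44.1: {cd:.0f} bytes/s")
print(f"24/44.1: {deeper:.0f} bytes/s ({(deeper / cd - 1) * 100:.0f}% more)")
print(f"16/88.2: {faster:.0f} bytes/s ({(faster / cd - 1) * 100:.0f}% more)")
```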
  • Nyquist's theorem does not imply, however, that the representation of the maximum [or near maximum] frequencies will be highly accurate as far as the shape of the wave form is concerned.

    That's incorrect. If the signal is sampled at twice its highest frequency, the signal can be reconstructed exactly. This assumes that the samples are recorded precisely without quantization, and that the signal is truly bandlimited.

    This is why higher sampling frequencies ARE relevant to higher audio fidelity. Higher bit resolutions are arguable though...

    No. Higher sampling frequencies allow you to get away with fewer bits per sample, and this usually simplifies the electronics. E.g., with delta-sigma modulation, the signal is sampled with 1 bit per sample at a very high sampling rate. The bit essentially encodes the change between successive samples, i.e. an increase or decrease, and if the sampling rate is high enough, the original signal can be reconstructed from this information fairly accurately.
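
    To make the 1-bit idea concrete, here is a minimal sketch of a first-order delta-sigma loop, assuming an input that stays well within ±1. An integrator accumulates the difference between the input and the previous 1-bit output, and the sign of the integrator becomes the next bit. The function names are illustrative; real converters use higher-order loops and proper decimation filters rather than a moving average.

```python
import math


def delta_sigma_modulate(samples):
    """First-order delta-sigma: emit +1.0 or -1.0 so the accumulated error stays bounded."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus previous output
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


def moving_average(values, width):
    """Crude low-pass filter: mean of each run of `width` consecutive values."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= width:
            total -= values[i - width]
        if i >= width - 1:
            out.append(total / width)
    return out


# A slow sine, heavily oversampled (2048 samples per cycle), at half of full scale.
signal = [0.5 * math.sin(2 * math.pi * n / 2048) for n in range(4096)]
bits = delta_sigma_modulate(signal)

recovered = moving_average(bits, 64)
reference = moving_average(signal, 64)
worst = max(abs(a - b) for a, b in zip(recovered, reference))
print(f"worst-case difference after averaging: {worst:.3f}")
```

    Averaging over a longer window (or oversampling more) shrinks the difference further, which is the trade of sampling rate for bits per sample described above.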

  • Whatever you may think of Sony's conduct in general, this particular product is entirely consistent with the open-source philosophy that we all cherish. From the article:

    The company was faced with archiving some 300,000 pre-1960 analog recordings. Unlike fine wine, analog does not age well with time; after about thirty years, recording tape becomes brittle and disintegrates. Preserving these analog masters by creating digital copies is of paramount importance. But the limitations of digital sound proved a hindrance.

    Sony had an itch (archiving their own recordings), scratched it, and produced a valuable product as a side-effect. We should support them for it.
  • It's not about preserving audible signal, it's about being able to design antialiasing filters that don't chop off audible frequencies and introduce a lot of group delay/phase problems in order to eliminate all frequencies above 1/2 the sampling rate. Higher sampling rate = maneuvering room.

    The marketing folks, however, think the general public would rather hear about preserving those ultra-cool bitchen super high frequencies. And they are right.
  • Actually, DVD Audio discs won't play on existing DVD Video players. DVD-A was slated to use a variant of CSS ("CSS-2") for copy protection, but DeCSS put the kibosh on that. DVD-Audio as now shipping uses a new and supposedly improved encryption scheme (but still nonstandard and closed, so it's probably crackable). Think RIAA's having a fit now that we can make perfect rips of 44.1k/16-bit CDs? Hah! That's nothing :)

    Anyway, most DVD-Audio players coming out now are also DVD-Video (thank god), but older DVD-V (and computer DVD-ROM drives) won't be able to read DVD-A discs.
  • Okay okay....

    When we say "You can reproduce frequencies of up to 22050 Hz", we are talking about sine waves. The "square wave of 22050 Hz" that you are talking about is pretty much irrelevant. A 22050 square wave contains frequencies that are *much* higher than 22kHz. Most of these frequencies would be filtered out prior to any kind of processing (digital or analog).

    So in a sense you're confusing the discussion. A square wave, in terms of the signal processing we're talking about, is not a pure "wave" any more than is the sound of someone coughing up a lung. You must first decompose the sound into its component frequencies before you can discuss signal processing and make any sense.

    Trying to think in terms of a square wave, at all, puts you in the wrong position to think about this (unless you're talking about the Haar wavelet but that is a different subject entirely).
  • by knarf ( 34928 ) on Saturday October 14, 2000 @08:06PM (#705382)
    Well, another day, another media format. Of course, the media companies will happily sell me their products. But I already have Radiohead's 'OK Computer' on CD, so I already paid the license fees. I want to 'upgrade' that CD to the format-du-jour, and am willing to pay the production costs and a little something to make it worthwhile for the industry to keep on developing new products. I do NOT want to pay royalties again, since I already did. And since I have always been told that those compact discs are so expensive because of the license fees, this upgrade should be quite cheap, am I right? I mean, I only OWN the piece of plastic, which is cheap. It is the license fee which drives up the price (or so 'they' say). So, just let me upgrade my piece of plastic then...

    No, unfortunately I am wrong. But I should be right...
  • In theory......

    Truth is, one of the unfortunate byproducts of digital sampling is something called 'aliasing' of frequencies above fN. Think of it this way: a sound byte is sampled at 44100 Hz. The Nyquist is 22050 Hz. A 21000 Hz sine wave will be reproduced all right, but another 24000 Hz sine wave (22050 + 150 Hz) will show up as 21000 (22050 - 150 Hz) in the digital sample upon playback. So when music is digitized, some fancy low-pass filters are required on the input to prevent this aliasing effect.

    Now, one of the problems with filters is that 'perfect' filters are somewhat impossible. Even good low-pass filters which must exhibit near -20dB at fN, roll-off about -3dB at fN-10%.

    So the whole point of this is that frequencies near the Nyquist must have already been attenuated or discolorization of frequencies above 12kHz or so will occur.

    That's why I swear by good vinyl. A good needle/cartridge and amp can give impressive 30kHz results! Ironically, most people can't really hear this anyway, but most audiophiles will agree it makes for a crisp, full sound that sounds dynamic, as opposed to digital, which sounds cold or flat in comparison. Not that records have the best signal/noise ratio stats ;-)

  • In short, no. Nothing ends up that way.
    Others make MD players. The one thing that kept MD players from saturating the market was price... too expensive at first. More reasonable now, but not flexible enough given the current age. (I have a player, but never use it; I don't like not having direct access to copy my MDs)

    How can one blame Sony for this? Did they *hurt* you by inventing a new technology and then making it expensive? No.. you didn't have it in the first place.

    I mean, I hate corporatism (to coin katz) (I can't believe I did that) as much as anyone, but Sony is just not one of the companies that I think of as 'evil'.
    I think some of their stuff is too expensive.. but that's their loss.

    I'm amazed how so many Americans find the fact that Napster is being sued so outrageous... hell, you Americans sue each other like there's no tomorrow!

    What do you mean 'once again reinventing'? How is this bad? So they come out with a new CD format that's proprietary and must be licensed from Sony.. how is this going to hurt you?
  • Get yourself a pair of really good speakers and amplifier and *then* worry about the quality of your CD player.

    EXACTLY.

    Then again, if you are even *considering* purchasing a $1500 CD player (SACD player) you better have some damn good speakers and an amplifier :-)

    Spyky
  • by nathanh ( 1214 ) on Saturday October 14, 2000 @08:16PM (#705396) Homepage
    Nyquist's theorem states that the highest frequency that can be represented is one half the sampling rate. This is obvious because you must be able to detect at least a peak and a valley of the sound wave.

    Entirely correct.

    Nyquist's theorem does not imply, however, that the representation of the maximum [or near maximum] frequencies will be highly accurate as far as the shape of the wave form is concerned. At and around 1/2 sampling frequency, the wave forms become basically nothing but square waves [alternating between a single high, and a single low point]. In order to deal with this, some sound decoders will attempt to interpolate the waves, but they cannot reproduce the original sound accurately. This is why higher sampling frequencies ARE relevant to higher audio fidelity. Higher bit resolutions are arguable though..

    You fail! This idea that the signal is not perfectly represented just because you have only two sample points is complete nonsense. Only two sample points are needed because you know the encoded signal must have been low-pass filtered at half the sampling rate before sampling (otherwise you would have introduced aliasing errors). Given this information you can entirely reproduce the original signal as it was before sampling. Nyquist's theorem states that you can exactly reproduce the signal if sampled at twice the signal's maximum frequency. I quote Oppenheim and Willsky:

    The sampling theorem establishes the fact that a bandlimited signal is uniquely represented by its samples.

    In layman's terms: you don't need more bits to reproduce the original signal. You just need a perfect low-pass filter on your output and infinite precision on your PCM samples. A sine wave with sampling points at the exact peaks and troughs will produce a square wave of the same frequency after sampling/modulation. This square wave will contain the frequency you want plus odd harmonics. The harmonics are naturally going to be higher frequencies and so they will be removed by an appropriately picked low-pass filter. And what's the appropriate cut-off frequency for your low-pass filter? 1/2 the sampling rate, of course. The result is the original sine wave.
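
    A rough way to see this numerically, as a sketch only: the ideal low-pass filter corresponds to sinc interpolation of the samples (the Whittaker-Shannon formula). The code below samples a 20 kHz sine at 44.1 kHz, so there are barely more than two samples per cycle, and then evaluates the interpolation between sample instants. The sum is truncated to 2001 samples, so a small residual error is expected; the names are illustrative.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t):
    """Whittaker-Shannon interpolation at fractional sample index t (finite sum)."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


FS = 44_100   # sampling rate in Hz
F = 20_000    # tone frequency in Hz, just under FS / 2

samples = [math.sin(2 * math.pi * F * n / FS) for n in range(2001)]

# Evaluate between sample instants, near the middle of the record.
for t in (1000.25, 1000.5, 1000.75):
    exact = math.sin(2 * math.pi * F * t / FS)
    approx = reconstruct(samples, t)
    print(f"t={t}: reconstructed={approx:+.4f} exact={exact:+.4f}")
```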

    Now in practise they actually do sample at higher than the low-pass cut-off frequency, but this is because of other limitations. The PCM samples are only 16-bit, not infinite precision. Also there is no such thing as an ideal low-pass filter: realistic (and affordable) filters will take several kHz to drop from 0dB to -9dB. Also you need exactly π/2 phase difference between your sampling pulse train and the source signal. There are also aliasing issues but at this point the discussion gets heavily into mathematics.

    Higher resolution is what is actually needed but this is expensive to achieve. Increasing the sampling rate is far more practical (considering how fast CPUs are) and a heck of a lot cheaper. This is the real reason DVD audio samples at 96kHz. It's not because you can hear 48kHz tones but because it lets the DVD manufacturers use cheap DACs and cheap low-pass filters without sacrificing fidelity.

  • by nathanh ( 1214 ) on Saturday October 14, 2000 @08:59PM (#705409) Homepage
    Truth is, one of the unfortunate byproducts of digital sampling is something called 'aliasing' of frequencies above fN. Think of it this way: a sound byte is sampled at 44100 Hz. The Nyquist is 22050 Hz. A 21000 Hz sine wave will be reproduced all right, but another 24000 Hz sine wave (22050 + 150 Hz) will show up as 21000 (22050 - 150 Hz) in the digital sample upon playback.

    Wrong. It folds back around the sampling rate and will alias to 44100 - 24000 = 20100Hz.
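
    Where a tone above half the sampling rate lands can be checked numerically. The sketch below assumes the usual folding rule (the alias is the distance from the tone to the nearest multiple of the sampling rate) and then confirms it sample by sample; the function name is illustrative.

```python
import math


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency in [0, fs/2] that a sampled tone is indistinguishable from."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


FS = 44_100
print(alias_frequency(24_000, FS))  # 20100

# Sample-by-sample check: a 24000 Hz sine and an inverted 20100 Hz sine
# produce the same sample values at 44100 Hz, so their sum is (nearly) zero.
max_diff = max(
    abs(math.sin(2 * math.pi * 24_000 * n / FS)
        + math.sin(2 * math.pi * 20_100 * n / FS))
    for n in range(1000)
)
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```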

    So when music is digitized, some fancy low-pass filters are required on the input to prevent this aliasing effect.

    This is correct. Low-pass filters are used before sampling to prevent aliasing.

    Now, one of the problems with filters is that 'perfect' filters are somewhat impossible. Even good low-pass filters which must exhibit near -20dB at fN, roll-off about -3dB at fN-10%.

    Also correct.

    So the whole point of this is that frequencies near the Nyquist must have already been attenuated or discolorization of frequencies above 12kHz or so will occur.

    And suddenly wrong again. You have a source signal x(t) which is comprised of nothing but sine waves. This is Fourier's discovery. You low-pass the signal at 20kHz. You now only have sine waves with frequencies under 20kHz.

    The discolouration occurs only above 20kHz. But, we know that most people can't hear those frequencies anyway, and certainly most people's speakers have trouble doing better than -3dB at anything over 18kHz anyway. So it's hardly any loss at all to throw away the sine waves over 20kHz. You Won't Hear Them Anyway.

    That's why I swear by good vinyl. A good needle/cartridge and amp can give impressive 30kHz results! Ironically, most people can't really hear this anyway, but most audiophiles will agree it makes for a crisp, full sound that sounds dynamic, as opposed to digital, which sounds cold or flat in comparison. Not that records have the best signal/noise ratio stats ;-)

    Most audiophiles are full of shit. The weight of a physical needle restricts its dynamics to well under 18kHz and the granularity of vinyl produces a maximum SNR resolution of 70dB. Laser tracking record players on platinum originals I'll wager would be better than CD, but people who prefer vinyl are fooling themselves.

  • by bbrantley ( 168290 ) on Saturday October 14, 2000 @09:06PM (#705411)
    First of all, things above 22kHz aren't picked up by ordinary mics... Even the ultra-high-end Neumann U87Ai only claims 20-20kHz frequency response (http://www.neumann.com/mics/u87ai.htm)

    Far from true. The mikes used in this paper, "There's life above 20 KHz!" [caltech.edu], certainly were capable of this.

    Secondly, most speakers won't crank out those high frequencies without a severe falloff in response: the high-end Genelec 1038A triamped monitor gets you 33-20k Hz (-3dB). (http://www.genelec.com/products/1038a/1038a.htm)

    Also not true. Unless there is a low-pass filter to prevent sending higher-frequency signal to the tweeters, most amplifiers, speaker wire, and drivers will gladly play sounds upwards of 100KHz. Whether it is necessarily FLAT is another story, as most people don't optimize (or even measure) flatness above 20k.

    The best reason for Super CD (or DVD or whatever) is higher bit depth, NOT higher sampling rate; going from 16/44.1 (CD quality) to 24/44.1 takes just 50% more space, for nontrivially better quality, while going from 16/44.1 to 16/88.2 brings minimal benefit at a 100% space penalty.

    This is probably true, except that "minimal" may be too harsh a term. Have YOU ever done a careful comparison between a 16/44.1 recording and a 16/88.2 recording? (I have!) On a somewhat-related note, it is remarkably interesting what effect a more accurate clock signal has on the quality of a 44.1KHz recording. The human ear can distinguish playback when the timing of these samples being played back varies by as little as 10^-10 seconds!

    The reality is that the human ear's ability to differentiate is remarkably more subtle and complex than the market (and marketeers) would have you believe.

  • by mcg1969 ( 237263 ) on Saturday October 14, 2000 @09:16PM (#705414)
    The SACD has a sampling rate of 2.82 MHz. This means that theoretically you could accurately (AD/DA converters and such aside) reproduce the frequencies up to 1.41 MHz (1410000 Hz).

    Sorry, SACD is a lot more complex than a simple application of Nyquist can handle. The key to SACD's high fidelity is all in the quantization theory.

    Yes, an SACD has a sample rate of 2.82MHz, but that's with one bit per sample (per channel). Yep, that's right---a single bit per sample. In fact, the signal-to-noise ratio on a SACD is very likely negative--there is more noise than signal.
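
    For a sense of scale, a small sketch comparing raw per-channel bit rates. It assumes the "2.82MHz" figure is 64 times the CD rate (2,822,400 Hz); the constant names are illustrative.

```python
CD_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16

# Assumption: the 2.82 MHz rate quoted above is 64 x 44.1 kHz.
DSD_RATE_HZ = 64 * CD_RATE_HZ
DSD_BITS_PER_SAMPLE = 1

cd_bits_per_second = CD_RATE_HZ * CD_BITS_PER_SAMPLE
dsd_bits_per_second = DSD_RATE_HZ * DSD_BITS_PER_SAMPLE

print(f"CD:  {cd_bits_per_second} bits/s per channel")
print(f"DSD: {dsd_bits_per_second} bits/s per channel")
print(f"ratio: {dsd_bits_per_second / cd_bits_per_second:.1f}x")
```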

    Now before you blow your top with how absurd that sounds, let's clarify one thing: the SACD format jumps through serious technical hoops to ensure that the vast majority of that noise is in the completely inaudible range. And, the vast majority of the signal is, of course, within the audible range. The technique is, not surprisingly, called "noise shaping".

    So once you limit your measurements to, say, 0-20kHz, you're back to where you would hope: the astronomical dynamic range and signal-to-noise ratio of a high-fidelity audio format. (In fact, SACD is designed to provide ultra-low noise and 120dB of dynamic range all the way out to 100kHz, from what I understand.)

    For those of you who remember, or perhaps own, CD players with "1-bit D/A"s, you're using a similar version of this technology. The difference is that the SACD recording process can decide at the mastering stage how to get down to 1 bit per sample, and that's a much better place to make that decision.

  • by Chris Johnson ( 580 ) on Saturday October 14, 2000 @09:34PM (#705431) Homepage Journal
    I _hope_ this becomes common technology- there are some extraordinarily important things about it. A little background:
    • 16 bits isn't enough. That's _really_ obvious at this point- no professional works in 16 bits except for the final CD output. Mix busses have to be many times that in the digital domain, but even if you mix with ideal noiseless coloration-less electronics there's a really big difference between monitoring an undigitised feed of the signal and monitoring the 16 bit output.
    • 44.1K isn't enough either. This is not primarily due to people being able to hear beyond 20K (though you can sense such sounds to some extent- why do you think smashing glass or dropped plates make you jump? Viciously loud supersonic transients), it's due to the brick-wall filters required. High end amplifier designers go to great lengths to get their pass-bands up into the megahertz (and nobody claims humans hear that!) because cutting off lower causes interactions across the entire frequency band. Cutting off at 22K is just ridiculous.
    Now, how does the Sony approach compare? The neat thing about the bit depth is that it's effectively infinite- it's not a finite set of voltage levels but just one bit very fast tracing a voltage level that could be anywhere. This is substantially beyond even 24 bit- a major, major advance. That's gonna be very noticeable.

    As for frequency, there is a surprise in store here. It may or may not be competitive with advanced PCM encoding at say 96K- but two very, very important points:

    • There doesn't need to be _any_ brickwall filter on the output- provided a circuit can be made to output this stuff that doesn't merely calculate it as a super-PCM-encoding and D/A converter. If the format can feed a sort of very high frequency analog synthesiser, no filter is needed- which is critical, because...
    • ...the potential slew rate of this technology is just astronomical. I hope the power supply of the players is up to it- if not there will be some very effective power supply mods waiting to be done, such as backing up the power supply with MIT Multicaps (a film cap that can produce very very high instantaneous voltage). Basically, if you fed this technology a big square wave, it might not be able to turn the corners of the wave instantly, but the vertical parts of the wave would be _vertical_- no brick-wall-filtered system can get anywhere close to this.
    We're talking absurdly high transient peak voltages here: this is why high end audiophiles use absurdly heavy cables and absurdly powerful amplifiers, to let those peaks through. It doesn't hurt the speakers: this isn't RMS or even 'peak' wattage, the spikes are of such short duration that you can feed speakers many times the maximum 'peak' voltage if it's only for a microsecond, and high end systems do just that.

    Where do you find such peaks? Easy- The Who ;) seriously, The Who is a _good_ example, but symphony orchestras are also good for this. The capacity for this type of extreme and essentially 'inaudible' (too brief!) transient translates to the ability to produce the _sensation_ of loudness- for instance, you could easily make many systems play 'Live At Leeds' and sound loud and bright and kind of grating and ear-splitting, but with this technology it would be less grating but more _electrifying_ and the impact would be like having the living people right there playing at you, not just a bunch of very loud sounds. Alternately, you could play big orchestra crescendos and the resulting sound would be _huge_, not just loud but as big as a live performance.

    It's really not hard to make stuff sound 'loud', but making it _feel_ loud is something else. If you don't have that, the loudness ends up being just a grating, thin surface, which is actually a very good description of the sound of most pop recordings these days :) the irony is that this technology is coming around just when the recording industry's pushing sounds that are substantially worse than even CD audio can produce...

    Bottom line: I want one. Specifically, I want this to _master_ to. I have quite a bit of stuff that loses about 2/3 of its potential when made into 44/16 (eight tracks of 48/20 output analog and mixed with passive resistance mixing will tend to do that- I once figured the rough equivalent resolution was about a 64 bit mix bus, possibly higher) Maybe I should try to wheedle Sony out of a recorder ;)

  • by nathanh ( 1214 ) on Saturday October 14, 2000 @08:49PM (#705435) Homepage
    I think the misunderstanding is on your part.

    No, the previous poster is entirely correct.

    A square wave of 22050 can NOT be reproduced by even the most ideal audio equipment.

    Sure. This is because a square wave at 22050 Hz contains frequencies above 22050 Hz.

    A square wave contains a single sine wave at the square wave frequency and many harmonics at odd integer multiples of that frequency. When you sample a square wave you're actually sampling AN INFINITE number of sine waves. These harmonics are well beyond the ability of your ear to hear.

    Now at least some of the harmonics of a square wave will have a frequency higher than half your sampling rate. So if you naively sampled a 22050 Hz square wave at 44100 Hz, of course you would corrupt the signal. This is why a recording studio will low-pass the source before sampling at approximately 22050 Hz. It removes the harmonics you'd never hear anyway.
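
    As a sketch of that decomposition: an ideal square wave is its fundamental plus odd harmonics whose amplitudes fall off as 1/k. The code below lists the first few components of a 22050 Hz square wave and marks which ones survive a low-pass at half of a 44100 Hz sampling rate. Amplitudes are relative to the fundamental, and the function name is illustrative.

```python
def square_wave_harmonics(fundamental_hz, count):
    """First `count` Fourier components of an ideal square wave.

    Returns (frequency, amplitude relative to the fundamental) pairs;
    only odd multiples k = 1, 3, 5, ... are present, each with amplitude 1/k.
    """
    return [(k * fundamental_hz, 1.0 / k) for k in range(1, 2 * count, 2)]


HALF_SAMPLING_RATE_HZ = 44_100 / 2

for freq, amp in square_wave_harmonics(22_050, 4):
    status = "kept" if freq <= HALF_SAMPLING_RATE_HZ else "removed by the low-pass"
    print(f"{freq:>7} Hz  relative amplitude {amp:.3f}  {status}")
```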

    Close to the Nyquist limit you're basically limited to reproducing sinewaves, however there is no guarantee that the original waveform was a sinewave.

    All signals are composed of sine waves. Nyquist says nothing about square waves because a square wave is just a collection of sine waves. The Nyquist theorem lets you reproduce sine waves perfectly up to half the sampling rate. This nonsense about "but I applied a square wave and Nyquist was wrong" is totally stupid. What you've actually done is sampled several sine waves, most of them at frequencies well above half the sampling rate, then found that they got badly aliased.

    Saying this means you're "limited to reproducing sine waves" is just nonsense. There is no other signal except a sine wave. The Nyquist limit is entirely correct.

    This is why higher sampling rates (and I mean substantially higher, not lame-ass 96kHz) is important for accurate reproduction of the original signal

    For accurate reproduction, yes. But you won't be able to hear that accurate reproduction!

    Let's get something straight: when it's said people can hear up to 22kHz tones, that is a sine wave at 22kHz. If you try and listen to a 22kHz square wave then you will ONLY HEAR the first harmonic at 22kHz. All the other harmonics will be beyond your range of hearing.

    There is no need to sample higher than 96kHz for audio purposes. There's no real need to sample much higher than 50kHz, despite some of the absolute nonsense that people claim. The standard of sampling at 96kHz is absolute overkill for human ears.

  • by Andy Dodd ( 701 ) <atd7NO@SPAMcornell.edu> on Saturday October 14, 2000 @10:00PM (#705455) Homepage
    You can hear the difference between 2-channel and 5.1 surround. You can hear a BIG difference.

    But you can mathematically prove that you can't hear the difference between 44.1 KHz/16 bits and something "better". (Note, as mentioned before, there are some slight exceptions... The mathematics that say 44.1 KHz is enough assume a perfect low-pass filter with no phase-shifting and an infinite slope dropoff. Using a somewhat higher sampling rate allows you to use a less-perfect LPF.) But above 96 KHz, you need not bother... 96 gives PLENTY of headroom for the filter designers.

    So this SACD format is going to die unless Sony pushes it with HEAVY marketing. But DVD-A has more people backing it with cheaper (but better!) equipment.
