Music Media

"Fingerprinting" of Audio Files? 127

Pseudonymous Coward writes: "This could be interesting: 'Tuneprint is an audio fingerprinting algorithm. It takes the unique 'fingerprint' of a sound clip, which can then be compared to a fingerprint database to get more information about the clip, like title and artist, lyrics, URLs, related music, copyright status, or almost anything else. The fingerprint doesn't change even if the sound is compressed, converted to a different file format, broadcast over the radio, and so on.'"
This discussion has been archived. No new comments can be posted.

"Fingerprinting" of Audio Files?

  • by Webmonger ( 24302 ) on Monday August 28, 2000 @01:36AM (#823597) Homepage

    FAQ:

    Fingerprints in the abstract are fundamentally more secure: a properly constructed fingerprint can't be broken without scrambling the audio file, while a sufficiently smart and well-funded adversary can always break a watermark, given enough time.

    If watermarks are steganography, fingerprints are more like hashes or CRCs. If you have a perfect fingerprint, the fingerprint being separate from the song, you'd have to make the song not sound like itself in order to stop it from being recognized.

    Of course, we have yet to see how good Tuneprint is, but it sounds pretty cool. And it wouldn't be hard to build up a database with a bunch of CDs and CDDB.

  • by gschmidt ( 18105 ) on Monday August 28, 2000 @03:37AM (#823598)

    Hiya. My name is Geoff and Tuneprint is my baby which some excellent and astonishing friends at MIT are helping me deliver.

    I'd already been up all night when the story was posted at 7am. I'm going to try to stumble my way through a few points, get some breakfast, and try to answer people's questions as soon as I can get to it.

    First of all, this is not a hoax. Wow, hair triggers :) Yeah, I was sleep deprived whilst writing most of the website. Yeah, the barcode in the logo is '31337 24816'.. get it.. eleet powers of two. eleet two-to-the-n's. eleet two-n's. eleet tunes. yeah. well. you had to be there. and jamie's to blame for the 24816 pun :) Don't hold it against us that we're not suits.

    The general idea is pretty simple. We take the input audio. We condition it (adjust it to a known sampling rate and volume.) We pass it through the psychoacoustic model (it's about a notch more complicated than what you'd see in a mp3 encoder, which ain't saying much. This is all stuff that was mostly hashed out decades ago.) This model effectively strips the parts of the sound you can't hear -- the desired result being that even if the audio has been compressed or manipulated subaudibly, the result is still the same. Okay, so the net result of all of this is a vector that covers a very small segment (fraction of a second) of audio. We stack several of these vectors (possibly separated in time by a bit) side-by-side to get a big vector. Then we do completely boring and standard and well-understood statistical and pattern-matching stuff on the vector to make it smaller and more palatable for the server -- think of it as lossy compression. Then it goes off to the server. The server is about equal in complexity to a text search engine. (I say this fully realizing that I have only a vague impression how Google works. It's certainly a lot more complicated than the obvious hash-table-of-sorted-lists stuff.) It finds the database vector that's the best match in a fairly boring but efficient way. (No, it does not involve searching through all tracks one by one, no more than Altavista searches through all web pages one by one every time you want to find some porn.) Call the result a submatch. Back at the client, the whole process is repeated a bunch more times, generating a stream of submatches ("Radiohead offset 0.. Radiohead offset 1024 or 16384.. Slashdot's Gr34test Hits 5262324.. Radiohead offset 3072..") from the input audio stream. Then, the client looks at the submatches and tries to figure out what the input audio was and where the song boundaries are (did somebody really stick in a sample from Slashdot's Gr34test Hits, or was that just an unlucky match?)
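A rough Python sketch of the pipeline described above may help make the steps concrete. None of this is Tuneprint's actual code: the frame size, the coarse band-energy function standing in for the psychoacoustic model, the brute-force server lookup, and every name below are assumptions made purely for illustration.

```python
import math
from collections import Counter

FRAME = 64   # samples per short segment (assumed)
BANDS = 4    # coarse frequency bands kept per segment (assumed)
STACK = 3    # segments stacked side by side into one big vector (assumed)

def condition(samples):
    """Normalize volume so the peak is 1.0 (resampling is omitted here)."""
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]

def frame_vector(frame):
    """Stand-in for the psychoacoustic model: log energy in a few DFT bands."""
    n = len(frame)
    energies = []
    for band in range(BANDS):
        e = 0.0
        for k in range(1 + band * 4, 1 + (band + 1) * 4):
            re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            e += re * re + im * im
        energies.append(math.log1p(e))
    return energies

def fingerprints(samples):
    """Yield (offset, stacked vector) pairs for a stream of audio."""
    x = condition(samples)
    frames = [frame_vector(x[i:i + FRAME])
              for i in range(0, len(x) - FRAME + 1, FRAME)]
    for i in range(len(frames) - STACK + 1):
        yield i * FRAME, [v for f in frames[i:i + STACK] for v in f]

def distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class Server:
    def __init__(self):
        self.entries = []  # (vector, track name, offset)

    def add_track(self, name, samples):
        for offset, vec in fingerprints(samples):
            self.entries.append((vec, name, offset))

    def submatch(self, vec):
        # Brute force for clarity only; the post says the real server
        # does NOT scan every track one by one.
        _, name, offset = min(self.entries, key=lambda e: distance(e[0], vec))
        return name, offset

def identify(server, samples):
    """Client side: collect submatches and take a majority vote."""
    votes = Counter(server.submatch(vec)[0] for _, vec in fingerprints(samples))
    return votes.most_common(1)[0][0]

def tone(bins, frames=8):
    """Test signal: a sum of sines, one per DFT bin."""
    return [sum(math.sin(2 * math.pi * b * t / FRAME) for b in bins)
            for t in range(FRAME * frames)]

server = Server()
server.add_track("track A", tone([2, 6]))
server.add_track("track B", tone([10, 14]))

# Query: a quieter, slightly perturbed excerpt of track A.
excerpt = tone([2, 6])[128:448]
query = [0.3 * s + 0.01 * math.sin(1.7 * i) for i, s in enumerate(excerpt)]
result = identify(server, query)
print(result)  # track A
```

The majority vote is the simplest possible way to reconcile a stream of submatches; it makes no attempt to find song boundaries, which the post singles out as part of the client's job.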

    See? Not magic. It's a challenging problem, but not an impossible problem. The reason that this doesn't exist right now is not that generations of scientists have tried and failed, but rather that people didn't care too much until lately and nobody's gotten off their ass and done anything about it yet. I like big but approachable problems, which is one of the reasons I'm excited about this.

    FOR ALL OF YOU WHO FELL ASLEEP THROUGH THAT: YOU CANNOT ADD AN INAUDIBLE TONE TO THE MUSIC AND BREAK TUNEPRINT. THE FINGERPRINT IS BASED ON THE LARGE-SCALE PSYCHOACOUSTIC FEATURES OF THE MUSIC. IF MP3 ENCODERS CAN DO IT, SO CAN WE. Maybe not perfectly, but enough to have a fighting chance. THAT'S THE WHOLE POINT HERE.

    jen is telling me to go to breakfast but I want to say one more thing, which is that y'all should also pay attention to the second of our two goals as listed in the FAQ, which is to get this tech and access to a nice, well-maintained central database out into the hands of everybody, commercial and open source, major label and independent, so that people can go do lots of cool stuff with it. I don't want this to end up controlled by a single organization that permits its use only in ways that further its private agenda.

    Hint: I know that there are sekrit batcave startups that are working on the same thing, because we're starting to bump into them.

    Oh yeah. Also like I say in the FAQ, it's not done. No promises. I like the current algorithm; it reflects the wisdom of throwing several other stabs away in disgust. I like the very limited performance data we have. I like the mathematical theory. We haven't scaled it very far yet, though, and it may all come toppling down. In which case we'll pick up the pieces and try again. But I'm confident we'll pull off something cool, because, well, 70% of what we want to do isn't that hard. The other 30% is a bitch and will require cleverness, work, and chutzpah, but even the 70% is going to be a damn useful tool. And this project has started to catch the eyes of some pretty f*cking brilliant technical people, in my opinion, so I think we're all over that 30%.

    breakfast now. more later :)

    geoff

    PS: if you've emailed me in the past few days, and I haven't gotten back to you, I'm sorry -- things are pretty hectic around here. I really hope to burn through the backlog this afternoon before I get to the slashdot stuff. thanks :)

    Ever wonder if you get a nice warning email before you show up on slashdot? the answer would be 'no' :p

  • by Kickasso ( 210195 ) on Monday August 28, 2000 @01:36AM (#823599)
    with watermarks. They are two distinct concepts.

    A fingerprint is an inherent property of a file, much like your own fingerprints are inherent properties of your fingers. Both kinds of fingerprint are used to identify things. A cryptographic hash is a kind of fingerprint. If two files have the same hash they are likely to be identical.

    A watermark is a piece of information artificially added to a file. It is akin to the watermarks on dollar bills. There is one difference though. Digital watermarks are designed for difficulty of removal, while watermarks on money are designed for difficulty of reproduction. Watermarks are used to certify the authenticity of things. A cryptographic signature is a kind of watermark. It can certify that I, not somebody else, signed some file.
    --

  • They just explain to us that this is about CDDB'ing songs (auto identification), however bad their resolution might be.
    I agree this could be cool, though I have not seen many untagged MP3 files.
    But now, if we consider (in this context) that a song is a bit of WAV data of any duration that will be hashed in some way with this system in order to be identified, there could then be another use for this system:
    Couldn't some copyright organization use it to automatically recognize any samples a song contains and then claim royalties in the name of their original creator?
    After all, this is not very different from what the ear actually does while hearing a song, especially when it happens to "recognize" a sample.
    If this is the case, then I believe that sample scramblers might become quite frequent in the future.

    --
  • Darn, now everyone will know that it really wasn't me who wrote "Stairway to Heaven"...

    Sure.. that's a problem, but you're missing the REAL flaw in this tool:

    "By throwing away all of the 'uninteresting' parts of the signal, the software is left with only the characteristics that uniquely identify the track."

    This means that every N'Sync, B2B, and Backstreet Boys song will have the same fingerprint! What a disaster!

    --------------------------------------
  • Blockquoth the poster:
    The RIAA will probably love this, since they can embed fingerprints in all their discs, and then just run around sending cease & desists to anyone who distributes a file that contains a fingerprint that says "copyright RIAA member"
    Actually, this worried me at first. Now, I don't care. You see, this just lets the RIAA (or whoever) verify that the MP3 is a rendition of their song. So what? Until they finally succeed in completely rewriting copyright law, I have the right to make a copy (MP3 or otherwise) of anything I own, for my personal use. So if they find I've got this MP3, they still have to prove I am not entitled to it ... exactly the situation they are in now.

    Remember, current copyright law and current technology allow the RIAA to go after people who download songs to which they have no right. But hitting 20 million users is hard, so the RIAA wants to establish that they can choke the servers, too.

    The major anti-piracy use of this, I suspect, would be to set up "stings": The RIAA posts anonymously a song whose fingerprint includes "I am not a legal copy!" to a site trading in songs. Then anyone found to have the copy can be assumed to have downloaded it or copied it from someone who did... but that sounds more like a watermark than a fingerprint.

  • Sure, fingerprinting won't be used for devious reasons
  • To allow this, Napster (or any other) will have to download the MP3 and compute the fingerprint. Downloading all MP3s which are exchanged via Napster will need a huge bandwidth update.

    It would take a lot more than that! Right now, MP3 files on Napster are traded peer-to-peer, with only the filename/duration/bitrate/MD5sum being transmitted to Napster. If Napster wants to get, say, a 10-second sample of each of the songs on my hard drive through my 56k modem (33.6 maximum uplink), it would take something like 10 s/file * 128kbps (approx) * 1700 files (approx) / 33.6kbps = 64761 s (or about 18 hours). And that's under ideal upload conditions!
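The back-of-the-envelope figure above is easy to check. This short Python snippet just redoes the poster's arithmetic with the same approximate inputs (the variable names are mine):

```python
SAMPLE_SECONDS = 10    # seconds of audio sampled per file
BITRATE_KBPS = 128     # approximate MP3 bitrate
FILES = 1700           # approximate number of files
UPLINK_KBPS = 33.6     # maximum uplink of a 56k modem

total_kbits = SAMPLE_SECONDS * BITRATE_KBPS * FILES
seconds = total_kbits / UPLINK_KBPS
hours = seconds / 3600

print(int(seconds), round(hours, 1))  # 64761 18.0
```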

    And since I don't actually use Napster (I use OpenNap [sourceforge.net]) that makes it even harder. :-)

    Also, what about files that are incomplete? One of my pet peeves about Napster (and OpenNap) right now is that there are still a substantial number of incomplete files out there. I hate downloading a file and discovering that it's cut off halfway through the song. If you fingerprint a file based on a sample, then this does nothing at all to combat the incomplete file problem -- a partial song would potentially have the same fingerprint as a complete song.

    Another similar point: sometimes (rarely) I intentionally omit the first few seconds of a track when I rip from CD, because I don't enjoy sitting through 3-30 seconds of silence. (And I omitted the first couple seconds of Depeche Mode's "I Feel You" because it's a horrible screeching sound, which I assumed was some kind of joke to scare vinyl users.) If you try to fingerprint the first few seconds of a song, then either (a) you end up fingerprinting silence, or (b) you get misleading results if those first seconds are stripped.

    Likewise, if you try to fingerprint, say, from 30s to 40s into a song, then you fail for any song that's less than 30 seconds long.

  • That was the first thing we tried: have a set of 'classifiers', each of which makes a yes-no decision about the spectrum it sees over a short time interval. Hash together the results of all of the classifiers, and poof. The classifiers were built by automatically analyzing a lot of music to find criteria that were stable but widely distributed.

    The problem is wobble (aka fencepost error). What happens if the original, undistorted version of the track classifies as a one with a given classifier, but is really close to flipping over to a zero given just a little push? You lose, that's what, and have to fingerprint both possibilities and put them into the database. And usually there are many different opportunities for wobble if you're checking enough different criteria to build a useful fingerprint. So the next thing to try was only using the foo most 'confident' (unwobbly) classifiers in the hash. But then of course selecting those is wobbly. So you 'debounce' it by having different 'enter set of current classifiers' and 'leave set of current classifiers' criteria. Still too wobbly, and now the code is a complete mess.

    So everything was redesigned to use a different and far simpler and elegant approach. The downside is that we now assume the server is a little intelligent and knows how to do a bit of fuzzymatching. See rant posted elsewhere.
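To make the wobble problem concrete, here is a toy Python illustration. The thresholds and feature values are made up, and MD5 merely stands in for "hash together the results"; the point is only that one borderline classifier is enough to change an exact hash, while a distance comparison barely notices.

```python
import hashlib

THRESHOLDS = [0.50, 0.20, 0.80, 0.35]  # hypothetical classifier cut-offs

def classify(features):
    """One yes/no bit per classifier."""
    return tuple(int(f > t) for f, t in zip(features, THRESHOLDS))

def exact_fingerprint(features):
    bits = "".join(map(str, classify(features)))
    return hashlib.md5(bits.encode()).hexdigest()

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

original = [0.51, 0.10, 0.90, 0.30]   # first feature sits right on the fence
distorted = [0.49, 0.10, 0.90, 0.30]  # after a tiny push, e.g. lossy encoding

same_hash = exact_fingerprint(original) == exact_fingerprint(distorted)
bit_distance = hamming(classify(original), classify(distorted))
print(same_hash, bit_distance)  # False 1
```

An exact-match lookup misses the distorted copy entirely, whereas a server that accepts the nearest entry within a small distance would still find it. That is roughly the trade the redesign makes: a dumber client-side fingerprint in exchange for a server that can fuzzy-match.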


  • I wonder, how important are ALL the bits of a music file, compressed or raw?

    If the fingerprint is nothing more than some sort of hash, then it could easily be defeated by the steganographic trick of manipulating the low-order bits. In a photographic image, this introduces such minimal noise that it's imperceptible to humans, and it gives you a place to hide your own data (i.e., watermarks).

    Could this method be used to beat the fingerprinting algorithm? Or would it introduce perceptible noise to the recording?

    Of course, if the fingerprint isn't a hash, and makes use of this trick in the first place, it's really nothing to pull out the fingerprint.
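The question can be tried on a toy signal. In the Python sketch below, flipping the lowest bit of every 16-bit sample changes a cryptographic hash of the data, but leaves a deliberately coarse feature untouched. The "coarse print" here is just a quantized loudness envelope that I made up as a stand-in for any perceptual feature; it is not what Tuneprint computes.

```python
import hashlib
import math

def pcm16(n=2000, freq=440.0, rate=8000):
    """A 440 Hz test tone as signed 16-bit sample values."""
    return [int(12000 * math.sin(2 * math.pi * freq * t / rate)) for t in range(n)]

def to_bytes(samples):
    return b"".join(s.to_bytes(2, "little", signed=True) for s in samples)

def coarse_print(samples, block=250):
    """RMS loudness per block, quantized to coarse steps (hypothetical feature)."""
    out = []
    for i in range(0, len(samples), block):
        chunk = samples[i:i + block]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        out.append(round(rms / 500))
    return out

clean = pcm16()
tampered = [s ^ 1 for s in clean]  # flip the lowest bit of every sample

hash_survives = (hashlib.sha256(to_bytes(clean)).digest()
                 == hashlib.sha256(to_bytes(tampered)).digest())
coarse_survives = coarse_print(clean) == coarse_print(tampered)
print(hash_survives, coarse_survives)  # False True
```

So the low-order-bit trick does defeat a fingerprint that is literally a hash of the bytes, but not one built from features that large relative to the change.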

  • by Chris Johnson ( 580 ) on Monday August 28, 2000 @06:14AM (#823607) Homepage Journal
    Good job- yes, I can see how this would work. You could get 'thrown' by certain sampled music (it's Rick James' "Super Freak"! It's MC Hammer "U Can't Touch This"! It's a floor wax! It's a dessert topping!) in certain circumstances, but on the whole, you've really got something- the key concept, to me, is that it's not about embedding computer codes in the music (yech), but about finding the irreducible information minimum in a snippet of audio.

    I think I can help explain- let me put it this way. I've got a tune (obLink: see URL link above) called "Rain Dragon". There's a point toward the beginning where a 'mutating' synthesiser tone enters with a sort of warpy noise, on a beat that kicks really hard with bass drum and a splash cymbal. The total impact is quite aggressive- the synth sort of bursts in, and does so in a way that defines the range of unusual sounds that patch can produce.

    Take that as an example sound snippet to work with. Now, let's say for the sake of argument that the impact of the splash and bassdrum and synth are all perfectly synchronised (splash and bassdrum are in fact sequenced and are perfectly synchronised to within MIDI spec, synth was a lucky hit that seemed to link up extra nicely). Call the phase of the splash's initial attack A, the phase of the bassdrum B, the phase of the attack of the synth C. These may all be in phase, adding up to a big transient. Some may be out of phase- for instance, the splash may come through unaltered but the syn attack and bassdrum attack may be going opposite directions and cancel each other out.

    This is a very large level feature of the waveform- to alter it you would have to do such violence to the waveform as to render it unlistenable. Nothing you can do is going to make that syn attack and bassdrum attack be in different phase- obliterate the bass and you have a wimpy thin version of the same musical event signature, listen to it on a transistor radio and you have mostly the overtones and some distortions on the same musical event signature, record the transistor radio and it's the same deal- the LARGE SCALE waveform shapes are going to have a recognisable pattern if the music itself is still recognisable at all. In the crudest possible form you'd have to physically edit out certain drum hits or notes to alter the recognition- the crudest possible form for this type of identification is, say, MIDI. If there's a particularly interesting drum fill in something you can sequence it painstakingly in MIDI (not quantising but accurately placing each drum event in time) and get an instantly recognisable 'copy' of the original recording despite obliterating even the very sounds themselves and falling back on nothing but timing alone...

    There's a great deal of pre-existing work in other fields, such as image tracking, that defines fingerprinting as 'imposing a subtle added signal onto the media and then reading it back'. That's a far cry from what you're doing- might I suggest 'bodyprinting' instead? ;) after all, what you're doing is much closer to plunking the 'body' of a music snippet down in sand and recording the large scale attributes. It doesn't much matter what the details are. If you mixed one tune with a different tune, the 'bodyprint' of the one would gradually fade (not be instantly obliterated!) by increasing loudness of the other, and at the halfway point you'd be getting a 'bodyprint' that registered about equally for BOTH tunes (!).

    Now that you have this concept so nicely worked out, what do you intend to do with it? Are you going to give to the record industry the ability to track down unauthorised music wherever it may present itself- most notably, to identify samples used in other songs and bring lawsuits over them?

    I was trying to think of other ways the RIAA could abuse this technology, but I drew a blank- because at this time it's not necessary to _prove_ a music copy is from a particular source, to bring suit. Nobody has argued that britney spears mp3s are NOT the same tune as the original CDs because it's stupidly obvious that they're effectively the same tune. Hence, this process simply adds a level of certainty to a process of identification that's already enough to stand up in court. Is there any likelihood of this level of authentication of a copy becoming necessary in practice?

  • What happens if someone pays cash? That means that they can't trace it back. As a large number of CDs are bought by people who are too young to get a credit card, I can't see them insisting that you pay with one.
  • okay, I just want to point something out really quick: if you take the cryptographic hash of a mp3, then you can fingerprint every unique mp3 (not song) in the world only once and keep it in a database. don't have to recalc it each time. you can take the hash of any mp3 you find and know that you have the same mp3 trusted-authority had when they fingerprinted it.

    ah, you say, but the clients will just lie about the hash of the mp3 files they're serving! well, I was thinking about that, and I think I can see a really simple way to design a 'challenge hash' algorithm. the server asks for a random 1k block from the file, and the client has to send that block and send proof that that data, combined with the rest of the data in the file, could possibly hash together to give the hash the client sent originally. the client can only do this if it's true. now, all you have to do is stop the client from saying one thing to the server and something else to everybody else. presumably you do this by making the protocol to randomly check up on the mp3's you're serving the same as the protocol to download one of the mp3's you're serving.

    these are just random schemes; i haven't tried them or really even thought them through. maybe if I have time someday :)
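The 'challenge hash' idea above resembles what is usually called a Merkle (hash) tree, so here is a minimal Python sketch of that technique rather than of anything the poster actually designed. The 1k block size comes from the post; SHA-256, the duplicate-last-node padding, and all function names are my assumptions.

```python
import hashlib

BLOCK = 1024  # the "random 1k block" from the post

def h(data):
    return hashlib.sha256(data).digest()

def build_tree(data):
    """Return the tree as a list of levels, leaves first, root last."""
    level = [h(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)] or [h(b"")]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def prove(levels, index):
    """Client: collect the sibling hashes needed to rebuild the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(root, block, proof):
    """Server: recompute the root from one block plus the proof."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

data = bytes(range(256)) * 20          # stand-in for an mp3 file, 5 blocks
levels = build_tree(data)
root = levels[-1][0]                   # the hash the client advertised

proof = prove(levels, 4)               # server challenges block 4
honest = verify(root, data[4 * BLOCK:5 * BLOCK], proof)
forged = verify(root, b"x" * BLOCK, proof)
print(honest, forged)  # True False
```

As the post says, this only shows that the challenged block is consistent with the advertised hash; it does nothing to stop a client from telling the server one thing and other peers another.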

  • by FreeUser ( 11483 ) on Monday August 28, 2000 @03:54AM (#823610)
    the sound would be so bad, we couldn't copy it. problem solved.

    Ahem.

    I have converted a number of my old vinyl records to CD and MP3 format. It is rather simple, actually:
    • Connect the stereo via the LINE IN port of your audio card.
    • Run software to capture line in to digital file (under Linux, typically .wav format)
    • Play record.
    • Use a program such as xwav to trim the file, removing extraneous crap (e.g. silent hissing) from the beginning and end of the captured file.
    • Use sox to convert to CDR format to burn onto a blank CD, or something like LAME (for MP3) or oggenc (for OGG) to convert to a compressed format.
    • Repeat for as many tracks as you like.
    • If burning a CD, when done use a program such as xcdroast or gcombust to burn the music CD.
    • Replace record in jacket and store in a cool, dry place


    ... and listen to the music as often as you like without damaging the master media.
  • > the info might be stored modulated on a 50 hertz signal that we can't hear

    No. It's pattern matching, not steganography. Tuneprint doesn't change the audio in any way. Rather, you essentially send your mp3 to a tuneprint server and ask the server 'what do you think this sounds like?' and the server says 'oh I know, it's kruder & dorfmeister remixing bomb the bass's bug powder dust'. of course you don't send the whole track, just the 'fingerprint' that uniquely identifies it, but you get the idea.

    That means you don't have to modify tracks beforehand. That means you can use it on all the stuff on napster right this instant. And that also means that there's no watermark that a sufficiently clever attacker can strip. (Instead, an attacker would want to subtly change the audio so that the fingerprint is fooled but quality isn't degraded. Psychoacoustics gives us lots of tools to try to stop people from doing this.)

  • But it gets better- since all recorded music HAS this signature already (easily determined off copies of the music), this all just _begs_ for the record companies to start a 'Not a tenth of a second may be sampled from OUR PROPERTY' campaign. After all, using this type of technology you could identify individual snare hits. Sorry that's _John Bonham's_ bassdrum, and we own it...

    The kicker is this- there will still be a huge false positive count. Consider this- I own a Proteus/1 synth. Some sounds I have modified and altered, but some I use 'stock'. Play a certain note or melody with a certain sound and BAM- "Excuse me, we can legally prove you sampled 'j_random_80s_band', see you in court". Playing an acoustic instrument, it's very unlikely that you'll exactly duplicate a waveform simply by playing the same notes, but with sample-based synthesizer modules that ship with ROM banks?

  • For those of you who got the CueCat barcode reader from Radio Shack (free giveaway [slashdot.org]), you may have noticed mention of a "convergence cable" which hooks from audio out of TV/VCR to audio in on computer. Computer then "listens" for secret information from ads/shows to link you to Web-related info. Note that the cable was not provided with the Cat, but available separately at RS.

    I've been speculating on this for a couple of days. I'm wondering if the Digital Convergence [digitalconvergence.com] software :CRQ [crq.com] doesn't already do something like the software mentioned in this article. Let's say an ad or show intro is fed into a signature (watermarks???) generator at an 11 kHz sample rate (to keep CPU usage low), which then encodes that signature as a CueCat trademarked barcode (can you trademark a class of barcodes?) and matches it against a list kept on a central server.

    If my guess is correct, then the methodology mentioned here has already been done and put to the worst use for a new technology - marketing.

    Just a thought. Expand on it if you can. Anyone else with insight into the :CRQ methodology?

  • I see something that could potentially be used with the ncaCRAP + Napster crap: technically, users of mp3s must buy the CD in order to listen to or copy/create mp3s, so if the record companies imprint a unique ID on each CD that can be downloaded into the user's registry, it can be checked against the mp3s, and if they don't match, then the mp3 doesn't play. Whaddaya think?
  • by Chris Johnson ( 580 ) on Monday August 28, 2000 @06:46AM (#823615) Homepage Journal
    What superfluous header data? You're looking with a microscope, and you ought to be looking with a fisheye lens. Covered in vaseline. In a snowstorm ;)

    Inspect the whole file all you want- you might even see interesting wiggles in the waveform which are of course exactly the sort of thing this will pick up on. You can go in and invert chunks of those waveform wiggles, and that will render that little snippet unmatchable with the original tune- at the expense of making the audio go sputter sputter sputter. Pitch-shifting the whole tune up about 2 octaves would work too :) or timestretching it to about twice its normal length- maybe only 1 1/2 times its normal length. That would work if you like slow dancing ;) most effective? Well, you know how some mp3 files ripped off CDs go BZRRRP every now and then 'cause the CD player choked? The data formerly existing during the section where it goes BZRRRP is rendered TOTALLY UNMATCHABLE by this technique ;) therefore you can completely destroy the fingerprint by simply arranging for the rip to be 100% bzrrrrp. I think I can safely say that this would be a completely effective way of eradicating fingerprintability, at least until they start fingerprinting CD failure modes :)

  • A fingerprint is a watermark, where a watermark is not a fingerprint:

    If you removed a fingerprint from the work, you essentially destroy the work. At least, from your above statement, that digital watermarks are designed for difficulty of removal.

    So you can use a fingerprint for just about all the purposes of a watermark, but you can't use a watermark for all the purposes of a fingerprint.

    Does that make sense?

    The nick is a joke! Really!
  • I'd be interested in seeing more description of how this works...

    AKA, in fingerprinting, you choose interesting features and landmarks because you can see the *entire* fingerprint at a time, and not by tracing the grooves; yet as far as I can see, because you're sampling across time/frequency, you're forced into something analogous to trying to find features by tracing the grooves in a fingerprint.

    I can see where one would choose instrument switches, pauses in singing, rhythm changes, or something else that is suitably obvious. The only problem is how the system can identify these portions in song... I'd love to see the 'math' behind this, even if it takes a couple months of reading to actually understand it.

    Of course, the way to cheat would be to create fingerprints using some sort of GA, select the genes that create the most *useful* fingerprints, in terms of categorizing and identifying, and code those genes into a formal program...

    The nick is a joke! Really!
  • Why yes, I agree..

    But then trolls like you like to think you can think!

    why dont you first READ then type. if you notice that the addition of an infrasonic fundamental (if you have more than 32 brain cells) will alter the total fingerprint of ANY audio source. squeeze the audio out to the analog world, add the fundamental- re-encode. anyone with decent grade audio equipment can do this with no detectable changes (to the human ear... including the idiots that say they can hear the difference between brands of speaker wire).

    Seeee.... in the analog world, we can undo the things that the best digital mage can do.

    Also you can do a small time-shift compression that would be inaudible and change the fingerprint.

    I can think of at least 20 ways to defeat this without straining!
  • Encode the music - transform it in reverse and then distribute it. make your mp3 player program play it backwards or just run it through a reverse-reverser (well you know what I mean) or take the mp3 and rot-13 it... I have several digital world ways of defeating it.

    A radio station time compressor would break the fingerprint too.. (notice how the radio station plays the song slightly faster than it is on cd?)

    Good idea, but it ain't ever going to be foolproof.
  • Simple -- fingerprint the audio *after* it comes out of the mp3 player. But yeah, you can always make your own private format, and just encrypt the data. That's fine for sharing stuff with your friends, but doesn't help too much if you want to put the songs on Napster.

    As for time compression, yeah, that's one of the distortions you have to make it robust to, one way or another, just like volume change and mp3 compression. I have some strategies in mind for this but haven't run a lot of these kinds of tests.

  • ...at least in the graphics world. Digimarc (http://www.digimarc.com/ [digimarc.com]) watermarking has been included in Adobe Photoshop since version 4 (maybe earlier?). The watermark can be applied to the image with minimal loss to image quality, and is very difficult to remove without seriously damaging the quality of the image.

    I've wondered when music companies would start doing this to their recordings. Had Napster been able to tell the difference between freely-distributed music and illegally copied music, I'm sure they would have been much better at covering their ass. I think this is a Good Thing, and it should have been the responsibility of the record companies to come up with a similar scheme long ago. *Everyone* else in the world is expected to identify their copyrighted material as such. Why shouldn't they be?

  • From the FAQ:

    >What is Tuneprint?
    >
    >The first goal of the Tuneprint project is to develop an audio fingerprinting algorithm, that is
    >to say, a computer program that can take a few seconds of music, calculate some kind of unique
    >'fingerprint' of that sound

    ...snip...

    Hmmm, haven't quite worked out the algorithm yet? A little bit like that old annoying Fermat's Last Theorem, isn't it?

    "I'd show you the algorithm, but I haven't got space on my Web Server..."

    Back to sleep...zzzzz
  • The University of Helsinki is propagating hoaxes, too!?

    Cool.
  • ...those dirty fingerprints on my CD's.
  • This all is about fingerprinting an audio file...

    That reminds me of one of the methods used for speech recognition: image recognition.

    You convert your sound to the Fourier domain: x = t (time), y = f (frequency), z (grey tone/color) = the magnitude of the transform at frequency y and time x. You then have a 2-D color/grey map to recognize. And image recognition is far further along than sound recognition.

    So, imagine you
    - FFT the music to a picture
    - create a 2-D simplified image

    All you still have to do to recognize the tune is check the 2D image of the unknown tune against the saved 2D images (allowing stretch/noise,...)
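The steps above can be sketched in a few lines of Python. This is only a toy under my own assumptions: a plain DFT per frame builds the time/frequency picture, the "simplified image" is just the brightest frequency bin in each time slice, and the comparison tolerates noise and volume changes but not stretching.

```python
import cmath
import math

FRAME = 32  # samples per time slice (assumed)

def spectrogram(samples):
    """Rows are time slices, columns are frequency bins, values are magnitudes."""
    image = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        chunk = samples[start:start + FRAME]
        row = []
        for k in range(FRAME // 2):
            x = sum(chunk[t] * cmath.exp(-2j * math.pi * k * t / FRAME)
                    for t in range(FRAME))
            row.append(abs(x))
        image.append(row)
    return image

def simplify(image):
    """Reduce each time slice to the index of its brightest frequency bin."""
    return [max(range(len(row)), key=row.__getitem__) for row in image]

def similarity(a, b):
    """Fraction of time slices whose brightest bin agrees."""
    return sum(x == y for x, y in zip(a, b)) / min(len(a), len(b))

def melody(bins):
    """Test signal: one pure tone per time slice."""
    return [math.sin(2 * math.pi * b * t / FRAME) for b in bins for t in range(FRAME)]

tune_a = melody([3, 5, 8, 5, 3, 10])
tune_b = melody([4, 4, 9, 12, 6, 2])
# Quieter copy of tune A with a small added disturbance.
noisy_a = [0.5 * s + 0.05 * math.sin(0.9 * i) for i, s in enumerate(tune_a)]

saved = simplify(spectrogram(tune_a))
score_same = similarity(saved, simplify(spectrogram(noisy_a)))
score_other = similarity(saved, simplify(spectrogram(tune_b)))
print(saved, score_same, score_other)  # [3, 5, 8, 5, 3, 10] 1.0 0.0
```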
  • See mine and other posts about difference between fingerprints and watermarks...

    I can recognise a piece of audio (mostly from just a few seconds :-) whatever the file format, whether it's been played on the radio, CD, MP3 or 8-track, and with 90% accuracy.

    So why can't an algorithm.. this isn't a watermark, it's a fingerprint. Read the site carefully...

  • "Artists: You can use it to stop people from putting their name on your band's mp3's and distributing them as their own, or you can use it to embed lyrics, links to your homepage, and stupid banner ads in mp3's."

    There's something REALLY innovative going on here. For how do they fit a banner inside MP3 files? And once inside, how do they output those banners to the speakers? To top it all, these banner ads are STUPID! Wow! What guys can come up with these days to get counter-hits! In the old days, we just made a Perl script or something, but this is REALLY innovative!

    However, how about adding *ads* to MP3s that you can't hear with your conscious ear, but that dictate your life *UNCONSCIOUSLY*? Just think about it: one day you wake up from listening to music all night, and you go to the nearest MPAA/RIAA Borg-HQ to seek employment.

    There's just endless possibilities, isn't there? :-)

    - Steeltoe
  • Can you recognise a top 40 song when it's played on the radio?
    Do you know every lyric, note and chord change?
    Why shouldn't it be possible for a computer to do the same thing? Remember: it's not a watermark-- it's more like MD5.
  • I'm kind of doubtful how well this technology works...

    But, if it does work as well as they say, what about displaying the song's fingerprint next to each song on an OpenNAP server? This would make it easy to verify that it's the right one before you download.

  • You would need to sample the whole song in order to accurately fingerprint a track since you have to consider the possibility of different remixes.

    Dance and Trance music frequently comes with several different 'edits' of varying length depending on where they will be used. I'm sure some of these edits will share identical sections with lengths in the order of 1 minute or so.

    This doesn't look very easy to do.

    What would be nice would be the ability to search the web for mp3s that sound like the ones you listen to but aren't. :)
  • >NOTE: By reading this post, you have agreed to run around the room which you are currently in, flapping your arms, and squawking like a chicken.

    Everyone in my office thought I was nuts, but a license agreement is a license agreement.

  • Twas bound to happen ...

    CDDB + My.MP3 (+ Napster?) + bdiff = "Fingerprinting" of Audio Files.

    Now we just need to solve something like:

    IMDb + AltaVista/Corbis + bdiff + WhoWhere?/AnyWho

    ... to get Natalie Portman's home telephone number.

    (Not for me, you understand, but I believe there's a market out there ...)

    Regards, Ralph.
  • Hmmm... Even if such a system were possible, it would hardly be effective for tracking people posting MP3s. No one's going to stop you from plugging your audio CD player into your sound card. I don't think any digital ID would survive a D/A -> A/D conversion.
  • If this is real, I see a few problems with the concept:
    1. If you just take a fingerprint of a small snippet of music, a few seconds in length, the fingerprint is going to be different for different parts of some songs. There are songs that have parts that sound very different than other parts.
    2. Sampling: some songs are known to contain snippets from other songs. Won't this confuse the algorithm? This is both the same as and the opposite of my first point -- some songs can have drastically different sound in places, and some parts of some songs may sound the same as some parts of other songs.
    It seems to me you'd really need to process the whole song into little fingerprints that you jam together to get the fingerprint for the song, and then the whole suggested ability of using fingerprints to identify the beginnings and endings of songs in a stream goes out the window. As far as my point 1 above, there are songs that confuse my ear enough to make me think the song has ended and a new one has begun, or have parts that don't even sound like part of the same song, so I think any system that attempts to use these methods to distinguish songs in a stream is going to be prone to failure.

    However, as far as fingerprinting entire songs, this sounds like a very promising idea. As other slashdotters put it, it's really melody that you're cataloging with this system, or basically making a software equivalent of an "ear for music".

  • From the FAQ: "The first goal of the Tuneprint project is to develop an audio fingerprinting algorithm, that is to say, a computer program that can take a few seconds of music, calculate some kind of unique 'fingerprint' of that sound clip, and match the fingerprint against a database to determine the title and artist of the music the clip came from, as well as the time offset of the clip into the music"

    I'll name that tune in seven...

  • by Anonymous Coward
    Check out http://www.etantrum.com. Not only have they done all this and more, but they've had it since March (you can download it now, so it isn't vaporware), and they're putting it out under the GPL.
  • I agree this could be cool though I have not seen many un-tagged MP3 files.

    *blink* *boggle*

    Huh? Man, please tell me where you're getting all these tagged MP3 files from! The vast majority of the MP3 files I've found on Napster [napster.com]/OpenNap [sourceforge.net]/mp3.com [mp3.com] are untagged -- and of the rest, a significant number don't even have the "sync" (whatever that is) that mp3info wants, so I can't even add the tags myself!

  • Actually this came about because someone on Napster was planting cuckoos. Cuckoos are songs which are titled as the original song, but when you download them you get someone else's song. Or they start out as the original song and then turn into another piece. This whole thing was started by some guy whose wife is a musician whose record is selling badly. I think that the MPAA can't use this to stop mp3 songs, unless they get all mp3 players to only accept songs that have been verified with this process and have a permission flag raised. Yeah right, they are going to get open source players (XMMS) to do this; maybe on stinky Windows machines. This process will just further encourage the trading of mp3 songs, since the user knows they are getting the real song. I don't see how this can work for the MPAA. MPAA, just a bunch of companies trying to own the future of entertainment (monopoly). Imagine that, a whole bunch of bad MPAA movies and songs... forced to watch them because they are the only people who are allowed to make movies and songs. Ariel V. Rosa
  • Absolutely. All you'd need to do is record enough of the tune off _whatever_ you're listening to it on (mp3, ATRAC, mp3 of atrac of 22K wav) and you'd have enough of a pattern to give to some database. All the database needs is the patterns of any music it hopes to recognise, no matter what the source, and ability to search through that quickly. Result- there would be a place you could ask 'What is this? *play snippet*' and it would always be able to tell you.

    HAPPY FREAKIN' THOUGHT
    Hey, there would be nothing stopping me from putting _my_ music's patterns on such a database! There would be no legitimate argument to _prevent_ me doing it and every reason to do so (same for the majors- they'd put up everything they could). Then, anybody no matter where they are could get information on where to get my music, even if they only heard a snippet and went 'That was neat, what's it from?'

    VERY cool. Currently, without this, it's a lot easier for random music listeners to identify stuff that is pushed by the major labels. Add this ability to effectively free-associate and still get results and music-recognition becomes effectively random access, reducing the importance of the mainstream industry. (Something I would _love_ to see...)

    As a final note, imagine humming or singing into a mike, making a snippet of that, 'fingerprinting' it and then sending it out as a search! On the one hand you could make horrible noises just to see what music out there contains horrible noises. But it goes a hell of a lot deeper than that- for instance, there's a song, "Green-Eyed Lady" that I used to have (was hell to find, too). I never remember the band's name, they were a one hit wonder, but I can still remember the neat spooky feel of the tune- and, more relevantly, I remember that at one point the lyric is 'green-eyed lady, windswept lady', and 'windswept' is articulated with unusual clearness. I have to wonder- if I made a little recording of me trying to sing 'windswept' with the intonation and articulation I remember- would it, on some level, match the original? Would it return me the information on the song itself?

    The harder you look at this idea, the more it starts to look like the best sort of science fiction fantasising. Hum the refrain of the tune you can't remember to the computer and it looks through its databanks and (depending on how well you hum!) it tells you what the tune was, better than most humans ever could. This alone would make the idea a killer idea- the added decentralisation that it brings (I can submit all my music to the database and anyone can search for it- just as I can upload a web page and anyone (mostly) can connect to it) makes it even more exciting.

    _Good_ _job_ :)

  • Nobody has argued that britney spears mp3s are NOT the same tune as the original CDs because it's stupidly obvious that they're effectively the same tune.

    Stupidly obvious to a HUMAN listener. And there, my friend, you have your answer. The RIAA will be able to identify songs in bulk automation.

    Hamish

  • All of you are thinking about digital fingerprinting, but in fact you could add the fingerprint as analog data. For example, you could add tones of a higher frequency than you can hear, or encode a numerical value as a tone and play it so fast that you can't hear it. This would make the fingerprint harder to remove and also hard to locate.
  • It strikes me that, for the most part, if I took my 8-track ADAT master tapes and made new mixes of tunes, 'polishing' the old mixes and trying to keep everything the same- the new mixes would match.

    This naturally leads me to wonder what differences would end up matching, and what would cause a failure to match. I think it's pretty safe to say that reducing or increasing the volume of an instrument slightly would not lose the match- the basic shape of the waveform would be the same, only the proportions would be slightly different.

    MOVING an instrument, say from right to left in the stereo image, would probably obliterate the match. Both channels would be significantly different. However, a minor shift in the middle of the stereo image would _not_ lose the match.

    Finally, if I had doubled instruments (for instance, I can refer you to mp3.com/chrisj [mp3.com] for examples- "B17 Flying Fortress" has doubled basses, and "DeHavilland Mosquito" has doubled acoustic guitars) panned hard left and hard right, I could effectively obliterate a match while leaving the music 'unaltered' for a listener (sort of). This is because on the tracks I mention (and on many other tracks that exist) this doubling technique is used to thicken the mix by playing two takes of the same part, as identically as humanly possible. Musically, there's no particular reason one track should be on one side and not the other. If you swapped them, the musical effect would be basically nil, barring minor glitches that would be registered as coming from the other side now. But when you play acoustic or electric instruments the waveforms are not as predictable as synths- so for the purposes of the fingerprinting, the original track and the track with doubled instruments reversed would be _hugely_ different, even though to the listener they would be musically alike.

  • Way cool. Just one question: how difficult would it be to match a fingerprint of a song from some other input than the original song? i.e., could you get it (or some similar algorithm and search combination) to recognize you whistling the theme to some song? I would imagine that you would really only have to turn down the hit accuracy limit, and probably correct for some temporal distortion (the typical listener will not be able to hum a song at exactly the same tempo after it is done), but this obviously depends on the structure of the psychoacoustic model.

    Just throwing ideas around...

    On a similar note, how badly _can_ you manipulate the sound before the fingerprint gets whacked?

    Another useless suggestion: could this fingerprint be somehow rendered into a useful visualization for a song? Presently there's the spectrum analyzer and the scope, but if the fingerprint incorporates other elements (I'm not that experienced in audio, so I don't really know what's left...), could you display those? That would be cool -- and, if done correctly, informative.

  • relevantly, I remember that at one point the lyric is 'green-eyed lady, windswept lady', and

    That's what search engines are for. Just plug "green-eyed lady, windswept lady" into Google and you get the lyrics to the song which informs you it is by Sugarloaf and on the CD, Have A Nice Decade: disc 1. You then plug this info back into a search engine or Gnutella and find the tracks. Sometimes, when I'm bored, I amuse myself with this method by trying to start a download of a song I hear on the radio, not knowing the title or artist, before it finishes. Ok, so it's not that exciting.

  • by antifuchs ( 225876 ) on Monday August 28, 2000 @12:56AM (#823645)
    "The fingerprint doesn't change even if the sound is compressed, converted to a different file format, broadcast over the radio, ..."

    ...sung at a karaoke event, covered, remixed, hummed by any being with vocal cords, played on a bagpipe, and so on.
    --
    this post was brought to you by Andreas Fuchs.
  • Umm, dude, read the site.
  • So how long before the MPAA starts trying to use this to stop mp3's? Imagine the MPAA using this to track who stole what song, maybe with a unique thumbprint on every CD. Tech-wise that would be kind of hard.
  • and then it's discovered, every Britney Spears song has exactly the same fingerprint.
  • at University of Helsinki if I'm not wrong.
  • Coming up with a private algorithm to "fingerprint" audio files sounds like a fun thing to do, and you may well have clever ideas.

    Some of the uses you suggest would work just fine, as long as no one has the incentive to modify the file so it still sounds fine but has a different fingerprint.

    Keeping it a "secret" and selling it to the clueless music industry for "watermark-like" purposes might make some money and work for a little while.

    But as soon as people know how the algorithm works it is clear that someone will come up with a way to subtly change the file so your particular scheme is not a reliable fingerprint. I.e. the modified file will have a different fingerprint but it will still sound good enough to folks that the music industry will want to cry foul about the technique. And people will figure out how the algorithm works whether you want them to or not. That is just the way the world of security works.

    --Neal

  • Seriously, how often does this happen? I mean, besides your theft of Stairway to Heaven...
    --
  • if I remember right, one of Napster's points on appeal was that it was technologically impossible for them to only block the copyrighted material in question. Could the MPAA force Napster to recognize their "fingerprints" and deny users the ability to download music with those specific fingerprints?
  • "It takes the unique 'fingerprint' of a sound clip, which can then be compared to a fingerprint database to get more information about the clip"

    Commercial Applications (Pest Control):

    • "To speak to a friendly customer service representative, say one, to exit say two."
    • "One"
    • "I'm sorry (first name here) but you are not yet one of our current members. Please sign up and activate your account through our easy online billing system by saying yes."
    • *cough*
    • "I'm sorry (first name), but that option is copyrighted, please try again."
    • "Two"
    • "Thank you for calling (business name)."

    Personal Applications (Solicitor Control):

    • "Hi there! You've reached (your name)'s messaging system. To speak to the owner say "yes", to exit say "no"."
    • "Yes"
    • "Hi (their name), do you realize (their company) has tried to convince this machine to change its calling plan (n) times this month? Please stop harassing me."

    "He who has all the answers has stopped asking questions."

  • I can see it coming:

    You buy a CD at your local record store or via the Web and pay with your credit card. The CD has a unique serial number, which is also watermarked into every track on the CD.

    The record companies now know who you are and which CD you bought.

    Now, you lend it (or even give it as a present) to a friend of yours, who rips a track, encodes it to MP3 and puts the file on gnutella.

    Some weeks later, you'll receive a letter from the RIAA and though you don't have the CD anymore this will make it even harder for you to prove your innocence.

    I don't want that.

  • The RIAA will probably love this, since they can embed fingerprints in all their discs, and then just run around sending cease & desists to anyone who distributes a file that contains a fingerprint that says "copyright RIAA member"

    And I'm not sure we'd even have much room to complain about it, at least not from a legal standpoint...
    If they only complained about the distribution (including publicly posting them), then we couldn't yell and scream about "fair use!"
  • The fingerprinting aspect of this doesn't seem as interesting as the following, taken from the tuneprint website:

    ...you can use it to embed lyrics, links to your homepage, and stupid banner ads in mp3's.

    Fascinating. Now you can embed stupid banner ads directly into the audio content of an mp3. This is both cool and scary.

    The only problem with this embedded content is that it would have to be enabled in every different mp3 player in existence. Do I want my XMMS enabled with tuneprint so I can read lyrics? What if the cost of those lyrics is that I have to look at advertising text interspersed between the song lyrics ("Don't Fear The Reaper / Coke is It! / Come on now").

    No thanks, I think I'll take my music without ads, and just hum along.

    --Jim
  • "didn't Negativland do something along those lines already?)"

    The u2 thing? don't think it was settled in court. they made a lot of noise about it, but it's still banned (although downloadable at www.negativland.com). I think that it was 2 Live Crew who won their case. this could have been used to get negland off, but it came too late. the negland thing was complicated by it being both a rich label (island) and a rich celeb (Casey Kasem) who were against them, and negland aren't rich.
  • I see a hard time coming for all euro-dance acts around... "Sorry sir, that's copyrighted."

    Biff! Kazamm!
  • motion picture ASSociation of america. i think u mean the riaa
  • A lot of people must be in the same boat as me - large mp3 collection, constantly adding to it, not keeping track of what I already have - end result being multiple copies of the same song, albeit with different track names, filenames, encoding rates, etc... something like this would be a godsend to identify duplicates (not easily doable, since filesizes differ on the same tune often, w/ leading / tailing times differed & tracknames differing, along w/ encoding...)
  • Could the MPAA force Napster to recognize their "fingerprints" and deny users the ability to download music with those specific fingerprints?

    To allow this, Napster (or any other service) would have to download the MP3 and compute the fingerprint. Downloading all the MP3s which are exchanged via Napster would need a huge bandwidth upgrade.
    Moreover, I think that computing a fingerprint which doesn't change with compression/resampling could take a lot of time (5 seconds on a Pentium III class machine?). They would need a supercomputer to compute the fingerprint of each MP3.

    The last solution is to force users to compute fingerprints and transmit them. But, with an open source project, a user can transmit a false fingerprint. And even if there were a way to force users to transmit a true fingerprint, most users would go to another system like Gnutella where there is no central point.

    Well, I think that this kind of technology could help web hosters find illegal MP3s (but do they want to?). But what I would like to see is a new CDDB system which doesn't rely on some mystic parameters of the CD (which aren't very reliable) and which works with MP3.

  • Conceptually it's quite a cool idea to be able to reduce music down to this 'fingerprint'.

    Technically to me it seems more like a message-digest operation or checksum. Using the word fingerprint suggests that no two files have the same fingerprint, which would be useful in piracy prevention.

    In a perfect world where chocolate has negative calories, we'd have digital music downloads where fingerprints were added to downloaded files to identify the user that paid for them. That way they wouldn't want to distribute them, since there would be a fingerprint that could be traced to them.

    Unfortunately all the theories about adding a fingerprint which can withstand recompression are pretty unfounded. I've written simple examples of this that add fingerprints to bmp and wav files by manipulating small details that are undetectable to the human eye/ear. Ironically these are the very details that systems like jpeg and mp3 obliterate.
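
    A minimal sketch of that effect, assuming the "small details" are the least significant bits of integer audio samples. The functions `embed`, `extract`, and `lossy` are hypothetical names for this illustration, and the coarse rounding in `lossy` is only a crude stand-in for what a real codec such as MP3 does.

```python
def embed(samples, bits):
    """Overwrite the least significant bit of each sample with a payload bit."""
    return [(s & ~1) | b for s, b in zip(samples, bits)]

def extract(samples):
    """Read the payload back out of the least significant bits."""
    return [s & 1 for s in samples]

def lossy(samples, step=16):
    """Crude stand-in for a lossy codec: round each sample to a multiple of step."""
    return [step * round(s / step) for s in samples]

samples = [1000, -2001, 3002, -4003, 5004, -6005, 7006, -8007]
payload = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed(samples, payload)
print(extract(marked) == payload)   # the mark survives a bit-exact copy
print(extract(lossy(marked)))       # after the lossy step the payload is gone
```

    Embedding changes each sample by at most one unit, which is why it is inaudible, and for the same reason any step that discards low-order detail erases it.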

    This system however does none of that and isn't that much different from the checksum that audiocatalyst puts at the end of a comments field.

    Also the algorithm to calculate it seems like it must be very complex, and I really dread to imagine how long it would take to search through a several thousand file mp3 archive for a certain song.
  • From their page:

    Artists: You can use it to stop people from putting their name on your band's mp3's and distributing them as their own, or you can use it to embed lyrics, links to your homepage, and stupid banner ads in mp3's.

    Darn, now everyone will know that it really wasn't me who wrote "Stairway to Heaven"...

    Really though, the last thing I need a banner ad in is an MP3, I get enough of those everywhere else
  • Funny, I was just thinking how easy it would be to bypass/fake such an algorithm...
  • A few years ago there was a lot of discussion about fingerprinting copy-protected images (I think it is called "watermark", but I'm not sure). There are quite a few algorithms out there, but as far as I know none of them is good enough. E.g. the watermark does not "survive" when an image is rotated 1 degree, the difference is not visible.

    I'm no expert regarding this, but I'm pretty sure there are similar transformations on audio files that nobody can hear, but which will destroy the fingerprint.

  • converted to analog, shifted phases, played through a tin-can phone...
  • Well, one thing seems obvious: whatever tech the MPAA and RIAA can agree upon and implement _will_ be anticipated and "transcended" (to use asshole Sony CIO Steve Heckler's term) by better coders.

    First, SDMI is getting nowhere fast. They can't even agree on definitions for specifications for technology, so SDMI likely won't see any reality. (Part of this has to do with the fact that many of the members play both sides of the street - like Sony, player maker _and_ a music company.)

    Second, there _can't_ be any technology that both survives decoding without significant degradation of signal _and_ is retrievable from ripped files. By definition, the digital "watermark" _has_ to be stripped by playing - otherwise it will have to distort the audio/video output. Yeah, it can be done, but who would buy crippled CDs or DVDs? Better technology will increasingly favor freedom, not continued indentured servitude of artists and their consumers to obsolete, bloodsucking media conglomerate pimps, despite their @%*#$ lawyers!

    Sony was off my vendor list last year, because they don't provide any PCMCIA Socket Drivers for anything but Windows (no Linux, no OS/2, just MS) but this seals it: I'll never pay Sony another dime for anything. Same goes for the rest of the MPAA and RIAA. Free movies and music, forever... or, at least, until they back down and get real.

  • I agree. At least with cryptographic hashes there is very sound theory at work. This is based on trial and error. And not only that, but it's trying to be inclusive instead of exclusive.

    A cryptographic hash takes an exact set of bits to result in the hash, a very exclusive operation. It's excluding all of the wrong values. If any of the bits in the configuration are slightly off, then there's a very high likelihood that the hash will be completely different.
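
    That exclusivity is easy to see with Python's standard `hashlib`. The sketch below flips a single input bit and counts how many digest bits change; the input bytes and the helper `bit_difference` are made up for the illustration, and SHA-256 is used only as a convenient example of a cryptographic hash.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Stand-in for audio data: 1024 arbitrary bytes.
original = bytes(range(256)) * 4

# Flip exactly one bit of the input.
flipped = bytearray(original)
flipped[0] ^= 0x01

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(bytes(flipped)).digest()

print(bit_difference(original, bytes(flipped)), "input bit differs")
print(bit_difference(d1, d2), "of 256 digest bits differ")
```

    Typically about half of the digest bits change, which is exactly the opposite of what an audio fingerprint needs: there, a re-encoded copy with many changed bits still has to map to the same or a nearby value.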

    I believe common sense will tell you it is easier to do that than to loosely define a large set of bits that will result in the same hash. And furthermore, the boundary for the bit bag is defined by human intuition (psychoacoustics) and empirical research (trial and error).

    Granted, lots of hashes are created by people tinkering with an algorithm, but there are excruciatingly detailed and highly refined tests that'll tell how good the algorithm is. The development of the algorithms and tests are based on information theory and statistics.

    What are the tests for the audio fingerprinting based on? The answer appears to be our grossly underdeveloped understanding about how the human mind processes sound all the way through to making comparisons between different samples of music. I seriously doubt that something hinging on human intuition and interpretation as part of the algorithm can overcome the fact we don't understand how people's mind work to any acceptable degree.

    I spent some time thinking about a way to fingerprint music, and it is a very hard problem to come up with something that captures the essence of our perception of music. Simply doing that is a huge accomplishment. It involves distilling down human methodology for sound recognition, something that is based on a massively parallel biochemical neural network, or at least something we can only crudely model as a neural network, into a mathematical formula that can be implemented efficiently on a processor.

    It is much more likely this won't work because there's a fundamental difference in the platform these two tasks are done on: subjective analysis by a person of some music as to what is and is not the same, and some convoluted algorithm attempting to approximate that process on a von Neumann-style computer. I won't hold my breath, no matter how cool the idea is. And it is very cool.

    They would probably have better luck building a neural network and training it against a large set of people to attempt to capture their collective equality operation.

    I doubt they will reach their goal, but if they can come up with an algorithm that reasonably sorts music the way people do and satisfactorily compares music, it will still be very powerful. I personally would love to see such technology be successful.

    The ability to emulate how people discern music and to detect differences between samples is tremendous. If nothing else I can finally catalog my music intelligently and to seek new music that will be something I very likely will dig.

    Imagine listening to a song that really scratches an itch, taking the fingerprint of that song, even if it is from the radio and getting songs that scratch the itch the same way from a database. How cool would that be?

    Not only that, but we'd need that sort of technology in order to find musicians and bands (there is a distinction) that we like if there isn't a huge mega-media-greedy conglomerate driving the mindshare and play time. Combine that with reputation certificates and something like SPKI [ietf.org] and you've got a very successful way to implement the Street Performer's Protocol [counterpane.com] (see the recent /. story [slashdot.org])

    A reputation certificate is similar in concept to what eBay does with its sellers in capturing the history of the person to create a mark of how reputable that person or entity is, but it goes one step further and cryptographically binds that information to an identity. See the SPKI document above for detailed information.

    This is stuff that needs to be worked on and I salute tuneprint for working towards that goal.

  • MPAA = motion picture ASSociation of america

    RIAA = recording industry ASSociation of america

  • Yes, but what if the file is only downloaded partially? IE, a napsterphiliac doesn't get the last 3K of a song so that it doesn't have the fingerprint?

    Of course, there will always be programs to strip such fingerprints.

    -------
    CAIMLAS

  • I cannot believe anyone who's read through the entire site does not see this for the hoax that it is. It is riddled with contradictions, nonsensical vaportech, and inside jokes. Goddamn, even their barcode contains 31337. From their faq - and these 2 quotes are by no means the sum of all the evidence.


    Ain't life beautiful? I think I'll change my handle to Loki [dutchie.org].

    and

    Q: How does it work?
    A: The answer to this question changes a lot ...


    That's true. The explanation does change a lot. Ok, a lot of us have been thinking slashdot has become something of a joke, lately. But goddam, is it really necessary for Hemos - Hemos! - to drive the point home?

    Sheesh. There is absolutely nothing on that site that suggests anything other than a hoax.
  • A fingerprint is a piece of data computed from the song, not something embedded in the song itself. A proper fingerprint would somehow capture the essence of the song, so even if it were filled with static or missing the last minute it would still work.

    A watermark is data embedded in the song itself with the idea that it is hard to remove the watermark from the song without destroying the song.

  • Aw, c'mon. You don't expect people to ACTUALLY visit the site before commenting on it. I prefer to give my professional opinion based solely on the summary.

    I think it's kinda funny, actually. It shows you which posters have a clue and which ones are just trying to sound informed.
  • Also, I wonder if this could tell if there was a problem with an mp3 you downloaded. It would be a reverse process. You put in the details of the mp3 and it compares an archived digital signature to the one you've got. It could all be transparent as well, and automate my downloading process to get the best mp3 out of the bunch on napster.

    Even the samurai
    have teddy bears,
    and even the teddy bears

  • bah!

    if the audio sounds different to us (humans) it will HAVE to 'sound' different to computers as well.

    yes, we humans can recognize that a compressed (analog) version of a song and the uncompressed version are essentially the same.

    but a computer that's just doing "math stuffs" to the bit patterns cannot hope to compare 'similarity' the way we do.

    this whole premise is a joke. and not a very creative joke, either.

    --

  • The fingerprint doesn't change even if the sound is compressed, converted to a different file format, broadcast over the radio, and so on.'

    It'll change when I extract it to WAV and inspect the entire file for superfluous header data!!!

  • If that was really Napster's defense, then they are not very smart, and one wonders how somebody who couldn't innovate the simplest of concepts could even operate a compiler. I could make a 99.9999999% foolproof algorithm for allowing only songs which the author gave permission to distribute to be listed on Napster, in my sleep. The judge also came up with one, and she's not even an engineer.
  • This has been around for a while. SESAC has been using digital fingerprinting to survey radio play for a while.

    There are lots of interesting things going on right now in the internet music realm. Check out http://www.mongomusic.com, check out the 'sounds like' feature. Something other than collaborative filtering! Woohoo! And it works!
    ...
    . "The future masters of technology will have to be lighthearted and
    . intelligent. The machine easily masters the grim and the dumb."
  • Wow, this could make radio stations obsolete. Instead of having to call the radio station to ask the name and artist of the song playing, you could have a hand-held device that will generate the fingerprint and look it up on the Internet.
  • To put a song into the database, you break the song into itty bitty pieces, fingerprint each piece, and put each little fingerprint -- labeled with the track it came from and the offset into the track -- into the database. So what you're getting back from the server is a series of messages like "I think you're at offset foo into track bar." The server will be wrong sometimes, maybe even a lot of the time, because of samples that we haven't correctly dealt with, silence, the inherent loopiness of music, distortions that tuneprint isn't sufficiently robust to, gnomes, etc, but only for the correct matching track will the "offset foo, offset foo+n, offset foo+2n" series make sense.

    Anyway, so you can identify song boundaries either by the length of the track in the database based on your current offset, or by just waiting until the track names you're getting from fingerprinting the little bits change (and you pick up the "offset foo, offset foo+n, ..." pattern again.)
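The offset-consistency idea described above can be sketched in a few lines. This is only a toy illustration, not Tuneprint's actual code: `toy_fingerprint`, `CHUNK`, `build_index`, `identify`, and the sample data are all hypothetical names and values, the "fingerprint" is plain coarse quantization, and the clip is assumed to start on a chunk boundary. Every database hit votes for "track X, starting at track offset minus clip offset", so stray hits scatter while the true track piles its votes onto one start position.

```python
from collections import Counter, defaultdict

CHUNK = 4  # samples per "itty bitty piece" (toy value, an assumption)

def toy_fingerprint(chunk):
    # Hypothetical stand-in for a real perceptual fingerprint:
    # coarse quantization, so small amplitude errors map to the same value.
    return tuple(round(x / 10) for x in chunk)

def chunks(samples):
    # Yield (offset, chunk) for consecutive non-overlapping chunks.
    for start in range(0, len(samples) - CHUNK + 1, CHUNK):
        yield start, samples[start:start + CHUNK]

def build_index(tracks):
    # Map each chunk fingerprint to every (track name, offset) where it occurs.
    index = defaultdict(list)
    for name, samples in tracks.items():
        for offset, chunk in chunks(samples):
            index[toy_fingerprint(chunk)].append((name, offset))
    return index

def identify(index, clip):
    # Each hit votes for (track, implied start of the clip within that track).
    votes = Counter()
    for clip_offset, chunk in chunks(clip):
        for name, track_offset in index.get(toy_fingerprint(chunk), []):
            votes[(name, track_offset - clip_offset)] += 1
    if not votes:
        return None
    (name, start), count = votes.most_common(1)[0]
    return name, start, count

TRACKS = {
    "track_a": [0, 10, 20, 30, 40, 50, 60, 70,
                80, 90, 100, 110, 120, 130, 140, 150],
    # track_b shares one chunk with track_a (a "sample" of it, say)
    "track_b": [50, 40, 30, 20, 0, 10, 20, 30,
                90, 70, 50, 30, 10, 0, 10, 0],
}
INDEX = build_index(TRACKS)

# A slightly distorted copy of the first 8 samples of track_a.
CLIP = [1, 9, 21, 29, 41, 49, 61, 69]
print(identify(INDEX, CLIP))
```

The first chunk of the clip matches both tracks, so the server is "wrong" once, but only track_a produces a second vote at the same implied start. A real system would also have to cope with clips that do not line up with chunk boundaries, which this sketch ignores.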

  • Sure it's possible, it just hasn't been done yet.

    I didn't say it was impossible, just impractical.

    --Jim
  • i saw something on freshmeat a few months ago that could take in a wave source like you described and break it into tracks based on silence between songs. which of course means that sgt. pepper would only be like two tracks, so it can also break it up by track length. wish i could remember what it was called...
  • With happy gems within the source of the FAQ [tuneprint.com] such as

    <!-- Bypass record companies to pay artists directly -->

    maybe they have an ulterior motive! :o)

    --

  • As others have said, the fingerprint is much like a checksum -- it's calculated based on the tones produced by the song itself. They're right in saying that the fingerprint will be consistent for compression, etc. so long as the song is not changed significantly. How long would it take for geeks to work around this if it were implemented to restrict the distribution of copyrighted material? Less than a day. Consider the following: I write a program that prepends X seconds of random noise to a sound file, thus skewing its fingerprint to render it unrecognizable. On the other end someone else clips out that bit of junk and has the original sound file again. Writing an app to do this would be dead simple. Further, just compressing or encrypting the sound file would be enough to wreck its signature. Bottom line: cool idea but it won't even slow us down if people try using it to control us.
  • As soon as you play any mp3, a fingerprint check is performed. If you are listening to a copyrighted sound file, a message is sent to the RIAA who can check whether you are a legal owner of the sound file. If you aren't, they will melt your sound card DSP remotely.
  • Here is a simple idea to fake a server. Suppose you want to serve an MP3 of Metallica. If you transmit the right hash, you will be banned. Well, the idea is very simple. You want the hash and a 1k random block of the file? OK, I take an MP3 free of rights, I "rename" it to the target MP3, I send you its hash and the 1k block you want. Everything is fine. And when I must send the entire file, I send the true one.
  • It has been about five years since I did the vinyl to CD thing (I've since converted the resulting CDs to MP3 and OGG), so I don't recall exactly what program I used at the time. It may have been a command line utility.

    I did not dd from /dev/dsp -- I don't know if that would work or not.

    However, there are any number of small X utilities that will allow one to capture audio from the microphone or line-in and save in some format (which, if not .wav already, can be converted). A number of audio editing tools allow recording (select the record input device using your mixer program). SLAB [slabexchange.org] is one possibility, others include a command line utility [tudelft.nl] which someone else wrote to do exactly what I did (convert vinyl to digital), a simple command line recorder [alaska.edu] (this might be what I used), another recorder SCAR [pdamusic.com], another tape/LP converter kit [twu.net], Vsound [zip.com.au] (allows you to capture audio from apps like realplayer). Finally there's "yarec" and "xwav" (which I also used as I recall), but I cannot find URLs for either right now.
  • You buy a CD at your local record store or via the Web and pay with your credit card. The CD itself has a unique serial number which is also watermarked into every track on the CD itself.

    This would mean that every CD is unique -- not just in some data track that can be tacked onto the end of the CD postmaster, but across the entire audio content of the CD. In other words you couldn't stamp every CD from the same master, you'd have to uniquely burn each copy. Doesn't sound economically feasible (thank goodness).

    --Jim
  • Actually, working for a company where we dupe a -lot- of CDs (not CD-Rs), I've found that adding serialization to the content is -very- easy, retailing at pennies per copy.

    Are you talking about a single serial number tagged onto the end of the mastered data, or are you talking about actually changing data that has been stamped by the master?

    To add serialization data that's intertwined with the audio content (no -- not intertwined -- actually contained within) would mean having to tweak lots of data distributed all across the audio content of the CD. Could your technique do this?

    --Jim
  • This tool is not 'fingerprinting' as such. It is a method of analyzing the natural patterns in a piece of music. Each piece of music has distinctive patterns in pitch, volume, timbre, etc. Not having seen the algorithm, I can't say what aspects of the music are analyzed, but the theory is sound. These patterns are quantized to generate the fingerprint. Since the patterns used occur only in sounds detectable by humans, these patterns must be preserved by compression, or the compression will not be an accurate enough representation of the music to be useful.
  • It's worth noting the differences between "watermark" and "fingerprint". A watermark is embedded within the bitstream, presumably inaudibly (or invisibly for an image). A fingerprint is like an MD5 hash: it takes the audio and generates a signature or fingerprint which identifies the track.

    This technology is for generating fingerprints. i.e. it doesn't embed anything within the file. So it can't be used for tracking who ripped what CD.

    They use something very similar to this at least here in the UK to generate radio airplay data. An automated system is fed the output of every monitored radio station, and recognises what songs are played by each.
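The MD5 analogy above holds only up to a point, and a small experiment shows where it stops. The sketch below is an illustration under stated assumptions, not anything taken from Tuneprint: `coarse_signature` is a made-up toy feature (whether each block of samples has more energy than the one before it), and the sample values are invented. An exact digest of the raw samples changes as soon as the volume is nudged, while the coarse signature stays put.

```python
import hashlib
import struct

def md5_of_samples(samples):
    # Exact digest of the raw 16-bit little-endian PCM bytes.
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return hashlib.md5(raw).hexdigest()

def coarse_signature(samples, block=4):
    # Hypothetical toy "fingerprint": one bit per pair of adjacent blocks,
    # set when the later block carries more energy than the earlier one.
    energies = [sum(s * s for s in samples[i:i + block])
                for i in range(0, len(samples) - block + 1, block)]
    return "".join("1" if later > earlier else "0"
                   for earlier, later in zip(energies, energies[1:]))

ORIGINAL = [100, -200, 300, -400, 1000, -1200, 900, -800,
            50, -60, 40, -30, 700, -650, 600, -700]
# Same audio turned up by about 10%.
LOUDER = [round(s * 1.1) for s in ORIGINAL]

print(md5_of_samples(ORIGINAL) == md5_of_samples(LOUDER))      # exact digests differ
print(coarse_signature(ORIGINAL), coarse_signature(LOUDER))    # coarse bits agree
```

So a usable fingerprint behaves like a hash of what the audio sounds like rather than a hash of its bytes; the hard part, which this toy skips entirely, is picking features that survive lossy codecs and radio broadcast.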

  • by interiot ( 50685 ) on Monday August 28, 2000 @03:23AM (#823709) Homepage
    From their FAQ [tuneprint.com]:
    • The fingerprint shouldn't change even if the music is made louder or softer, re-equalized slightly, passed through a mp3 encoder, speeded up or slowed down a little, and so on. Anything that doesn't change the way the music sounds shouldn't change the fingerprint, and it should be impossible for even a smart, well-funded human being to make the fingerprint change without distorting the music.
    (emphasis mine)

    That sounds pretty reasonable and possible to me.

  • by tqbf ( 59350 ) on Monday August 28, 2000 @03:31AM (#823710) Homepage
    As someone else here said, this is conceptually not much different than calculating a message digest. In fact, I'm sure that's exactly what they are doing (and hopefully with a standard digest function, like SHA or MD5). Obviously, the question is what data they are feeding to the digest function. They obviously aren't feeding raw audio data, because it varies heavily between different codecs and sources. So, clearly, they're doing some kind of analysis. The easiest thing to develop is an algorithm for summarizing raw audio data. This addresses the concerns about encoding MP3s, Vorbis, or whatever --- you simply operate on the decoded results of these files. The goal of summarizing should be to come up with a description of audio data that is the same for two identical-sounding files.

    So the question then becomes how you "summarize" raw audio data so that 10 different sources/decodings of the same piece of audio result in the same summary information.

    One pretty obvious thing to do is to select frequencies, set a threshold value (relative to the average amplitudes in the audio data for the frequencies you are analyzing) for "peak" amplitude at those frequencies, and measure time deltas between peaks. You can synchronize different audio samples to a recognizable pattern of peaks to get time synch, and you can measure time in quarter-second chunks to be "fuzzy".

    The raw data that you digest would then just be a series of peak-to-peak time deltas for each frequency, which should be consistent between recordings (even if you tack dummy data to the beginning and end of the file --- the latter problem being solved by only accounting for a fixed amount of time in each audio file). Think of it as summarizing/fingerprinting the audio data based on the images displayed in your MP3 player's spectrum analyzer.
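A rough sketch of the peak-delta scheme just described, under loudly stated assumptions: the function names, the 8000 Hz rate, the 200-sample (25 ms) frames, and the 1.5x-average threshold are all invented for illustration (the text suggests quarter-second chunks), the Goertzel algorithm is simply one convenient way to measure a single frequency per frame, and the test clip is synthetic tone bursts rather than real music. It is not Tuneprint's method.

```python
import hashlib
import math

def band_power(frame, freq, rate):
    # Goertzel algorithm: power of a single frequency within one frame.
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in frame:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def peak_deltas(samples, freq, rate, frame_len, factor=1.5):
    # Power per frame, thresholded relative to the clip's own average,
    # so overall volume does not matter.
    powers = [band_power(samples[i:i + frame_len], freq, rate)
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    threshold = factor * sum(powers) / len(powers)
    # A "peak" here is a frame that rises above the threshold.
    onsets = [i for i, p in enumerate(powers)
              if p > threshold and (i == 0 or powers[i - 1] <= threshold)]
    # Time deltas (in frames) between successive peaks.
    return [b - a for a, b in zip(onsets, onsets[1:])]

def summarize(samples, freqs, rate=8000, frame_len=200):
    # Per-frequency delta lists, plus a standard digest of that summary.
    summary = [peak_deltas(samples, f, rate, frame_len) for f in freqs]
    digest = hashlib.sha1(repr(summary).encode()).hexdigest()
    return summary, digest

def make_clip(bursts, n_frames, rate=8000, frame_len=200, gain=1.0):
    # Synthetic test clip: bursts maps a frequency to the frames containing it.
    samples = [0.0] * (n_frames * frame_len)
    for freq, frames in bursts.items():
        for f in frames:
            for n in range(frame_len):
                samples[f * frame_len + n] += gain * math.sin(
                    2 * math.pi * freq * n / rate)
    return samples

BURSTS = {400: [2, 6, 9], 800: [1, 5]}
clip = make_clip(BURSTS, 12)
# Same clip at half volume with two frames of silence tacked on the front.
quiet_padded = [0.0] * 400 + make_clip(BURSTS, 12, gain=0.5)

print(summarize(clip, [400, 800]))
print(summarize(quiet_padded, [400, 800]))
```

Because only the gaps between peaks are digested, the quieter, padded copy yields the same summary and therefore the same SHA-1. Padding that is not a whole number of frames, tempo changes, and codec artifacts would all need the fuzzier time synchronization mentioned above, which this sketch does not attempt.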

    I'm not sure if what I've described is practical; it's the first thing I came up with when I was presented with the same problem a while ago. But it's evidence, I hope, of an important fact:

    Anything your ears can do, a computer can do better.

  • by johnhebert ( 53732 ) on Monday August 28, 2000 @01:34AM (#823720)
    Though Tuneprint's efforts are still pretty much alpha at this point, the idea is to derive a database of fingerprints, or signatures, of music tracks using a hidden algorithm. The use of the term "fingerprint" is kinda misleading as to how it works, though I'm sure it is unintentional.

    The problem of course, is that all pop music would have the same signature, since it pretty much sounds all the same anyway... :)

    The signatures are not added as metadata to the songs, though I guess they could be. They are kept in a separate database that is near the analyzing portion of the solution where the results can be queried.

    This is an interesting idea. I proposed something pretty similar to my co-workers a few months back when we were looking for a means of uniquely identifying recorded music, but I only received funny looks. Damn me and my laziness! :)

    I think it may be pretty difficult to get this solution working well, considering that songs can contain samples/riffs from other songs, among many other factors. I think the minimum length of the analysis sample would have to be fairly long, relative to the size of the song, in order to get an accurate signature.

  • This is obviously a hoax. Already thought so after reading this: The fingerprint doesn't change even if the sound is compressed, converted to a different file format, broadcast over the radio, and so on

    And after reading their homepage, there can not be any doubt left. The domain is also registered just 3 weeks ago [networksolutions.com]

    It is funny though...

  • by Anonymous Coward
    The technique they are using actually dates back thousands of years. Using an intrinsic property of the music known technically as the "melody", it's possible to recognize what a particular piece of music "sounds like". There are written records of people having this ability dating back as far as the renaissance.

A complex system that works is invariably found to have evolved from a simple system that works.
