Can P2P Filter Copyrighted Content?
scubacuda writes "DRMwatch reports that technologists acting on behalf of porn publisher Titan Media reported to Congress that P2P networks could (if they wanted to) use "fingerprinting" (aka "hashing") to detect copyrighted works and then filter them with the "spyware" installed on all nodes in the network."
Didn't AudioGalaxy try this? (Score:5, Informative)
This *is* possible... (Score:5, Informative)
I used to work for a small company called Relatable (http://relatable.com/), which was working with Napster back in the day to identify the music being traded over the network.
Relatable's technology recognizes music by the acoustic properties of the audio itself regardless of how it was recorded, encoded, etc.
Obviously there are still ways around this, but it is a fairly solid solution.
It is important to recognize that "fingerprinting" does not equal "hashing". We all know that hashing will *not* work. But there are other techniques, at least for audio, that can work.
Josh
Hmm.. (Score:4, Informative)
However, for the average Kazaa user, it just might work. Most of them seem to think that if you uninstall Kazaa your music is gone... or that you can't play the Kazaa music outside of the Kazaa client.
Keeping this in mind, then, we can give a little bit of credit to these guys in that they may succeed in fooling the idiots who use Kazaa.
Of course, people like that usually aren't the ones to come up with "original" content anyway.
It's actually amusing to think of the cat-and-mouse game this could develop into.
Re:They'd Better Not (Score:4, Informative)
But trying to clarify that is like telling an internet user that a "cracker" broke into their computer, not a "hacker." (However, I'll note that the copyright legality clarification is probably more important than that of the cracker/hacker.)
Spyware (Score:5, Informative)
Personally, I've done 4 in 2 days. And I can tell you I'm so sick of it it's not even funny.
One was so screwed up the HOSTS file was infected with encrypted javascript. Took me 3 hours just to knock that bastard down to the point I could get explorer open in under 10 minutes.
Special thanks to everyone that fights it by writing those removers... god they are a lifesaver.
Dumb idea (Score:2, Informative)
Re:Doomed to fail. (Score:5, Informative)
You've looked at this too naively... Take around a hundred MD5s of nonoverlapping chunks of the file. If 90% of these match, you have near certainty that the files match except for exactly such tampering as you suggest.
For some files, you could get away with that. For others, particularly the highly compressed audio and video files that dominate P2P, breaking such a detection algorithm would, over time, introduce intolerable errors in the file (by the third or fourth copy, I'd say), since such changes would need to occur randomly or risk filtering by the detection algorithm V2.
Not to say we couldn't still get around such attempts to prevent downloading - Until they ban them, simply putting everything in a password-protected zip file (with the password included in a non-passworded file) would suffice for generating effectively random files (to a hash checker, anyway).
My point? Overall, this will just turn into yet another war of escalating circumventions and countermeasures, benefitting neither the content producers nor consumers.
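For anyone who wants to poke at the chunked-MD5 idea quoted at the top of this comment, here is a rough sketch in Python. The hundred chunks and the 90% threshold come from the quote; the function names (chunk_digests, match_ratio, probably_same_file) and the equal-size slicing are my own assumptions, not anything a real P2P client is known to do.

```python
import hashlib


def chunk_digests(data: bytes, n_chunks: int = 100) -> list[bytes]:
    """MD5 digest of each of roughly n_chunks non-overlapping slices."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [hashlib.md5(data[i:i + size]).digest()
            for i in range(0, len(data), size)]


def match_ratio(a: bytes, b: bytes, n_chunks: int = 100) -> float:
    """Fraction of chunk positions whose digests agree."""
    da, db = chunk_digests(a, n_chunks), chunk_digests(b, n_chunks)
    if not da or not db:
        return 0.0
    same = sum(x == y for x, y in zip(da, db))
    return same / max(len(da), len(db))


def probably_same_file(a: bytes, b: bytes, threshold: float = 0.9) -> bool:
    """Hypothetical decision rule: treat files as the same above the threshold."""
    return match_ratio(a, b) >= threshold
```

Note the weakness: flipping a few bytes in place only breaks the chunks that contain them, but inserting a single byte near the start shifts every later chunk boundary, so a fixed-offset scheme like this one sees a completely different file.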
Re:Doomed to fail. (Score:2, Informative)
No, it didn't. There are "hashing techniques" specially made for audio - "audio fingerprinting" so to speak, like Relatable [relatable.com]'s TRM [relatable.com] and Gracenote [gracenote.com]'s MusicID [gracenote.com] which do a great job of it. They identify the file correctly no matter what the source is - lossless audio CD, or even 128kbps MP3, you get the same fingerprint.
I've tried TRM personally through MusicBrainz [musicbrainz.org], and ran it on around 1000 of my MP3s, some of them really horrible quality, and it managed to identify 99% of them (TRM fingerprint correlated with actual metadata is stored at MusicBrainz). I was surprised, but yes, it did work. And this technology is rather old too, I'm surprised not too many people know about this.
And the article specifically mentions this fact: ...The experts' claims center on technology for detecting copyrighted works through "fingerprinting" (sometimes also called "hashing") technology that identifies songs by analyzing the content itself. Such technology, which is provided by several firms including Audible Magic, GraceNote, and MediaGuide...
Re:A DRM Parable (Score:5, Informative)
First they came for the Jews
And I did not speak out
Because I was not a Jew.
Then they came for the communists
And I did not speak out
Because I was not a communist.
Then they came for the trade unionists
And I did not speak out
Because I was not a trade unionist.
Then they came for me
And there was no-one left
To speak out for me.
P.S. It is an important reminder to stand for the rights of others, to stand for the rights of terrorists, murderers, child pornographers, P2P programmers, Christian fundamentalists, and for the rights of everyone else. We may disagree with people, but only in a free and tolerant society can we expect to be safe ourselves.
Re:Considering the vast amounts involved... (Score:1, Informative)
There's volume not just in the number of songs that need fingerprints, but also the number of fingerprints per song. That's a number probably nearly as large as the range of hashes... You can edit the file in your favorite audio editor, rehash, and it's a different fingerprint entirely. Bend a wave form just slightly, and your ears can't tell the difference. But as far as the fingerprinting software is concerned, it's not the same file at all. If you also consider the use of different encoding/ripping algorithms and software, the number grows even higher since each program would have slightly different output and entirely different fingerprints...
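The "edit, rehash, different fingerprint" point is easy to check for a plain byte-level hash. A minimal sketch, assuming MD5 over the raw file bytes (the byte string below is just a stand-in for a real file, and this says nothing about acoustic fingerprints that analyze the decoded audio):

```python
import hashlib

original = bytes(range(256)) * 16   # stand-in for a file's bytes
edited = bytearray(original)
edited[1000] ^= 0x01                # flip a single bit in one byte

h1 = hashlib.md5(original).hexdigest()
h2 = hashlib.md5(bytes(edited)).hexdigest()

# Count how many hex digits of the two digests disagree.
differing_hex_digits = sum(a != b for a, b in zip(h1, h2))
print(h1)
print(h2)
print(f"{differing_hex_digits} of {len(h1)} hex digits differ")
```

One flipped bit gives an unrelated-looking digest, which is why a filter keyed on exact file hashes would need a separate entry for every rip, re-encode, and edit.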
Re:Hashes aren't unique (Score:2, Informative)
Bzzt! Thanks for playing. By definition a secure hash is one where it is computationally intractable to generate data which hashes to a particular (chosen) value.
Re:Doomed to fail. (Score:4, Informative)
Of course, you could choose to ignore the low bits, and fingerprint the upper bits, but this requires the software that trades files to be able to decode any type of file going over the network. This isn't feasible because it wouldn't be hard for someone to write a strongly encrypted proprietary wrapper on existing codecs which "garbages" the data, and distribute a free package which ungarbages it. Even if it was simple for Kazaa or other services to break this and include it in the software, it would not be legal for them to distribute the decryption with their software. If somehow it became legal, it would be simple for someone else to release a new one next week. And another new one the week after that.
The point is that this would start a tit-for-tat war. I guarantee any fingerprinting technique someone can think of, someone else can defeat with ease, and the concept of wrapping files in another program will put the highest-volume copyright traders a few steps ahead of content filtering, ad nauseam.
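To make the "ignore the low bits, fingerprint the upper bits" idea concrete, here is a toy sketch. It assumes you already have decoded 16-bit PCM samples, which is exactly the requirement objected to above; the function name upper_bits_digest and the choice of dropping 4 bits are made up for illustration.

```python
import hashlib
import struct


def upper_bits_digest(samples: list[int], drop_bits: int = 4) -> str:
    """Hash 16-bit PCM samples after zeroing the lowest drop_bits of each."""
    mask = 0xFFFF & ~((1 << drop_bits) - 1)
    # Map signed samples to their unsigned 16-bit form, then clear the low bits.
    coarse = [(s & 0xFFFF) & mask for s in samples]
    packed = struct.pack(f"<{len(coarse)}H", *coarse)
    return hashlib.md5(packed).hexdigest()
```

Two sample streams that differ only in the dropped bits get the same digest, while any change that reaches the upper bits (or any wrapper that keeps the filter from decoding the audio at all) gives a different one.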
Lower retail prices... there's a thought. (Score:2, Informative)
"copyrighted" isn't really the point (Score:2, Informative)
This is actually about copyrighted content that authors wish to control .. not "copyright" simply as such. That's why the Creative Commons Project [creativecommons.org] is so important.