The Almighty Buck The Internet The Media

Web Copyright Crackdown On the Way 224

Hugh Pickens writes "Journalist Alan D. Mutter reports on his blog 'Reflections of a Newsosaur' that a coalition of traditional and digital publishers is launching the first-ever concerted crackdown on copyright pirates on the Web. The crackdown will initially target violators who republish large numbers of intact articles: the first offending sites will be those using 80% or more of copyrighted stories more than 10 times per month. In the first stage of a multi-step process, online publishers identified by Silicon Valley startup Attributor will be sent a letter informing them of the violations and urging them to enter into license agreements with the publishers whose content appears on their sites. In the second stage, Attributor will ask hosting services to take down pirate sites. 'We are not going after past damages' from sites running unauthorized content, says Jim Pitkow, the chief executive of Attributor. The emphasis, Pitkow says, is 'to engage with publishers to bring them into compliance' by getting them to agree to pay license fees to copyright holders in the future. Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator. Attributor says it can interdict the revenue lifeline of any offending site in the world." One possible weakness in Attributor's business plan, unless they intend to violate the robots.txt convention: they find violators by crawling the Web.
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • Re:Robots.txt (Score:1, Informative)

    by Anonymous Coward on Friday March 05, 2010 @10:55AM (#31370938)

    Seriously. Following robots.txt is not law, only convention.

    Unauthorised access to a computer system isn't against the law? Which country are you talking about? robots.txt is the standard method to express that certain access methods are not authorised.

  • by fuzzyfuzzyfungus ( 1223518 ) on Friday March 05, 2010 @11:17AM (#31371232) Journal
    Since, as you say, robots.txt will likely do nothing against them, the bigger question becomes "how do they plan to do their crawling?" Crawling from a well-defined IP block, using software with the user agent Attributor_copy_cop, would be laughably simple to block, or to feed false, non-infringing content.

    Spoofing the UA strings and (if necessary) some of the behavior of common web browsers is a simple software problem, so I assume that they'll do that (unless they are terminally incompetent). Out of curiosity, though, does anybody know how easy and cheap it would be (using legitimate methods, not botnet-style stuff) for such a commercial entity to obtain a reasonably large number of, ideally "residential-looking", IPs that change fairly often? Do you just call Verizon and say "I want 500 residential DSL lines brought out to so-and-so location"? Would you obtain the services of one of the sleazy datacenter operators who cater to spammers and the like and know how to switch IP blocks frequently? Do you pay to have second lines installed at your employees' houses, with company scanner boxes attached?
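    The naive blocking the parent describes is a few lines of code, which is exactly why it's so easy for the crawler to evade. A minimal sketch, assuming a hypothetical blocklist; the UA string "Attributor_copy_cop" comes from the parent post and the IP range is a documentation placeholder, not Attributor's actual address space:

    ```python
    # Sketch: drop requests whose User-Agent or source IP is on a blocklist.
    # The UA string and network below are hypothetical placeholders.
    import ipaddress

    BLOCKED_UAS = {"Attributor_copy_cop"}                     # parent's hypothetical UA
    BLOCKED_NETS = [ipaddress.ip_network("203.0.113.0/24")]   # placeholder range

    def is_blocked(user_agent: str, remote_ip: str) -> bool:
        """Return True if the request should be refused."""
        if user_agent in BLOCKED_UAS:
            return True
        addr = ipaddress.ip_address(remote_ip)
        return any(addr in net for net in BLOCKED_NETS)
    ```

    Both checks fail the moment the crawler spoofs a browser UA and rotates IPs, which is the parent's point.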
  • by yourlord ( 473099 ) on Friday March 05, 2010 @11:29AM (#31371388) Homepage

    I welcome them to crawl my sites and ignore my robots.txt files. They won't get very far though. When my server detects that behavior it passes the IP to my firewall which adds it to the "drop these packets into a black hole" list.

    I have quite a large table of IP addresses of idiots that violated robots.txt.
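    The detection the comment above describes is typically a robots.txt honeypot: list a trap path under "Disallow:" in robots.txt, then treat any client that fetches it anyway as a violator. A minimal sketch, with an illustrative path and the firewall handoff left as a callback (in a setup like the commenter's it would presumably run something like "iptables -A INPUT -s <ip> -j DROP"):

    ```python
    # Sketch of a robots.txt honeypot. robots.txt contains:
    #     Disallow: /private/
    # so any client requesting a /private/ URL is ignoring robots.txt.
    def make_trap_handler(ban_ip):
        banned = set()
        def handle(path: str, remote_ip: str) -> bool:
            """Record violators and report whether this client is banned."""
            if path.startswith("/private/") and remote_ip not in banned:
                banned.add(remote_ip)   # caught ignoring robots.txt
                ban_ip(remote_ip)       # hand the IP to the firewall drop list
            return remote_ip in banned
        return handle
    ```

    Nothing links to the trap path, so ordinary visitors and well-behaved crawlers never hit it; only robots.txt violators do.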

  • Re:Robots.txt (Score:3, Informative)

    by mea37 ( 1201159 ) on Friday March 05, 2010 @11:46AM (#31371602)

    Really?

    Do you also believe that ToS violations constitute unauthorized access to a computer? That approach was recently tried by U.S. prosecutors [cnet.com]. Ultimately, the court didn't buy that position.

    So... why would robots.txt, which advises me of your wishes but to which I never actually agree, carry any more legal authority than a ToS document to which I do supposedly agree as a condition of using your system?

  • by bcrowell ( 177657 ) on Friday March 05, 2010 @11:47AM (#31371610) Homepage
    I've had an experience with Attributor myself, and it's given me a pretty low opinion of them. I'm the author of a CC-BY-SA-licensed calculus textbook, titled "Calculus." Someone posted a copy of the pdf on Scribd, as allowed by the license. So one day I got an email from one of the people who runs Scribd, saying that Attributor had sent them a takedown notice, which they were skeptical about. Attributor hadn't supplied any useful information about what they thought was a violation. I called Scribd, and they checked and said it was a mistake -- they were working for Macmillan, which publishes another book titled "Calculus." So here they were, serving a DMCA notice under penalty of perjury, and they hadn't even checked whether the name of the author was the same, or whether any of the text was the same. Their bot just found that the title, "Calculus," was the same as the title of one of their client's books. Pretty scummy.
  • Re:Robots.txt (Score:2, Informative)

    by DaTroof ( 678806 ) on Friday March 05, 2010 @11:53AM (#31371688)

    after purchasing a licence to use the search engine's data, naturally :)

    Depending on the search engine and its terms of service, they might not even need to purchase a license. Google, Bing, and Yahoo all provide search APIs for third-party software.
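    A scanner built on a search API would mostly be constructing phrase queries against someone else's index rather than crawling itself. As a sketch of the idea, here is a hypothetical query builder; the endpoint and parameter names are placeholders, not any real provider's API, and each provider's actual API has its own URL, key scheme, and terms of service:

    ```python
    # Build a quoted-phrase search query against a hypothetical search API.
    # Endpoint and parameter names are illustrative placeholders.
    from urllib.parse import urlencode

    SEARCH_ENDPOINT = "https://api.example.com/search"

    def build_search_url(phrase: str, api_key: str, page: int = 0) -> str:
        # Quoting the phrase asks the engine for verbatim matches, which is
        # how a copyright scanner would hunt for lifted passages of an article.
        params = {"q": f'"{phrase}"', "key": api_key, "start": page * 10}
        return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
    ```

    Doing it this way sidesteps robots.txt entirely, since the search engine already did the crawling.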

  • by bcrowell ( 177657 ) on Friday March 05, 2010 @11:59AM (#31371784) Homepage
    Oops, important correction to the parent post: "I called Attributor, and they checked and said it was a mistake -- they were working for Macmillan..."
  • by cpghost ( 719344 ) on Friday March 05, 2010 @12:11PM (#31371934) Homepage
    According to this [wikipedia.org], only Australia, Canada, USA, EU, Japan, South Korea, Mexico, Morocco, New Zealand, Singapore and Switzerland are currently part of that treaty. This (currently) leaves more than enough room for a whole lot of other countries (some of them as big as Russia and China) that are not part of it.
  • by Anonymous Coward on Friday March 05, 2010 @12:41PM (#31372322)

    Disclaimer: a Slashdot forum discussion is no substitute for professional legal advice; seek professional advice if you need it.

    To be a valid 17 USC 512 (c) takedown notice, it has to clearly identify the infringing content, i.e. with a link. If it doesn't, that's not a takedown, it's just an angry email.

    Also, it does require a 'good-faith belief' that the material in question is infringing (i.e., that is not the perjury part), a statement on penalty of perjury that distribution of the material is not authorised by the copyright holder (that one could get some of the agents behind the recent takedowns of music blogs in hot water, because some of those distributions definitely have been authorised by the copyright holders), and a statement that they are either the copyright holder or the copyright holder's appointed agent.

    They should NEVER be sending out takedowns based on the whim of a bot with no human oversight; that represents overt negligence, not a good-faith belief, and fraud on their part as they claim to the copyright holders that they never do this.

    Please post the DMCA takedown in question to Chilling Effects, and contact Macmillan directly to inform them of the unfounded, mistaken threats Attributor are making in their name.

    I don't know about you, but we send out an invoice for costs for every false DMCA takedown we receive. (So far they have all been misidentifications, some of them repeated, and we have contacted the copyright holder of that particular work directly - as a result, they are no longer using the services of Mark Ishikawa's "BayTSP".) Legislation in my country may change to reflect this, although much of the rest of it is going to go down the pan, I don't doubt.

  • Re:Robots.txt (Score:3, Informative)

    by Crudely_Indecent ( 739699 ) on Friday March 05, 2010 @01:38PM (#31372980) Journal

    Anyone interested in finding out what's really going on with a website would look at robots.txt first and ask themselves 'now why do they want the robots to avoid these pages?'

    Of course, some of those entries will be dead ends (dynamic pages that make no sense to crawl, password-protected pages that would detract from a site's rankings, etc.).

    What's going to be interesting is what happens when their crawling method and/or the IP addresses they use for detection are identified. A bot has no way to bypass .htaccess-type restrictions. If their bot identifies itself (or can be identified), or their IP range(s) can be identified, site owners can become invisible to the copyright bot and/or the agency tasked with detecting violations.

    A clever administrator might even build a script to deliver alternate content to the bot/agency so as to not appear suspiciously invisible.

    The exact method to thwart their efforts hinges on exactly how they detect the violations. It can definitely be done.
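    The "alternate content" trick above is ordinary cloaking, just pointed at a copyright bot instead of a search engine. A minimal sketch; the detection predicate is deliberately a placeholder, since (as the comment says) everything hinges on how the bot can actually be identified:

    ```python
    # Serve a plausible decoy page to an identified copyright bot instead
    # of a 403, so the site doesn't look suspiciously invisible to it.
    # The looks_like_copy_bot predicate is a placeholder for whatever
    # fingerprint (UA, IP range, behavior) actually identifies the bot.
    def choose_content(user_agent: str, remote_ip: str,
                       real_page: str, decoy_page: str,
                       looks_like_copy_bot) -> str:
        if looks_like_copy_bot(user_agent, remote_ip):
            return decoy_page   # harmless, non-infringing stand-in content
        return real_page
    ```

    Returning a normal-looking page rather than an error is the whole point: a block is detectable, a decoy is not.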
