United Kingdom AI

Creatives Demand AI Comes Clean On What It's Scraping 18

Over 400 prominent UK media and arts figures -- including Paul McCartney, Elton John, and Ian McKellen -- have urged the prime minister to support an amendment to the Data Bill that would require AI companies to disclose which copyrighted works they use for training. The Register reports: The UK government proposes to allow exceptions to copyright rules in the case of text and data mining needed for AI training, with an opt-out option for content producers. "Government amendments requiring an economic impact assessment and reports on the feasibility of an 'opt-out' copyright regime and transparency requirements do not meet the moment, but simply leave creators open to years of copyright theft," the letter says.

The group -- which also includes Kate Bush, Robbie Williams, Tom Stoppard, and Russell T Davies -- said the amendments tabled for the Lords debate would create a requirement for AI firms to tell copyright owners which individual works they have ingested. "Copyright law is not broken, but you can't enforce the law if you can't see the crime taking place. Transparency requirements would make the risk of infringement too great for AI firms to continue to break the law," the letter states.
Baroness Kidron, who proposed the amendment, said: "How AI is developed and who it benefits are two of the most important questions of our time. The UK creative industries reflect our national stories, drive tourism, create wealth for the nation, and provide 2.4 million jobs across our four nations. They must not be sacrificed to the interests of a handful of US tech companies." Baroness Kidron added: "The UK is in a unique position to take its place as a global player in the international AI supply chain, but to grasp that opportunity requires the transparency provided for in my amendments, which are essential to create a vibrant licensing market."

The letter was also signed by a number of media organizations, including the Financial Times, the Daily Mail, and the National Union of Journalists.

  • Okay. (Score:5, Informative)

    by Mr. Dollar Ton ( 5495648 ) on Tuesday May 13, 2025 @01:44AM (#65372415)

    Here it is: it SCRAPES EVERYTHING!11!!!

    • by evanh ( 627108 )

      What is needed is for those LLM companies to state it as such. As in, "Yes, we're knowingly stealing everything."

      • When they think they've offset the risks, they will even boast about it - "we have stolen everything humanity has done and more, buy our electric brain".

  • Lost count (Score:5, Interesting)

    by ve3oat ( 884827 ) on Tuesday May 13, 2025 @02:02AM (#65372431) Homepage
    I have lost count of the number of times my own website has been scraped, sometimes by the big names and sometimes by bots that I didn't know existed. It's not just the HTML content that they scrape but often all of my original graphics (charts and diagrams) too. Of course my site is covered by copyright (Creative Commons, Attribution, Non-commercial, and Share-Alike 4.0) but I doubt that ChatGPT will ever tell you that it got this or that piece of information from me or my website. Some say that "search is dying" and I see some evidence of this in the falling number of organic visitors to my site. Maybe I should just take down my site, save a pile of money in domain registration and hosting fees, and move on to some other, more satisfying activity.
    • Just make the site text only with simple html and with a lot of hidden metadata. Any images with alt text should have texts like 'tricky dick', 'shaved kitten' and other improper descriptions.

      Add some throttling too that gets worse the more requests are made. HTTP redirects can also be fun to toy with, just redirect to a random AI source making it eat itself.
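
      A minimal sketch of the "throttling that gets worse" idea, not tied to any particular web server. The class name `ProgressiveThrottle`, its parameters, and the doubling rule are all hypothetical choices made for illustration: each client gets a few free requests per time window, after which the suggested delay doubles with every extra request up to a cap.

```python
from collections import defaultdict


class ProgressiveThrottle:
    """Suggest a delay that grows as one client makes more requests.

    Hypothetical helper for illustration; all parameter names and
    defaults are assumptions, not taken from any real server.
    """

    def __init__(self, free_requests=10, base_delay=0.5, max_delay=30.0, window=60.0):
        self.free_requests = free_requests  # requests allowed per window with no delay
        self.base_delay = base_delay        # delay (seconds) for the first excess request
        self.max_delay = max_delay          # upper bound on the delay
        self.window = window                # sliding window length in seconds
        self._hits = defaultdict(list)      # client id -> timestamps of recent requests

    def delay_for(self, client_id, now):
        """Record a request at time `now` and return the delay to apply."""
        # Keep only the timestamps that still fall inside the sliding window.
        recent = [t for t in self._hits[client_id] if now - t < self.window]
        recent.append(now)
        self._hits[client_id] = recent

        excess = len(recent) - self.free_requests
        if excess <= 0:
            return 0.0
        # Double the delay for each request beyond the free allowance.
        return min(self.base_delay * 2 ** (excess - 1), self.max_delay)


throttle = ProgressiveThrottle(free_requests=2, base_delay=1.0, max_delay=8.0)
delays = [throttle.delay_for("bot", now=float(i)) for i in range(7)]
print(delays)
```

      How the delay is applied is left open here: a server could sleep for that long before answering, or refuse the request and report the delay to the client. Keying on something like the client IP is also an assumption; scrapers that rotate addresses would need a different key.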

  • Muccah and his dinosaur friends have profited off society for decades. They can demand to know all they want, no company should be required to expose their trade secrets to them. It's an absurd idea that you'd have to inform some demented boomer which book you fed to a computer program on your machine. And an absurd precedent: next will be college book authors suing you for using their knowledge against their license. Or self-help authors suing you for giving unlicensed advice to a friend.

    Copyright is a lim

    • Re:Go away (Score:4, Interesting)

      by martin-boundary ( 547041 ) on Tuesday May 13, 2025 @02:33AM (#65372457)
      Copyright is no such thing. Maybe in America, maybe a hundred years ago. This story is about UK copyright. Completely different purpose, completely different laws, completely different copyright.

      As to the idea that it should be ridiculous to demand what companies are doing internally, see also: taxes, income declarations, VAT collection, etc.

    • You mean theft secrets.

      It's not just about musicians. It's also about everyday Joe and small companies who have their IP stolen without consent or compensation. It's predatory and parasitic.

      • You mean theft secrets.

        It's not just about musicians. It's also about everyday Joe and small companies who have their IP stolen without consent or compensation. It's predatory and parasitic.

        It's not theft. Nothing was taken. Everything is still with the original owner.

        At least that's what we're told when someone steals music, movies, or software.

  • It's such an odd thing to be upset by, honestly. Like screaming into the void, "I want to be forgotten."

    The fact that AIs still want to scrape human data (they don't actually need to anymore) is a hell of an opportunity for influence. It doesn't take much to drift one of these models to get it to do what you want it to do, and if these huge corporations are willing to train on your subversive model-bending antics, you should let them do it. We'll only get more interesting models out of it.

    I get it though.

  • When you start stopping real people from also using your material to train, then I think they may have a point. As it is, they are just scared they are being made redundant (justifiably so, I guess) and want to discriminate. If you want your stuff out there to listen to and consume, you are going to have to accept that AI will potentially consume it too.
  • ... then flounce off stage swishing their ostrich boas and stamping their Cuban heels. Send in the lawyers.

  • The British initiative has some good points. It's a bit strange that many parts of society have accepted widespread copyright infringement and ignoring of /robots.txt when done by corporations to feed their generative (or plagiarising, depending on your point of view) models, while file sharing, even small-scale, was met with harsher penalties, IP blocking and law changes that mostly help the large corporations.

    It would probably be much better the other way round - that some regulated personal non-profit sharin

      • by pereric ( 528017 )

        Well, maybe do some machine learning data injection (which is a fairly accessible class of attack) in material scraped without permission to make their generative models output images of Tiananmen Square, Winnie the Pooh or maps with neither Taiwan nor Tibet :-)?

        Also, as they at least on paper claim to follow international treaties, it could be a more reasonable case for applying some trade policy.

  • The front page is almost entirely stories about AI. I get it. AI makes line go up. Stories about AI makes line go up.

    But, other things happen in IT and geek culture these days that don't involve line-goes-up at all.

  • by devslash0 ( 4203435 ) on Tuesday May 13, 2025 @05:46AM (#65372743)

    It's too late now. AI has already consumed and sucked the value out of every available source. Trying to enforce your rights now is futile. After all, how do you prove with certainty what the training set consisted of?

    AI companies should have been forced to seek legal written consent before scraping any data.

    Instead, they have followed the good ol' maxims of "it's better to seek forgiveness than permission" and "damages cost less than consent".
