AI The Media

Newspapers Want Payment for Articles Used to Power ChatGPT (msn.com) 151

An anonymous reader shared this report from the Washington Post: For years, tech companies like OpenAI have freely used news stories to build data sets that teach their machines how to recognize and respond fluently to human queries about the world. But as the quest to develop cutting-edge AI models has grown increasingly frenzied, newspaper publishers and other data owners are demanding a share of the potentially massive market for generative AI, which is projected to reach $1.3 trillion by 2032, according to Bloomberg Intelligence.

Since August, at least 535 news organizations — including the New York Times, Reuters and The Washington Post — have installed a blocker that prevents their content from being collected and used to train ChatGPT. Now, discussions are focused on paying publishers so the chatbot can surface links to individual news stories in its responses, a development that would benefit the newspapers in two ways: by providing direct payment and by potentially increasing traffic to their websites. In July, OpenAI cut a deal to license content from the Associated Press as training data for its AI models. The current talks also have addressed that idea, according to two people familiar with the talks who spoke on the condition of anonymity to discuss sensitive matters, but have concentrated more on showing stories in ChatGPT responses.

Other sources of useful data are also looking for leverage. Reddit, the popular social message board, has met with top generative AI companies about being paid for its data, according to a person familiar with the matter, speaking on the condition of anonymity to discuss private negotiations. If a deal can't be reached, Reddit is considering blocking search crawlers from Google and Bing, which would prevent the forum from being discovered in searches and reduce the number of visitors to the site. But the company believes the trade-off would be worth it, the person said, adding: "Reddit can survive without search."

"The moves mark a growing sense of urgency and uncertainty about who profits from online information," the article argues. "With generative AI poised to transform how users interact with the internet, many publishers and other companies see fair payment for their data as an existential issue."

They also cite James Grimmelmann, a professor of digital and information law at Cornell University, who suggests OpenAI's decision to negotiate "may reflect a desire to strike deals before courts have a chance to weigh in on whether tech companies have a clear legal obligation to license — and pay for — content."

  • by Potor ( 658520 ) <farker1@gmai l . com> on Sunday October 22, 2023 @11:43AM (#63943637) Journal
    I work in a niche field, and currently ChatGPT spits out platitudes and nonsense concerning it. If it ever gets better, it's because of my work, and my colleagues. Where do I get my cheque?
    • Re:Me too (Score:5, Insightful)

      by Wrath0fb0b ( 302444 ) on Sunday October 22, 2023 @12:47PM (#63943757)

      My children study things off the internet and incorporate it into their knowledge stored in their brains.

      Do they owe you a chunk of their future paycheck or is this restricted to inorganic computation only?

      • by Potor ( 658520 )
        Are your children building a business model based on the work of others?
        • Re: (Score:3, Insightful)

          by gwjgwj ( 727408 )
          Probably so. That is the point of education, isn't it?
        • EVERY business model with employees is based off the work of others. That's how businesses work, in general.

          • by Potor ( 658520 )
            Most businesses pay their workers, and for their resources. But still - his children are not using publications in the same way as ChatGPT.
            • They aren't? They are synthesizing the information into connections within a neural network and then using that network in the future.

              It's not identical, of course, but it's similar enough that appeals to treating them differently need to be justified (IMHO).

        • Yes.
          “If I have seen further, it is by standing on the shoulders of giants.”

          https://en.wikipedia.org/wiki/... [wikipedia.org]
        • Are your children building a business model based on the work of others?

          In the simplest sense, isn't that what everyone does in some way or form?

        • Are your children building a business model based on the work of others?

          Isn't everyone's business model based off of the work of (at least one of) Newton, Watt, Kelvin, Otto, von Neumann, Turing, Alexander Fleming or Louis Pasteur?

          Like, isn't this the entire premise of civilization, that everything we do, business and otherwise, is based on the work that other people did before? It's not like every generation is reinventing the wheel.

          • by Potor ( 658520 )
            This is tiresome - so many people (or bots?) making the same comment. How hard is it to grant that studying - especially children studying at school - is not a business model? Bad analogies are not clever.
    • Re:Me too (Score:4, Insightful)

      by Pinky's Brain ( 1158667 ) on Sunday October 22, 2023 @12:52PM (#63943765)

      Tell your boss to add footers to papers published by their researchers :

      "We don't consider the copying of this text to train language model fair use nor allowed when not approved explicitly by license. Copying is prima facie required in training and without license constitutes statutory infringement at the very least, regardless of whether the model and output is derivative. This is just a reminder and remains valid for all content which does not contain this text, except when explicitly licensed or in the public domain."

      Every big website should do this right the fuck now. Give the AI weenies no quarter.

      • by gwjgwj ( 727408 )
        Does it mean you are not allowing it to be read by humans? You know, by reading it you are training your language model.
        • Obviously, that's something that a court will have to decide. Do AI models count as persons, with similar rights, or are they just tools?
      • "We don't consider the copying of this text to train language model fair use

        I mean, that wouldn't be the boss' choice - that is decided in the courts, and whether it is a fair use or not is still up in the air. (This would also age like milk if it was ruled to be a fair use).

      • by bool2 ( 1782642 )

        "We don't consider ... this ... fair use."

        Fair use is a term defined in law so why does what *you* consider is or isn't fair use matter? Please explain why the copying that occurs when an AI reads something to train its neural network is different to the copying that occurs when a human reads something to train his?

    • Do you get paid for people reading your texts? If no, then probably not.
      • by Potor ( 658520 )
        People buy my books, as do libraries. Does that count?
      • If the AI reads the ads, does that count as paying to read the page?

        If we see blocks of text output of AI bots interspersed with mentions of car warranties and viagra, then it probably read the ads.
  • by xack ( 5304745 ) on Sunday October 22, 2023 @11:47AM (#63943653)
    Science fields that are open access friendly tend to make more progress than fields that are paywalled (and now AI-walled). Wikipedia's and Google's massive successes depended on easy crawling and open gates. It seems that newspapers want to go back to the time before the printing press and the subsequent Age of Enlightenment and Industrial Revolution that followed.
    • Sure but science (especially the subset that is open access) gets this privilege because it is funded via academia, grants and public spending subsidies. Science that is privately owned and for profit is nowhere near as "open".

      Journalism is in a precarious position as it is an important, nay vital part of the information pool the world uses but most are constrained by funding since (at least in the US and many parts of the world) these are private entities and they require revenue to operate and they don't

      • If it's not open then they don't have to worry about bots now do they? Reporters, investigators, editors, these people need to realize they don't get to control information after they've published it in an open fashion. They can assess their own business model and adapt accordingly. If they want to continue publishing in a fashion that enables the whole world to read their works they will just have to get over the fact that machines are part of that world and will consume it as well.
        • Bro we are on Slashdot, we should know the nuance here. This is the difference between "free as in speech" and "free as in beer".

          Just because in this case the reporting is "open" for readers like you and me does not mean it's open for other for-profit entities to take it and re-package and re-sell it with missing attribution and no flow of funds back to keep that root source of information going. Do Google and OpenAI have on the ground reporters in Gaza right now getting the stories? No? Well then why shou

  • Already paid (Score:5, Insightful)

    by sosume ( 680416 ) on Sunday October 22, 2023 @11:48AM (#63943655) Journal

    The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.

    • The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.

      Not in the least. Newspapers own the copyright on the articles they publish and they don't have to put anything in their terms of service to enforce it. When you visit a newspaper website you pay for access to the articles, either by (theoretically) watching ads or by paying for a subscription. That access does not include permission to copy the articles and republish them. A search engine has some leeway under fair-use exceptions that allows them to quote portions of text. They do not have the legal r

      • Copyright comes with fair use. Reading an article that is openly published is not a copyright violation. If I read a newspaper in a cafe while enjoying a coffee I'm not committing a copyright violation.
        • Re:Already paid (Score:4, Insightful)

          by cstacy ( 534252 ) on Sunday October 22, 2023 @04:08PM (#63944063)

          Reading an article that is openly published is not a copyright violation. If I read a newspaper in a cafe while enjoying a coffee I'm not committing a copyright violation.

          Because you are not "copying" it. The copy that your eyes and brain are making is not considered a "copy". If you scan that newspaper article into your laptop, that would be a Copyright violation. Even if you don't further "publish" it. The act of copying is a crime, that's a criminal violation, you are a criminal.

          And that's what the "AI" trainers are doing. A copy is necessarily made -- the article is scanned in.

          The fact that they proceed to put the copy into some kind of blender and pour it out later doesn't matter. Except insofar as copyrightable fragments can be identified. That would be the act of making additional copies, which would be additional crimes, each time it was output to a user. But ChatGPT doesn't output recognizable fragments. It's the initial copy made for training that is the crime.

          The "AI" trainers are trying to claim certain exceptions to the law, based on the nature of the output. But they don't really hold up. You can't get around the fact that the training copy was unauthorized.

          The "AI" companies are in for some trouble; but they have pulled this off very quickly already, and have unfathomable amounts of money and power behind them. It will be very difficult for society to achieve the obviously correct legal (and moral) outcome.

          The outcome is that every single bit of text that went into the AI training represents a Copyright violation of at least several thousand dollars. So for the billions of inputs, ChatGPT owes authors, collectively, perhaps trillions of dollars. (We're just talking statutory damages. If the authors can show other damages, the penalties go up.) That's civil. There are also potential criminal issues.

          Do you think the courts and governments and plaintiffs will figure out what to do with such a mess? And suppose that a few newspapers can make a settlement and licensing arrangement. Where does that leave the publishers of every book and article? And at the end of the day, will the authors ever be compensated, too, for this massive novel use of their work?

          When you do "disruptive" illegal things, do them as quickly and massively as possible. By the time society and law and government catch up with you, your power position will be overwhelming and everyone will accept the new normal. You won't pay for what you have done to people. There might be a tiny penalty, but in general the law will be adjusted for you. Your profits will be spectacular. The world will thank you for the ass raping and ask you for another, please.

          • Sorry but using your logic, everything you read on the Internet is a copyright violation. After all, the article needs to be copied into memory in order to display it on your screen. Copyright gives the owner certain privileges. However, it does not allow the holder to arbitrarily expand that list of privileges to include whatever they dream up.

            • In fact, that very issue got tied up in court for a while. The courts carved out a new fair use exemption because caching is necessary for the internet to work.
            • Sorry but using your logic, everything you read on the Internet is a copyright violation.

              As much as I agree with your position, the GP is right.

              Any electronic copy is technically a violation of the rightsholder's copyright(s) under the law. It's for this reason that all software licenses (especially proprietary ones) come with language that says something to the effect of "temporary copies made for the purposes of running / using the content in compliance with the provisions of this license are permitted." If they didn't, then simply installing or running the software would allow the rightsh

          • Because you are not "copying" it. The copy that your eyes and brain are making is not considered a "copy".

            The "AI" trainers are trying to claim certain exceptions to the law, based on the nature of the output. But they don't really hold up. You can't get around the fact that the training copy was unauthorized.

            What do you believe is the relevant legal difference? Where in copyright law are human beings differentiated from machine learning algorithms?

          • by Pieroxy ( 222434 )

            Scanning an article is not a copyright violation. The violation comes with what you do with your copy. If it's just to store privately, you're fine. If you plan on distributing it (commercially or not) then you're in trouble.

          • by AmiMoJo ( 196126 )

            There are issues about the output of these AI tools as well.

            There was a case in the UK a few years back where a photographer won a copyright infringement claim against someone for making a very similar photograph to his. It was of a red London bus crossing a bridge. The other person went to the same spot and used the same composition with a similar red bus. The court found that they had copied the essential elements of the photo.

            AI image generators that produce output which are very similar to work they wer

        • by jvkjvk ( 102057 )

          Fair use and reading a published article have little to do with each other. When reading, if you aren't making a copy yourself, YOU are not committing copyright violation. Unless it's one of those "contagion" laws where if you are in the chain anywhere you are in violation, but I don't think so.

      • Not in the least. Newspapers own the copyright on the articles they publish and they don't have to put anything in their terms of service to enforce it. When you visit a newspaper website you pay for access to the articles, either by (theoretically) watching ads or by paying for a subscription. That access does not include permission to copy the articles and republish them.

        It is not necessary to seek permission to remember facts and benefit from information presented in an article. Neither is there such a thing as copyright on facts.

        They do not have the legal right to copy and republish the entire article.

        They have all the right in the world to benefit from and produce transformative works based on information learned from an article.

        Newspapers have been selling access to their historical records for decades. This is nothing new to them or to the courts. If the AI engines want to build databases off published articles en masse they are going to have to pay up.

        If I pay a know-it-all to sit down and answer my questions from memory they have no legal requirement to compensate their original sources or to get their permission to divulge information learned from articles the kn

        • It is not necessary to seek permission to remember facts and benefit from information presented in an article.

          Correct. Learning is not copying.

          Neither is there such a thing as copyright on facts.

          Correct. Copyright applies to text, not ideas. Let's consider a situation where you want to reference a dictionary definition. Dictionary definitions are copyright protected, but fair use generally allows quoting short sections of text. Let's assume the definition is three paragraphs long, so you aren't sure you can copy it verbatim. If you restate the definition you are not copying, but creating a new product.

          They have all the right in the world to benefit from and produce transformative works based on information learned from an article.

          Correct. Learning is not copying.

          If I pay a know-it-all to sit down and answer my questions from memory they have no legal requirement to compensate their original sources or to get their permission to divulge information learned from articles the know-it-all has previously read.

          Correct. Learning is not copy

          • by cowdung ( 702933 )

            "Learning" in the sense of Machine Learning is transformative, not derivative.

            In fact, as others have stated, GPT's main value is its ability to "understand" things, rather than its ability to recall facts or quotes from articles or books.

            • GPT's main value is its ability to "understand" things, rather than its ability to recall facts or quotes from articles or books.

              Those who believe a large language model is capable of understanding are mistaken. Someone shared this link [stephenwolfram.com] to an article by Stephen Wolfram last week. It's a good quick overview of how large language models work.

    • The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.

      Very true.

      The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?

      It's just algorithmically generated quasi content that reads like a 9th grader trying to come up with a 1000 word essay by padding out 100 words with 90 percent meaningless puff.

      I gave a try at this incredible and disruptive technology that was going to eliminate almost everyone's jerbs, and after laughing a bit at the inanity, it w

      • by Ksevio ( 865461 )

        I love how every time there are these articles about ChatGPT someone always posts a snarky reply about how THEY tried it and it was nowhere near good enough to be useful (and gosh, just so slow!).

        Meanwhile, the next generation of workers are already learning how to use it effectively and improving their productivity.

        • I love how every time there are these articles about ChatGPT someone always posts a snarky reply about how THEY tried it and it was nowhere near good enough to be useful (and gosh, just so slow!).

          Some folks get a bit triggered when their ox gets gored, eh?

          Meanwhile, the next generation of workers are already learning how to use it effectively and improving their productivity.

          Okay, you being an expert on ChatGPT, I'll ask you to educate me in the error of my ways. You should be easily able to provide us all with their successes, and references to them. Thanks for letting us know. None of us want to be saying the wrong things, so it is your task to show us its success.

          • How's these apples? You're both talking nonsense.

            ChatGPT *IS* producing SOME high value work. It's not perfect, some of it is absolute gibberish, but it's a technology that's barely a year old building on a theory that's barely 6 years old. Things are moving extremely fast.

            But precisely because the technology is producing high value work is why it's a net loss for humanity. It's high value, but it's not human value. The internet is clogging up with machine generated drivel that's well written enough that it's replacing

            • FWIW, I'm already using chatGPT to delay hiring another person in my solo practice. When I do hire, I will expect the person to use chatgpt to handle the noise level work like simple coding and business emails.

              ----

              The future arrives too soon and in the wrong order. Capitalism has been shedding workers through technology, financial engineering and exploiting cost differentials around the globe. LLM systems are only accelerating the rate at which capitalism sheds workers.

              The natural outcome is that we will ha

          • Here are four ways I use ChatGPT successfully.

            1. Summarizing articles: I don't have time to read. If the summary isn't interesting, I move on. If it is, I go to the original material. Saves me a lot of time.
            2. Generating a skeleton for writing: There is an awful lot of boilerplate in writing and ChatGPT can crank it out like nobody's business and I can tell what to keep, what to toss, and what to change. ChatGPT is more consistent than I am through a body of work but it also lets me capture my thoughts, and

      • by cstacy ( 534252 )

        The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?

        They don't want "credit". They want to be paid each time their material is ingested ("copied" is the legal term) into the "AI". (They don't care how it is used after that, unless recognizable literal fragments are going to be output - which would be additional "copying".)

        The "AI" company illegally made copies of the newspapers. The newspapers want their statutory (or other) damages as specified by the Copyright law. Going forward after that, the newspapers probably will offer the "AI" company a license for l

        • The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?

          They don't want "credit". They want to be paid each time their material is ingested ("copied" is the legal term) into the "AI".

          You know what I mean. Anyhow, as I noted before, I'm certain that other groups will provide their own news for free, so the US news outlets will be able to breathe easily.

          Think maybe we should re-visit the old idea that posting links on the internet should be made illegal?

      • by hawk ( 1151 )

        >It's just algorithmically generated quasi content that reads like a 9th
        >grader trying to come up with a 1000 word essay by padding out
        >100 words with 90 percent meaningless puff.

        Wait, I thought that was the New Yorker.

        Oh, I see now. There's about a two-grade-level difference . . .

        hawk

        • It's just algorithmically generated quasi content that reads like a 9th grader trying to come up with a 1000 word essay by padding out 100 words with 90 percent meaningless puff.

          Wait, I thought that was the New Yorker.

          Oh, I see now. There's about a two-grade-level difference . . .

          hawk

          Boom! 8^)

    • TOS don't mean anything unless the content is walled and requires explicit agreement with those terms to access it. You don't automatically agree to some obscure terms posted on a different URL to consume the URL you are on.
    • You think the cats are out of the hat? Just wait till they get to the VOOM!

      How that VOOM made us ZOOM...

  • What if (Score:5, Insightful)

    by Valgrus Thunderaxe ( 8769977 ) on Sunday October 22, 2023 @11:48AM (#63943659)
    I read their articles (or in the case of books) and used this information to become successful and make money? Do I owe the publishers or authors a royalty? How about the textbook publishers or my professors?
    • by cstacy ( 534252 )

      I read their articles (or in the case of books) and used this information to become successful and make money? Do I owe the publishers or authors a royalty?

      You don't owe anything because you didn't make any copy. Reading into your brain is not making a copy. Scanning an article into a computer is making a copy, and there are ("Copyright") laws about doing that. The scanning that OpenAI did was not authorized under the law. It was an illegal copying.

      The owner of the copyright doesn't have to prove damages to get compensated for illegal copying; just prove that a copy was made. For each article, the statutory damage is up to $30,000. There are also additional re

  • by Asynchronously ( 7341348 ) on Sunday October 22, 2023 @11:55AM (#63943671)

    News is no longer news, it's propaganda.

    • You're right. Over the last ten to twenty years the mainstream media has morphed from mostly reliable information into propaganda and government sponsored mis and disinformation and outright censorship. We are probably better off with them not being trained on that. Of course, training them on social media is likely ten times worse.
      • Did it morph though? It really makes you wonder how long have they been lying and getting away with it without the internet and free flow of information to catch them? Was it ever truly unbiased and factual reporting or was it all propaganda from a tightly controlled information broadcast system?
  • I also want some more money for being disabled compared to AI. Pay up, bitches!

  • "Its" data? (Score:5, Insightful)

    by Rosco P. Coltrane ( 209368 ) on Sunday October 22, 2023 @12:09PM (#63943689)

    Reddit, the popular social message board, has met with top generative AI companies about being paid for its data

    At what point did Reddit pay the redditors anything to create "its" data?

    Oh yeah that's right: they gave it all away for free in exchange for karma points...

    • by cstacy ( 534252 )

      Reddit, the popular social message board, has met with top generative AI companies about being paid for its data

      At what point did Reddit pay the redditors anything to create "its" data?

      Oh yeah that's right: they gave it all away for free in exchange for karma points...

      Well, yes they did and it's all perfectly legal, and Reddit probably does have a Copyright claim.

      Each of your posts might have been worth $750 to you, had you not just given it away. When Reddit wins all this money, I wonder if users will sue Reddit. The answer to that question is complicated, but the part about Reddit successfully suing OpenAI is much more straightforward.

  • by DrMrLordX ( 559371 ) on Sunday October 22, 2023 @12:09PM (#63943691)

    That's all the news publishers can really hope to charge a generative AI company for using its content. It's available to every other reader for a few dollars per year. Why should a generative AI cost any more when it reads the same articles?

    Of course, the company may need to acquire a number of subscriptions but that wouldn't be more than a few thousand dollars per year for nearly every paper out there. Or less if they just cut out the middleman and got stories directly from AP and Reuters.

    • by isomer1 ( 749303 )
      No, no, no. That's not how LLMs work.

      First - we already have precedent: the fee for a library copy is already significantly more than a single-user copy for traditional printed media. Those library copies are modeled on a multiple, but limited usage. In the case of LLMs the usage is vastly larger than a traditional library and deserves to be negotiated higher.

      Second - LLMs aren't just 'reading' the material, they are republishing (indirectly) for *every single response they generate*, and here's th
      • If a researcher cites a body of AP articles as a source for a published paper that appears on multiple journals and is read by thousands, does that researcher owe more money to the AP than a regular subscriber?

        • by cstacy ( 534252 )

          If a researcher cites a body of AP articles as a source for a published paper that appears on multiple journals and is read by thousands, does that researcher owe more money to the AP than a regular subscriber?

          OpenAI did not "cite" anything. They made a copy of the material -- they "scanned it in" to their model. The analogy would be that your scientific paper literally copied and included significant (or whole) parts of the original material. And for commercial use, the case is even more strict.

          If you ever published anything at all, presumably you would comprehend this. Is this not taught in the 7th grade when they start asking you to write homework papers and book reports?

      • >LLMs aren't just 'reading' the material, they are republishing (indirectly)
        the "indirectly" is VERY important: the word you are looking for is "transformative".

        • The word I'm looking at is "republish", they don't republish.
        • by tlhIngan ( 30335 )

          >LLMs aren't just 'reading' the material, they are republishing (indirectly)
          the "indirectly" is VERY important: the word you are looking for is "transformative".

          If that is true, then if I get ChatGPT to read in the Linux source code, and output something Linux-like, is that still GPL? Technically it should be, but because ChatGPT "transformed" it, it's no longer under copyright, making the GPL irrelevant.

          Which means if I want a Linux-like operating system without the hassles of the GPL, I can have a LLM

          • I've played around with getting ChatGPT to write C++ code: it won't write a high-quality OS for you, I'm afraid.
            Of course, just because a thousand monkeys manage to type out "The Godfather" doesn't mean you still can avoid copyright. Same as LLMs that generate images. "Accidentally" transforming something into a duplicate of a copyrighted work doesn't remove the copyright. Even a random letter function has the capability to generate copyrighted work.

        • by cstacy ( 534252 )

          >LLMs aren't just 'reading' the material, they are republishing (indirectly)
          the "indirectly" is VERY important: the word you are looking for is "transformative".

          The mere act of scanning in the original material, before it goes into the blender, was most likely a Copyright violation. That would be the subject of enormous statutory damages, full stop.

          Since what comes out of the "AI" does not include literal fragments, the output is probably not a Copyright issue. There is no identifiable correspondence between the original article and the AI output, so that's not "transformative" in the usual sense. Plagiarism is not, strictly speaking, a crime. However, there are oth

      • You really don't understand how LLMs work at all. ChatGPT isn't a library. I can't get an entire copy of any book from it. Your library analogy is bunk. They are indeed reading and learning from the article. They are not republishing it. Do you think LLMs are the Holy Grail of data compression? Do you think OpenAI has compressed their petabyte-scale dataset into a 500GB model? Because it needs to be said again: they are not republishing.
    • by cstacy ( 534252 )

      That's all the news publishers can really hope to charge a generative AI company for using its content. It's available to every other reader for a few dollars per year.

      Why should a generative AI cost any more when it reads the same articles?

      Because that's not how Copyright law works.
      The newspaper readers did not make unauthorized copies. But just because something is available (in a book store, on the web, in a public library) does not mean it may be freely copied.

      OpenAI made illegal copies.

      If one can show that a work (e.g. a newspaper article) was copied without legal authorization, the law automatically provides "statutory" damages. The amount per article is between $750 and $30,000.

      Not just newspaper articles. Every single bit of text that

  • Sites that have content have many methods of limiting who can access it.
    When they use paywalls and adblockerwalls they're just extorting would-be viewers.

    Just like Google and FB pulled out of several markets, and now Twitter is talking
    about pulling out of the EU because of the DSA... the companies running these scrapers,
    be they search engines or so-called AI have the same choice to make.

    If Big Media doesn't want its content viewership to go up, they are doing exactly
    the right thing. Search engines didn't go awa

  • Newspapers to get law passed so that a tax is placed on all comments using letters. You didn't think they'd just let you keep using them for free did you?

    Seriously, this has to be some of the dumbest shit ever.

    • I want everyone who made me feel bad executed also. Dumb enough for ya?

    • Newspapers to get law passed so that a tax is placed on all comments using letters. You didn't think they'd just let you keep using them for free did you?

      Seriously, this has to be some of the dumbest shit ever.

      I do know that if the US decides that AI generation must pay for the content it uses, there are other countries like Russia, Iran, and China who will provide free AI content in its place. What could go wrong?

      Consequences, consequences,

  • These are the same newspapers that did the Andrew Yang media blackout, and not even one election cycle later they are concerned about profit. Look at capitalism making everyone scramble and claw for their share of the pie; it is almost embarrassing! Well, no pity for you, newspapers https://www.genolve.com/design... [genolve.com]
  • Once you put something in public, it's not really yours any more.
    Yes, no one else can claim it is his: we have copyright for that.
    But if I use your public text for my own amusement, or use it to inspire myself for an opera or to train my AI engine, well, it's not your business. Also because you cannot prove my AI thing is actually using your text.

    • We put this article out for the entire world to read then also encouraged consumers to share it with everyone they know so we can get more exposure and more people will consume our article. Now we're pissed because the article is being consumed.
  • IANAL, but I was taught many years ago that information cannot be copyrighted. I believe it's still true.

    From the US government copyright office [copyright.gov]:
    Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed.

    So, unless the AI is spitting out text verbatim, there's no copyright infringement.
  • If commercial LLM operations won't turn a profit, why should the people whose work is the only reason those LLMs have even *potential* value be the ones who go unpaid, when the value of their work is known and the value of operating a commercial LLM is speculative?

    I can easily believe software code or artworks created by those with decades of experience to achieve continually improved quality that gets used as data for LLMs has a value that some idiot in a suit talking buzzwords at press conferences d

  • OpenAI doesn't use propaganda in its database.

  • For generating news, providing interviews, etc.

