Newspapers Want Payment for Articles Used to Power ChatGPT (msn.com)
An anonymous reader shared this report from the Washington Post:
For years, tech companies like OpenAI have freely used news stories to build data sets that teach their machines how to recognize and respond fluently to human queries about the world. But as the quest to develop cutting-edge AI models has grown increasingly frenzied, newspaper publishers and other data owners are demanding a share of the potentially massive market for generative AI, which is projected to reach $1.3 trillion by 2032, according to Bloomberg Intelligence.
Since August, at least 535 news organizations — including the New York Times, Reuters and The Washington Post — have installed a blocker that prevents their content from being collected and used to train ChatGPT. Now, discussions are focused on paying publishers so the chatbot can surface links to individual news stories in its responses, a development that would benefit the newspapers in two ways: by providing direct payment and by potentially increasing traffic to their websites. In July, OpenAI cut a deal to license content from the Associated Press as training data for its AI models. The current talks also have addressed that idea, according to two people familiar with the talks who spoke on the condition of anonymity to discuss sensitive matters, but have concentrated more on showing stories in ChatGPT responses.
Other sources of useful data are also looking for leverage. Reddit, the popular social message board, has met with top generative AI companies about being paid for its data, according to a person familiar with the matter, speaking on the condition of anonymity to discuss private negotiations. If a deal can't be reached, Reddit is considering blocking search crawlers from Google and Bing, which would prevent the forum from being discovered in searches and reduce the number of visitors to the site. But the company believes the trade-off would be worth it, the person said, adding: "Reddit can survive without search."
"The moves mark a growing sense of urgency and uncertainty about who profits from online information," the article argues. "With generative AI poised to transform how users interact with the internet, many publishers and other companies see fair payment for their data as an existential issue."
They also cite James Grimmelmann, a professor of digital and information law at Cornell University, who suggests OpenAI's decision to negotiate "may reflect a desire to strike deals before courts have a chance to weigh in on whether tech companies have a clear legal obligation to license — and pay for — content."
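For reference, the "blocker" the summary mentions is typically a robots.txt rule aimed at OpenAI's documented GPTBot crawler, which OpenAI announced in August 2023 — matching the "Since August" timeline above. A minimal site-wide example:

```
# robots.txt — ask OpenAI's GPTBot crawler not to collect this site
User-agent: GPTBot
Disallow: /
```

Note this relies on the crawler honoring the Robots Exclusion Protocol; it is a request, not a technical enforcement mechanism.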
Me too (Score:4)
Re:Me too (Score:5, Insightful)
My children study things off the internet and incorporate it into their knowledge stored in their brains.
Do they owe you a chunk of their future paycheck or is this restricted to inorganic computation only?
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
EVERY business model with employees is based off the work of others. That's how businesses work, in general.
Re: (Score:3)
Re: (Score:2)
They aren't? They are synthesizing the information into connections within a neural network and then using that network in the future.
It's not identical, of course, but it's similar enough that appeals to treating them differently need to be justified (IMHO).
Re: (Score:2)
“If I have seen further, it is by standing on the shoulders of giants.”
https://en.wikipedia.org/wiki/... [wikipedia.org]
Re: (Score:2)
Are your children building a business model based on the work of others?
In the simplest sense, isn't that what everyone does in some way or form?
Re: (Score:2)
Isn't everyone's business model based off of the work of (at least one of) Newton, Watt, Kelvin, Otto, von Neumann, Turing, Alexander Fleming or Louis Pasteur?
Like, isn't this the entire premise of civilization that everything we do, business and otherwise, is based on the work that other people did before. It's not like every generation is reinventing the wheel.
Re: (Score:3)
Re:Me too (Score:4, Insightful)
Tell your boss to add footers to papers published by their researchers :
"We don't consider the copying of this text to train language models fair use, nor allowed when not approved explicitly by license. Copying is prima facie required in training and without a license constitutes statutory infringement at the very least, regardless of whether the model and output is derivative. This is just a reminder and remains valid for all content which does not contain this text, except when explicitly licensed or in the public domain."
Every big website should do this right the fuck now. Give the AI weenies no quarter.
Re: (Score:2)
Re: (Score:3)
Re: (Score:2)
"We don't consider the copying of this text to train language model fair use
I mean, that wouldn't be the boss' choice - that is decided in the courts, and whether it is a fair use or not is still up in the air. (This would also age like milk if it was ruled to be a fair use).
Re: (Score:2)
"We don't consider ... this ... fair use."
Fair use is a term defined in law so why does what *you* consider is or isn't fair use matter? Please explain why the copying that occurs when an AI reads something to train its neural network is different to the copying that occurs when a human reads something to train his?
Re: (Score:2)
Re: (Score:3)
Re: (Score:2)
If we see blocks of text output of AI bots interspersed with mentions of car warranties and viagra, then it probably read the ads.
Re:Me too (Score:5, Interesting)
"If it ever gets better, it's because of my work, and my colleagues." It's because its owners improved it.
ChatGPT improves nothing in terms of content. It simply repackages what's already known with its predictive algorithms, and is capable of adding nothing to our store of knowledge - because it knows nothing except what words statistically fall after one another. It produces as much nonsense as it produces shallow analyses, which it essentially scrapes from published work. Without any attribution, I might add.
Moreover, in what other industry are resources free? Especially resources that were scarcely fairly remunerated in the first place (I assume you know the scam of academic publishing).
Do you publish your work? Who pays you to work in your "niche field"?
"Where do I get my cheque?" Why are you entitled? You didn't improve it.
I've published a good number of books and articles. With regard to the books, I tend to get 8 percent of the net. Articles are always free, unless licensed for reproduction elsewhere. I am not complaining about the poor pay (as an academic, I also have a professor's salary - and no - my university does not provide merit pay for publishing) - I publish because it's my passion.
Re: (Score:2)
ChatGPT improves nothing in terms of content. It simply repackages what's already known with its predictive algorithms, and is capable of adding nothing to our store of knowledge
The point of an LLM isn't what it knows; it's what it groks. These systems are capable of generating new knowledge by applying that understanding. While one can quibble over how useful or intelligent these contraptions are, the absolutist statement above is flat wrong.
Re: (Score:3)
These systems are capable of generating new knowledge
I'm curious, which new knowledge did it generate? I think it never happened but I won't mind some source.
Re: (Score:3)
Re: (Score:2)
Beware the FUTON effect (Score:5, Insightful)
Re: (Score:2)
Sure but science (especially the subset that is open access) gets this privilege because it is funded via academia, grants and public spending subsidies. Science that is privately owned and for profit is nowhere near as "open".
Journalism is in a precarious position as it is an important, nay vital part of the information pool the world uses but most are constrained by funding since (at least in the US and many parts of the world) these are private entities and they require revenue to operate and they don't
Re: (Score:3)
Re: (Score:2)
Bro we are on Slashdot, we should know the nuance here. This is the difference between "free as in speech" and "free as in beer".
Just because in this case the reporting is "open" for readers like you and me does not mean it's open for other for-profit entities to take it and re-package and re-sell it with missing attribution and no flow of funds back to keep that root source of information going. Do Google and OpenAI have on the ground reporters in Gaza right now getting the stories? No? Well then why shou
Already paid (Score:5, Insightful)
The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.
Re: (Score:3)
The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.
Not in the least. Newspapers own the copyright on the articles they publish and they don't have to put anything in their terms of service to enforce it. When you visit a newspaper website you pay for access to the articles, either by (theoretically) watching ads or by paying for a subscription. That access does not include permission to copy the articles and republish them. A search engine has some leeway under fair-use exceptions that allows them to quote portions of text. They do not have the legal r
Re: (Score:3)
Re:Already paid (Score:4, Insightful)
Reading an article that is openly published is not a copyright violation. If I read a newspaper in a cafe while enjoying a coffee I'm not committing a copyright violation.
Because you are not "copying" it. The copy that your eyes and brain are making is not considered a "copy". If you scan that newspaper article into your laptop, that would be a Copyright violation. Even if you don't further "publish" it. The act of copying is a crime, that's a criminal violation, you are a criminal.
And that's what the "AI" trainers are doing. A copy is necessarily made -- the article is scanned in.
The fact that they proceed to put the copy into some kind of blender and pour it out later doesn't matter. Except insofar as copyrightable fragments can be identified. That would be the act of making additional copies, which would be additional crimes, each time it was output to a user. But ChatGPT doesn't output recognizable fragments. It's the initial copy made for training that is the crime.
The "AI" trainers are trying to claim certain exceptions to the law, based on the nature of the output. But they don't really hold up. You can't get around the fact that the training copy was unauthorized.
The "AI" companies are in for some trouble; but they have pulled this off very quickly already, and have unfathomable amounts of money and power behind them. It will be very difficult for society to achieve the obviously correct legal (and moral) outcome.
The outcome is that every single bit of text that went into the AI training represents a Copyright violation of at least several thousand dollars. So for the billions of inputs, ChatGPT owes authors, collectively, perhaps trillions of dollars. (We're just talking statutory damages. If the authors can show other damages, the penalties go up.) That's civil. There are also potential criminal issues.
Do you think the courts and governments and plaintiffs will figure out what to do with such a mess? And suppose that a few newspapers can make a settlement and licensing arrangement. Where does that leave the publishers of every book and article? And at the end of the day, will the authors ever be compensated, too, for this massive novel use of their work?
When you do "disruptive" illegal things, do them as quickly and massively as possible. By the time society and law and government catch up with you, your power position will be overwhelming and everyone will accept the new normal. You won't pay for what you have done to people. There might be a tiny penalty, but in general the law will be adjusted for you. Your profits will be spectacular. The world will thank you for the ass raping and ask you for another, please.
Re: Already paid (Score:2)
Sorry but using your logic, everything you read on the Internet is a copyright violation. After all, the article needs to be copied into memory in order to display it on your screen. Copyright gives the owner certain privileges. However, it does not allow the holder to arbitrarily expand that list of privileges to include whatever they dream up.
Re: (Score:2)
Re: (Score:2)
Sorry but using your logic, everything you read on the Internet is a copyright violation.
As much as I agree with your position, the GP is right.
Any electronic copy is technically a violation of the rightsholder's copyright(s) under the law. It's for this reason that all software licenses (especially proprietary ones) come with language that says something to the effect of "temporary copies made for the purposes of running / using the content in compliance with the provisions of this license are permitted." If they didn't, then simply installing or running the software would allow the rightsh
Re: (Score:2)
Because you are not "copying" it. The copy that your eyes and brain are making is not considered a "copy".
The "AI" trainers are trying to claim certain exceptions to the law, based on the nature of the output. But they don't really hold up. You can't get around the fact that the training copy was unauthorized.
What do you believe is the relevant legal difference? Where in copyright law are human beings differentiated from machine learning algorithms?
Re: (Score:2)
Scanning an article is not a copyright violation. The violation comes with what you do with your copy. If it's just to store privately, you're fine. If you plan on distributing it (commercially or not) then you're in trouble.
Re: (Score:2)
There are issues about the output of these AI tools as well.
There was a case in the UK a few years back where a photographer won a copyright infringement claim against someone for making a very similar photograph to his. It was of a red London bus crossing a bridge. The other person went to the same spot and used the same composition with a similar red bus. The court found that they had copied the essential elements of the photo.
AI image generators that produce output which are very similar to work they wer
Re: (Score:2)
Fair use and reading a published article have little to do with each other. When reading, if you aren't making a copy yourself, YOU are not committing copyright violation. Unless it's one of those "contagion" laws where if you are in the chain anywhere you are in violation, but I don't think so.
Re: (Score:2)
Not in the least. Newspapers own the copyright on the articles they publish and they don't have to put anything in their terms of service to enforce it. When you visit a newspaper website you pay for access to the articles, either by (theoretically) watching ads or by paying for a subscription. That access does not include permission to copy the articles and republish them.
It is not necessary to seek permission to remember facts and benefit from information presented in an article. Neither is there such a thing as copyright on facts.
They do not have the legal right to copy and republish the entire article.
They have all the right in the world to benefit from and produce transformative works based on information learned from an article.
Newspapers have been selling access to their historical records for decades. This is nothing new to them or to the courts. If the AI engines want to build databases off published articles en-mass they are going to have to pay up.
If I pay a know-it-all to sit down and answer my questions from memory they have no legal requirement to compensate their original sources or to get their permission to divulge information learned from articles the kn
Re: (Score:2)
It is not necessary to seek permission to remember facts and benefit from information presented in an article.
Correct. Learning is not copying.
Neither is there such a thing as copyright on facts.
Correct. Copyright applies to text, not ideas. Let's consider a situation where you want to reference a dictionary definition. Dictionary definitions are copyright protected, but fair use generally allows quoting short sections of text. Let's assume the definition is three paragraphs long, so you aren't sure you can copy it verbatim. If you restate the definition you are not copying, but creating a new product.
They have all the right in the world to benefit from and produce transformative works based on information learned from an article.
Correct. Learning is not copying.
If I pay a know-it-all to sit down and answer my questions from memory they have no legal requirement to compensate their original sources or to get their permission to divulge information learned from articles the know-it-all has previously read.
Correct. Learning is not copying.
Re: (Score:2)
"Learning" in the sense of Machine Learning is transformative, not derivative.
In fact, as others have stated, GPT's main value is its ability to "understand" things, rather than its ability to recall facts or quotes from articles or books.
Re: (Score:2)
GPT's main value is its ability to "understand" things, rather than its ability to recall facts or quotes from articles or books.
Those who believe a large language model is capable of understanding are mistaken. Someone shared this link [stephenwolfram.com] to an article by Stephen Wolfram last week. It's a good quick overview of how large language models work.
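As a toy illustration of the "what words statistically fall after one another" idea debated above, here is a minimal bigram sketch in Python. This is a deliberately crude stand-in, not how GPT works (GPT uses a transformer network over tokens, not word-pair counts), but it captures the bare "predict the next word from observed frequencies" notion; the tiny corpus is made up for the example.

```python
# Toy bigram "next word" predictor: count which word follows which,
# then predict the statistically most common successor.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# For each word, count how often each other word immediately follows it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_likely_next(word):
    """Return the most frequent word observed to follow `word`."""
    return follows[word].most_common(1)[0][0]

# "the" is followed by cat (2x), mat (1x), fish (1x) -> predicts "cat"
print(most_likely_next("the"))
```

A real language model replaces these raw counts with learned probabilities conditioned on a long context window, which is where the "understanding" debate comes in.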
Re: (Score:2)
The news has already been paid for by the ads that were served while the AI was training on the content. The news sites should have included restrictions against AI in their TOS but in the end it's all water under the bridge, the cat's out of the hat.
Very true.
The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?
It's just algorithmically generated quasi content that reads like a 9th grader trying to come up with a 1000 word essay by padding out 100 words with 90 percent meaningless puff.
I gave a try at this incredible and disruptive technology that was going to eliminate almost everyone's jerbs, and after laughing a bit at the inanity, it w
Re: (Score:2)
I love how every time there are these articles about ChatGPT someone always posts a snarky reply about how THEY tried it and it was nowhere near good enough to be useful (and gosh, just so slow!),
Meanwhile, the next generation of workers are already learning how to use it effectively and improving their productivity.
Re: (Score:2)
I love how every time there are these articles about ChatGPT someone always posts a snarky reply about how THEY tried it and it was nowhere near good enough to be useful (and gosh, just so slow!),
Some folks get a bit triggered when their ox gets gored, eh?
Meanwhile, the next generation of workers are already learning how to use it effectively and improving their productivity.
Okay, you being an expert on ChatGPT, I'll ask you to educate me in the error of my ways. You should easily be able to provide us all with their successes, and references to them. Thanks for letting us know. None of us want to be saying the wrong things, so it is your task to show us its success.
Re: (Score:2)
How's these apples: you're both talking nonsense.
ChatGPT *IS* producing SOME high value work. It's not perfect, some of it is absolute gibberish, but it's a technology that's barely a year old building on a theory that's barely 6 years old. Things are moving extremely fast.
But precisely because the technology is producing high value work is why it's a net loss for humanity. It's high value, but it's not human value. The internet is clogging up with machine generated drivel that's well written enough that it's replacing
Re: (Score:2)
FWIW, I'm already using ChatGPT to delay hiring another person in my solo practice. When I do hire, I will expect the person to use ChatGPT to handle the noise level work like simple coding and business emails.
----
The future arrives too soon and in the wrong order. Capitalism has been shedding workers through technology, financial engineering, and exploiting cost differentials around the globe. LLM systems are only accelerating the rate at which capitalism sheds workers.
The natural outcome is that we will ha
Re: (Score:2)
Here are four ways I use ChatGPT successfully.
1. Summarizing articles: I don't have time to read. If the some reason interesting, I move on. If it is, I go to the original material. Saves me a lot of time.
2. Generating a skeleton for writing: There is an awful lot of boilerplate in writing and chat GPT can crank it out like nobody's business and I can tell what to keep, what to toss, and what to change. ChatGPT is more consistent than I am through a body of work but it also lets me capture my thoughts, and
Re: (Score:2)
"If the some reason interesting, I move on."
Are you using a speech-to-text tool?
Re: (Score:2)
The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?
They don't want "credit". They want to be paid each time their material is ingested ("copied" is the legal term) into the "AI". (They don't care how it is used after that, unless recognizable literal fragments are going to be output - which would be additional "copying".)
The "AI" company illegally made copies of the newspapers. The newspapers want their statutory (or other) damages as specified by the Copyright law. Going foward after that, the newspapers probably will offer the "AI" company a license for l
Re: (Score:2)
The problem with News outlets wanting AI generated content paid for is that the AI content isn't worth shit. Do the News outlets want credit and blame for that?
They don't want "credit". They want to be paid each time their material is ingested ("copied" is the legal term) into the "AI".
You know what I mean. Anyhow, as I noted before, I'm certain that other groups will provide their own news for free, so the US news outlets will be able to breathe easily.
Think maybe we should re-visit the old idea that posting links on the internet should be made illegal?
Re: (Score:2)
>It's just algorithmically generated quasi content that reads like a 9th
>grader trying to come up with a 1000 word essay by padding out
>100 words with 90 percent meaningless puff.
Wait, I thought that was the New Yorker.
Oh, I see now. There's about a two grade level difference . . .
hawk
Re: (Score:2)
>It's just algorithmically generated quasi content that reads like a 9th
>grader trying to come up with a 1000 word essay by padding out
>100 words with 90 percent meaningless puff.
Wait, I thought that was the New Yorker.
Oh, I see now. There's about a two grade level difference . . .
hawk
Boom! 8^)
Re: (Score:2)
Re: (Score:2)
You think the cats are out of the hat? Just wait till they get to the VOOM!
How that VOOM made us ZOOM...
What if (Score:5, Insightful)
Re: (Score:2)
I read their articles (or in the case of books) and used this information to become successful and make money? Do I owe the publishers or authors a royalty?
You don't owe anything because you didn't make any copy. Reading into your brain is not making a copy. Scanning an article into a computer is making a copy, and there are ("Copyright") laws about doing that. The scanning that OpenAI did was not authorized under the law. It was illegal copying.
The owner of the copyright doesn't have to prove damages to get compensated for illegal copying; just prove that a copy was made. For each article, the statutory damage is up to $30,000. There are also additional re
Last gasps of a dying industry.... (Score:3, Funny)
News is no longer news, it's propaganda.
Re: (Score:3)
Re: (Score:2)
Re: (Score:2, Flamebait)
This exists but people don't actually want that (and neither do you probably).
NPR, PBS, Reuters, AP all put out pretty straight news but "All Things Considered" doesn't get the ratings that Tucker or Hannity or Maddow or whatever wag and crank is going off on today.
Re: (Score:2)
This exists but people don't actually want that (and neither do you probably).
NPR, PBS, Reuters, AP all put out pretty straight news
Those are heavily biased sources, but you don't see that because it agrees with your biases. Because you were raised on those sources and think it's normal.
Re: (Score:2)
Just saying it does not make it true and you don't know what I was raised on.
Look at NPR's front page and give me an example of a "heavily biased" story otherwise we are just arguing over feelings here.
Re: (Score:3)
This means fucking nothing. Everybody on earth has biases, it's part of being human, but all those outlets either minimize bias or are sure to represent both ends of an issue. Have you actually listened to NPR ever? They are always going out of their way to have both sides of a story, having representatives of both parties on for a story, or both opinions, and the news breaks are dry and to the point.
You are advocating for a thing that doesn't actually exist, never actually existed and nobody actually want
Re: (Score:2)
Re: (Score:2)
It's now standard practice for any smart person to consume multiple streams of information reporting on the same events
Nobody, and I mean nobody is disputing this is good advice, it's practically tautological. Have I made an argument against this? This is classic Motte and Bailey [wikipedia.org]
I don't think there has ever been a time where this has not been true. Can you give an example of a time in history when there could be a singular source on truth in either new, or events or really anything?
The world you are trying to hold on to is a fiction.
I've just explained reality, you are assuming I have some "world" to hold onto. You are the one here arguing for an ideal that doesn't curren
Re: (Score:2)
Re: (Score:2)
That's just the thing, you addressed nothing.
You can take a look at the charts with progressive buzzword usage across major mainstream news.
Share with the class, where are these charts, what do they reference. You are hanging this entire premise off "buzzwords"?
This is sounding all "vibes" based. My vibes says conservative media literacy is what jumped the shark and anything not dripping with liberal disdain is immediately thrown out because your brains have been cooked by decades of talk radio style sensationalism. But, that's my feelings.
Re: (Score:2)
Having worked in the news industry (radio in Washington, D.C.) in the 1970s, I would say that the bias is much worse in this era. This is partly due to political thought and institutional indoctrination, but also due to a change in the ethics and concepts of what "news reporting" is supposed to be.
Re: (Score:2)
I've recently unsubscribed from NPR because they started pandering rather "trendy" bs I just couldn't digest.
Re: (Score:2)
Yet no examples given...
I Want Money For Using My Data Data. (Score:2)
I also want some more money for being disabled compared to AI. Pay up, bitches!
Re: (Score:2)
The data about my data.
"Its" data? (Score:5, Insightful)
Reddit, the popular social message board, has met with top generative AI companies about being paid for its data
At what point did Reddit pay the redditors anything to create "its" data?
Oh yeah that's right: they gave it all away for free in exchange for karma points...
Re: (Score:2)
Reddit, the popular social message board, has met with top generative AI companies about being paid for its data
At what point did Reddit pay the redditors anything to create "its" data?
Oh yeah that's right: they gave it all away for free in exchange for karma points...
Well, yes they did and it's all perfectly legal, and Reddit probably does have a Copyright claim.
Each of your posts might have been worth $750 to you, had you not just given it away. When Reddit wins all this money, I wonder if users will sue Reddit. The answer to that question is complicated, but the part about Reddit successfully suing OpenAI is much more straightforward.
The cost of one subscription (Score:4, Informative)
That's all the news publishers can really hope to charge a generative AI company for using their content. It's available to every other reader for a few dollars per year. Why should a generative AI cost any more when it reads the same articles?
Of course, the company may need to acquire a number of subscriptions but that wouldn't be more than a few thousand dollars per year for nearly every paper out there. Or less if they just cut out the middleman and got stories directly from AP and Reuters.
Re: (Score:2)
First - we already have precedent: the fee for a library copy is already significantly more than a single-user copy for traditional printed media. Those library copies are modeled on a multiple, but limited usage. In the case of LLMs the usage is vastly larger than a traditional library and deserves to be negotiated higher.
Second - LLMs aren't just 'reading' the material, they are republishing (indirectly) for *every single response they generate*, and here's th
Re: (Score:2)
If a researcher cites a body of AP articles as a source for a published paper that appears on multiple journals and is read by thousands, does that researcher owe more money to the AP than a regular subscriber?
Re: (Score:2)
If a researcher cites a body of AP articles as a source for a published paper that appears on multiple journals and is read by thousands, does that researcher owe more money to the AP than a regular subscriber?
OpenAI did not "cite" anything. They made a copy of the material -- they "scanned it in" to their model. The analogy would be that your scientific paper literally copied and included significant (or whole) parts of the original material. And for commercial use, the case is even more strict.
If you had ever published anything at all, presumably you would comprehend this. Is this not taught in the 7th grade, when they start asking you to write homework papers and book reports?
Re: (Score:2)
>LLMs aren't just 'reading' the material, they are republishing (indirectly)
the "indirectly" is VERY important: the word you are looking for is "transformative".
Re: (Score:2)
Re: (Score:2)
If that is true, then if I get ChatGPT to read in the Linux source code and output something Linux-like, is that still GPL? Technically it should be, but because ChatGPT "transformed" it, it's no longer under copyright, making the GPL irrelevant.
Which means if I want a Linux-like operating system without the hassles of the GPL, I can have an LLM
Re: (Score:2)
I've played around with getting ChatGPT to write C++ code: it won't write a high-quality OS for you, I'm afraid.
Of course, just because a thousand monkeys manage to type out "The Godfather" doesn't mean you can avoid copyright. Same goes for LLMs that generate images. "Accidentally" transforming something into a duplicate of a copyrighted work doesn't remove the copyright. Even a random letter generator is capable of producing copyrighted work.
Re: (Score:2)
LLMs aren't just 'reading' the material, they are republishing (indirectly)
the "indirectly" is VERY important: the word you are looking for is "transformative".
The mere act of scanning in the original material, before it goes into the blender, was most likely a Copyright violation. That would be the subject of enormous statutory damages, full stop.
Since what comes out of the "AI" does not include literal fragments, the output is probably not a Copyright issue. There is no identifiable correspondence between the original article and the AI output, so that's not "transformative" in the usual sense. Plagiarism is not, strictly speaking, a crime. However, there are oth
Re: (Score:3)
Re: (Score:2)
That's all the news publishers can really hope to charge a generative AI company for using its content. It's available to every other reader for a few dollars per year.
Why should a generative AI cost any more when it reads the same articles?
Because that's not how Copyright law works.
The newspaper readers did not make unauthorized copies. But just because something is available (in a book store, on the web, in a public library) does not mean it may be freely copied.
OpenAI made illegal copies.
If one can show that a work (e.g. a newspaper article) was copied without legal authorization, the law automatically provides "statutory" damages. The amount per article is between $750 and $30,000.
Not just newspaper articles. Every single bit of text that
EXTORTION for LINKING is done. (Score:2)
Sites that have content have many methods of limiting who can access it.
When they use paywalls and adblocker-walls, they're just extorting would-be viewers.
Just like Google and FB pulled out of several markets, and now Twitter is talking
about pulling out of the EU because of the DSA... the companies running these scrapers,
be they search engines or so-called AI, have the same choice to make.
If Big Media doesn't want its content viewership to go up, they are doing exactly
the right thing. Search engines didn't go awa
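The "methods of limiting who can access it" mentioned above include the robots.txt blocker the summary refers to: publishers list AI crawlers by their published user-agent names (GPTBot is OpenAI's, CCBot is Common Crawl's) and disallow them while leaving search engines alone. A minimal sketch using Python's standard-library robots.txt parser; the site URL is hypothetical, the crawler names are the real published ones:

```python
# Sketch: a robots.txt that blocks AI training crawlers (GPTBot, CCBot)
# while allowing everyone else, and a check that a well-behaved crawler
# would perform. The URL/domain is hypothetical.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# OpenAI's crawler is told to stay out; a search crawler is not.
print(parser.can_fetch("GPTBot", "https://example-paper.com/story"))     # False
print(parser.can_fetch("Googlebot", "https://example-paper.com/story"))  # True
```

Note this is purely advisory: compliance is voluntary on the crawler's part, which is exactly why the thread turns to paywalls and licensing deals as the enforceable alternatives.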
In other news... (Score:2)
Newspapers to get law passed so that a tax is placed on all comments using letters. You didn't think they'd just let you keep using them for free did you?
Seriously, this has to be some of the dumbest shit ever.
Re: (Score:2)
I want everyone who made me feel bad executed also. Dumb enough for ya?
Re: (Score:2)
Newspapers to get law passed so that a tax is placed on all comments using letters. You didn't think they'd just let you keep using them for free did you?
Seriously, this has to be some of the dumbest shit ever.
I do know that if the US decides that AI generators must pay for the content they use, other countries like Russia, Iran, and China will provide free AI content in its place. What could go wrong?
Consequences, consequences,
Andrew Yang not so silly now (Score:2)
Wot? (Score:2)
Once you put something in public, it's not really yours any more.
Yes, no one else can claim it as their own: we have copyright for that.
But if I use your public text for my own amusement, or use it to inspire myself for an opera, or to train my AI engine, well, it's not your business. Besides, you cannot prove my AI thing is actually using your text.
Re: (Score:2)
No legal basis (Score:2)
From the US Copyright Office [copyright.gov]
Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed.
So, unless the AI is spitting out text verbatim, there's no copyright infringement.
A question for those "long" on LLMs (Score:2)
If commercial LLM operations won't turn a profit, why should the people whose work is the only reason those LLMs have even *potential* value be the ones who go unpaid, when their work's value is known and the business of operating a commercial LLM is speculative?
I can easily believe that software code or artwork created by people with decades of experience, continually refined for quality, has a value as LLM training data that some idiot in a suit talking buzzwords at press conferences d
It's not PropagandaGPT (Score:2)
OpenAI doesn't use propaganda in its database.
Yes, but when does news media pay us (Score:2)