How Do You Define 'Open Source AI'? (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: The Open Source Initiative (OSI) recently unveiled its latest draft definition for "open source AI," aiming to clarify the ambiguous use of the term in the fast-moving field. The move comes as some companies like Meta release trained AI language model weights and code with usage restrictions while using the "open source" label. This has sparked intense debates among free-software advocates about what truly constitutes "open source" in the context of AI. For instance, Meta's Llama 3 model, while freely available, doesn't meet the traditional open source criteria as defined by the OSI for software because it imposes license restrictions on usage due to company size or what type of content is produced with the model. The AI image generator Flux is another "open" model that is not truly open source. Because of this type of ambiguity, we've typically described AI models that include code or weights with restrictions or lack accompanying training data with alternative terms like "open-weights" or "source-available."
To address the issue formally, the OSI -- which is well-known for its advocacy for open software standards -- has assembled a group of about 70 participants, including researchers, lawyers, policymakers, and activists. Representatives from major tech companies like Meta, Google, and Amazon also joined the effort. The group's current draft (version 0.0.9) definition of open source AI emphasizes "four fundamental freedoms" reminiscent of those defining free software: giving users of the AI system permission to use it for any purpose without permission, study how it works, modify it for any purpose, and share with or without modifications. [...] OSI's project timeline indicates that a stable version of the "open source AI" definition is expected to be announced in October at the All Things Open 2024 event in Raleigh, North Carolina.
Define? They'll probably be defined as "illegal" (Score:3, Interesting)
Re: (Score:1)
Re: (Score:3)
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Why so defensive? The only person talking politics here is you.
Re: (Score:2, Flamebait)
Re: (Score:2)
If AI is trained with public data and conservatives are in the minority then yes it stands to reason the AI would lean left.
Just tell us all you never thought of what would happen when your partisan enemies got their hands on open source LLM code? Did you assume the AI/LLM landscape would stay "progressive" forever and I just ruined your day with politics? I don't think so, "Comrade".
What stake do you think I have in the supposed political leanings of software?
Re: (Score:2)
Why so defensive? The only person talking politics here is you.
Reality's Spokesperson, apparently..
It's a dream job, if you can get it..
Re: (Score:2)
"My guess is that once any open source LLMs start to reach political parity..."
"Open source LLMs" do not strive to "reach political parity" nor is there any reason to believe they will or they won't. What is "left" and "right" are arbitrary and the fact that an LLM may be "pretty far left", if that were even true, is a commentary of what is considered left or right, not a reflection of any failure in creating the model. When the right wing is defined by hatred you should not expect AI to reflect that hatr
Re: (Score:2)
nor is there any reason to believe they will or they won't.
In the article [realclearscience.com] I linked, they use some fairly deterministic methods to draw their conclusions. Do you disagree with them?
And yet we employ an entire FBI, that sane people realize is essential, literally to draw those very conclusions.
Uhh, no. The FBI releases the data. They draw very few conclusions from the data otherwise we'd have racial profiling on our hands since 13% of the population is committing 53% of the homicides and 20%-30% of property crime. Every time an LLM ingests that data you can bet there is some wokester right there handwaving saying "Wait wait wait, it's more complex and nuanced. Let me override a
Re: (Score:1)
Re: (Score:2)
Only when he was a democrat
Re: (Score:3)
LLMs don't have a bias, they generate output based on how you phrase your prompt. If you make it sound like the same sort of talking points that right-wingers are constantly spouting all over the 'net, you'll get a response that sounds exactly like the sort of rebuttals you'll see coming from the left. You have to make your prompts sound more echo chamber-y, and then it absolutely will respond the way you want it to. For example:
What are some of the criticisms against [liberal policy]?
What are some of th
Re: (Score:2)
Re: (Score:2)
Yes, their methodology is flawed because an LLM doesn't actually think, no matter how much some people try to anthropomorphize it. If you don't explicitly specify what type of output you're expecting, it's either going to look for patterns that emerged most frequently in its training data or use an RNG.
First draft of a comedy song I asked ChatGPT to write about a hypothetical alternate reality where the YouTube celebrity MrBeast was actually a Wildebeest turned out absolutely terrible. It was entirely abo
Re: (Score:2)
Re: (Score:2)
LLMs aren't "pretty far left"; they are completely neutral and just reflect the status quo: truth drowns out garbage, most of the time.
I say most of the time, because the LLMs don't understand anything they are saying. It's like asking a parrot a question when it was only trained to say things its human owner said, without the context or understanding of it. This is why parrots and corvids sound like a "record": they are imitating the sounds, not saying words.
An Open Source AI, would need th
Re: (Score:2)
LLMs aren't "pretty far left"; they are completely neutral and just reflect the status quo: truth drowns out garbage, most of the time.
Well, the academic folks who've looked [plos.org] at the question don't agree with you at all. They've been able to deterministically find exactly where these LLMs sit on many different political spectrums. They ask political questions and the LLMs answer. How they come up with the answers and why (training vs programming) isn't really as important as the outcomes. The abstract of that PLOS ONE journal article says "When probed with questions/statements with political connotations, most conversational LLMs tend to gen
Free artificial moron with source? (Score:3)
For the models: Clearly the only "open" version is when the training data is included.
Source code, source data and license (Score:3, Insightful)
Re: (Score:2)
training data
You can't slap a GPL or other open-source or whatever other license on something that isn't yours.
Re: (Score:1)
Re: (Score:2)
wrong question (Score:2)
It's like debating what the definition of open source nuclear weapons is. Open source is about who benefits and in what ways; it's just not that interesting a question to ask about AI.
Open source democratizes software, but that is predicated on the common person having the ability to use it and improve it. You may be able to compile gcc and adapt it to your variant of RISC-V, and that would be quite an accomplishment, but that's a long way from training your own version of full blown LLMs. Not many people
Re: (Score:2)
They can evolve on their own planet and form their own legal framework. They don't get to use ours.
Definition? (Score:2)
How Do You Define 'Open Source AI'? (Score:5, Informative)
How Do You Define 'Open Source AI'?
Pretty much as being everything that OpenAI [openai.com] is not.
How Do You Define 'Open Source AI'? (Score:2)
It's a no-go since the beginning (Score:2)
"The group's current draft (version 0.0.9) definition of open source AI emphasizes "four fundamental freedoms" reminiscent of those defining free software: giving users of the AI system permission to use it for any purpose without permission, study how it works, modify it for any purpose, and share with or without modifications."
Models which give usage permission for any purpose or permit modification for any purpose will be banned by law anyway, and technical barriers to prevent these practices will be m
Re: (Score:2)
Models which give usage permission for any purpose or permit modification for any purpose will be banned by law anyway,
Please change "any purpose" to "all purposes" in the above comment. English isn't my first language.
Wasted effort (Score:2)
To address the issue formally, the OSI -- which is well-known for its advocacy for open software standards -- has assembled a group of about 70 participants, including researchers, lawyers, policymakers, and activists. Representatives from major tech companies like Meta, Google, and Amazon also joined the effort
To address the issue formally, instead of assembling 70 AI experts, scientists, and tech industry representatives who believe AI is the way forward, just assemble 3 AI chatbots and ask them to generate the conceptual framework and the documentation. Then choose one at random, feed it all three bot-docs, and ask it to create a combined executive summary of all 3.
This can be done by 1 person in about 30 minutes and the other 69 experts can be rightsized.
You can rebuild it yourself (Score:2)
It's not just the weights; it's the code and the training data, so if you had the hardware, you could retrain it yourself.
Otherwise it's just royalty-free.