How Do You Define 'Open Source AI'? (arstechnica.com) 37

Posted by BeauHD on Tuesday August 27, 2024 @05:30PM from the wolf-in-sheep's-clothing dept.

An anonymous reader quotes a report from Ars Technica: The Open Source Initiative (OSI) recently unveiled its latest draft definition for "open source AI," aiming to clarify the ambiguous use of the term in the fast-moving field. The move comes as some companies like Meta release trained AI language model weights and code with usage restrictions while using the "open source" label. This has sparked intense debates among free-software advocates about what truly constitutes "open source" in the context of AI. For instance, Meta's Llama 3 model, while freely available, doesn't meet the traditional open source criteria as defined by the OSI for software because it imposes license restrictions on usage due to company size or what type of content is produced with the model. The AI image generator Flux is another "open" model that is not truly open source. Because of this type of ambiguity, we've typically described AI models that include code or weights with restrictions or lack accompanying training data with alternative terms like "open-weights" or "source-available."

To address the issue formally, the OSI -- which is well-known for its advocacy for open software standards -- has assembled a group of about 70 participants, including researchers, lawyers, policymakers, and activists. Representatives from major tech companies like Meta, Google, and Amazon also joined the effort. The group's current draft (version 0.0.9) definition of open source AI emphasizes "four fundamental freedoms" reminiscent of those defining free software: giving users of the AI system permission to use it for any purpose without permission, study how it works, modify it for any purpose, and share with or without modifications. [...] OSI's project timeline indicates that a stable version of the "open source AI" definition is expected to be announced in October at the All Things Open 2024 event in Raleigh, North Carolina.

This discussion has been archived. No new comments can be posted.

Define? They'll probably be defined as "illegal" (Score:3, Interesting)

by Seven Spirals ( 4924941 ) writes: on Tuesday August 27, 2024 @05:38PM (#64741204)

All the current LLMs are pretty far left. [realclearscience.com] My guess is that once any open source LLMs start to reach political parity and you start having folks train them on things like, oh, the Constitution and Bill of Rights they are going to get banned pretty quick. Imagine being some poor tech at Google after the LLM digests the FBI crime statistics and starts drawing conclusions. I'm guessing we'll see them banned in some place like China or the EU first, then once the USA has HateSpeech laws, they'll try it here, too.

- Re: (Score:1)
  
  by rrconan ( 1082759 ) writes:
  
  "Reality has a leftward bias."
  - Re: (Score:3)
    
    by Seven Spirals ( 4924941 ) writes:
    
    Reality is reality. It has no political bias. If you think otherwise you're dumber than the average partisan, which is saying something.
    - Re: (Score:1)
      
      by rrconan ( 1082759 ) writes:
      
      name calling, as thinking that anything "partisan" is universal, however, is a clear mental malformation of hidden gay american, be free facho man
      - Re: (Score:2)
        
        by Seven Spirals ( 4924941 ) writes:
        
        Uhm, was that English?
        
        - Re: (Score:1)
          
          by rrconan ( 1082759 ) writes:
          
          sorry, your mon was sucking so good that I forgot to check prior to post
- Re: (Score:1)
  
  by ArchieBunker ( 132337 ) writes:
  
  Why so defensive? The only person talking politics here is you.
  - Re: (Score:2, Flamebait)
    
    by Seven Spirals ( 4924941 ) writes:
    
    Puh-leese you never stop talking politics. You're a red-armband capital-C Communist. Just tell us all you never thought of what would happen when your partisan enemies got their hands on open source LLM code? Did you assume the AI/LLM landscape would stay "progressive" forever and I just ruined your day with politics? I don't think so, "Comrade".
    - Re: (Score:2)
      
      by ArchieBunker ( 132337 ) writes:
      
      If AI is trained with public data and conservatives are in the minority then yes it stands to reason the AI would lean left.
      "Just tell us all you never thought of what would happen when your partisan enemies got their hands on open source LLM code? Did you assume the AI/LLM landscape would stay 'progressive' forever and I just ruined your day with politics? I don't think so, 'Comrade'."
      What stake do you think I have in the supposed political leanings of software?
  - Re: (Score:2)
    
    by Bob_Who ( 926234 ) writes:
    
    "Why so defensive? The only person talking politics here is you."
    Reality's Spokesperson, apparently..
    It's a dream job, if you can get it..
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  "My guess is that once any open source LLMs start to reach political parity..."
  "Open source LLMs" do not strive to "reach political parity" nor is there any reason to believe they will or they won't. What is "left" and "right" are arbitrary and the fact that an LLM may be "pretty far left", if that were even true, is a commentary of what is considered left or right, not a reflection of any failure in creating the model. When the right wing is defined by hatred you should not expect AI to reflect that hatr
  - Re: (Score:2)
    
    by Seven Spirals ( 4924941 ) writes:
    
    "nor is there any reason to believe they will or they won't."
    In the article [realclearscience.com] I linked, they use some fairly deterministic methods to draw their conclusions. Do you disagree with them?
    "And yet we employ an entire FBI, that sane people realize is essential, literally to draw those very conclusions."
    Uhh, no. The FBI releases the data. They draw very few conclusions from the data otherwise we'd have racial profiling on our hands since 13% of the population is committing 53% of the homicides and 20%-30% of property crime. Every time an LLM ingests that data you can bet there is some wokester right there handwaving saying "Wait wait wait, it's more complex and nuanced. Let me override a
  - Re: (Score:1)
    
    by Iamthecheese ( 1264298 ) writes:
    
    Whether the model is left or right reflects whether the training data considered acceptable by the makers leaned left or right. It's my hope that future engines can be smart enough to suss out their own biases but we're nowhere near that yet.
- - Re: (Score:2)
    
    by schwit1 ( 797399 ) writes:
    
    Only when he was a democrat
- Re: (Score:3)
  
  by Powercntrl ( 458442 ) writes:
  
  LLMs don't have a bias, they generate output based on how you phrase your prompt. If you make it sound like the same sort of talking points that right-wingers are constantly spouting all over the 'net, you'll get a response that sounds exactly like the sort of rebuttals you'll see coming from the left. You have to make your prompts sound more echo chamber-y, and then it absolutely will respond the way you want it to. For example:
  What are some of the criticisms against [liberal policy]?
  What are some of th
  - Re: (Score:2)
    
    by Seven Spirals ( 4924941 ) writes:
    
    So, then you disagree with the methods they used in the article I linked? They seem to have some fair-seeming methodology to make their determinations. I don't see how pointing out that "you are what you eat" is any different for an LLM, so to speak, if the conclusions are still left-leaning. So they fed it more left wing crap. So what? It's left wing, now, and how it got that way is incidental to that fact.
    - Re: (Score:2)
      
      by Powercntrl ( 458442 ) writes:
      
      Yes, their methodology is flawed because an LLM doesn't actually think, no matter how much some people try to anthropomorphize it. If you don't specify what type of output you're expecting, it's either going to look for patterns that emerged most frequently in its training data or use an RNG.
      The first draft of a comedy song I asked ChatGPT to write about a hypothetical alternate reality where the YouTube celebrity MrBeast was actually a wildebeest turned out absolutely terrible. It was entirely abo
      - Re: (Score:2)
        
        by Seven Spirals ( 4924941 ) writes:
        
        I agree with most of your thesis. However, I do still think it's incidental to the final result. The reason people care about political leanings from LLMs or search engines is that most people do not realize the degree to which they need to hone their query to get what they are after. They simply make a query and sift through the results. So, again, how an LLM or search engine ended up giving mostly one-sided results is great to know, but doesn't change the final result. It's biased.
- Re: (Score:2)
  
  by Kisai ( 213879 ) writes:
  
  LLMs aren't "pretty far left"; they are completely neutral and just reflect the status quo of truth drowning out garbage, most of the time.
  I say most of the time because LLMs don't understand anything they are saying. It's like asking a parrot a question when it was only trained to say things its human owner said, without the context or understanding of it. This is why parrots and corvids sound like a "record": they are imitating the sounds, not saying words.
  An Open Source AI, would need th
  - Re: (Score:2)
    
    by Seven Spirals ( 4924941 ) writes:
    
    "LLMs aren't 'pretty far left'; they are completely neutral and just reflect the status quo of truth drowning out garbage, most of the time."
    Well, the academic folks who've looked [plos.org] at the question don't agree with you at all. They've been able to deterministically find exactly where these LLMs sit on many different political spectrums. They ask political questions and the LLMs answer. How they come up with the answers and why (training vs programming) isn't really as important as the outcomes. The abstract of that PLOS ONE journal article says "When probed with questions/statements with political connotations, most conversational LLMs tend to gen
Free artificial moron with source? (Score:3)

by gweihir ( 88907 ) writes: on Tuesday August 27, 2024 @05:44PM (#64741224)

For the models: Clearly the only "open" version is when the training data is included.

Source code, source data and license (Score:3, Insightful)

by rrconan ( 1082759 ) writes: on Tuesday August 27, 2024 @05:48PM (#64741242)

GPL or something like it, with source code and training data included or explicitly available anywhere.

- Re: (Score:2)
  
  by vbdasc ( 146051 ) writes:
  
  "training data"
  
  You can't slap a GPL or other open-source or whatever other license on something that isn't yours.
  - Re: (Score:1)
    
    by rrconan ( 1082759 ) writes:
    
    And this is the point: it is open only if everything is open.
wrong question (Score:2)

by dfghjk ( 711126 ) writes:

It's like debating what the definition of open source nuclear weapons is. Open source is about who benefits and in what ways, it's just not that interesting a question to ask about AI.
Open source democratizes software, but that is predicated on the common person having the ability to use it and improve it. You may be able to compile gcc and adapt it to your variant of RISC-V, and that would be quite an accomplishment, but that's a long way from training your own version of full blown LLMs. Not many people
- Re: (Score:2)
  
  by OrangeTide ( 124937 ) writes:
  
  They can evolve on their own planet and form their own legal framework. They don't get to use ours.
Definition? (Score:2)

by gosso920 ( 6330142 ) writes:

A bad idea.
How Do You Define 'Open Source AI'? (Score:5, Informative)

by Savage-Rabbit ( 308260 ) writes: on Tuesday August 27, 2024 @07:16PM (#64741580)

How Do You Define 'Open Source AI'?
Pretty much as being everything that OpenAI [openai.com] is not.

How Do You Define 'Open Source AI'? (Score:2)

by Mirnotoriety ( 10462951 ) writes:

Going on the current version of ChatGPT, it'll function as a gigantic censor machine and narrow down the range of acceptable opinion.
It's a no-go since the beginning (Score:2)

by vbdasc ( 146051 ) writes:

"The group's current draft (version 0.0.9) definition of open source AI emphasizes "four fundamental freedoms" reminiscent of those defining free software: giving users of the AI system permission to use it for any purpose without permission, study how it works, modify it for any purpose, and share with or without modifications."
Models which give usage permission for any purpose or permit modification for any purpose will be banned by law anyway, and technical barriers to prevent these practices will be m
- Re: (Score:2)
  
  by vbdasc ( 146051 ) writes:
  
  "Models which give usage permission for any purpose or permit modification for any purpose will be banned by law anyway,"
  
  Please change "any purpose" to "all purposes" in the above comment. English isn't my first language.
Wasted effort (Score:2)

by SomePoorSchmuck ( 183775 ) writes:

"To address the issue formally, the OSI -- which is well-known for its advocacy for open software standards -- has assembled a group of about 70 participants, including researchers, lawyers, policymakers, and activists. Representatives from major tech companies like Meta, Google, and Amazon also joined the effort."
To address the issue formally, instead of assembling 70 AI experts, scientists, and tech industry representatives who believe AI is the way forward, just assemble 3 AI chatbots and ask them to generate the conceptual framework and the documentation. Then choose one at random, feed it all three bot-docs, and ask it to create a combined executive summary of all 3.
This can be done by 1 person in about 30 minutes and the other 69 experts can be rightsized.
You can rebuild it yourself (Score:2)

by juancn ( 596002 ) writes:

It's not just the weights, it's the code and the training data, so that if you had the hardware, you could re-train it yourself.
Otherwise it's just royalty-free.
