AI Open Source Technology

'Openwashing'

An anonymous reader quotes a report from The New York Times: There's a big debate in the tech world over whether artificial intelligence models should be "open source." Elon Musk, who helped found OpenAI in 2015, sued the startup and its chief executive, Sam Altman, on claims that the company had diverged from its mission of openness. The Biden administration is investigating the risks and benefits of open source models. Proponents of open source A.I. models say they're more equitable and safer for society, while detractors say they are more likely to be abused for malicious intent. One big hiccup in the debate? There's no agreed-upon definition of what open source A.I. actually means. And some are accusing A.I. companies of "openwashing" -- using the "open source" term disingenuously to make themselves look good. (Accusations of openwashing have previously been aimed at coding projects that used the open source label too loosely.)

In a blog post on Open Future, a European think tank supporting open sourcing, Alek Tarkowski wrote, "As the rules get written, one challenge is building sufficient guardrails against corporations' attempts at 'openwashing.'" Last month the Linux Foundation, a nonprofit that supports open-source software projects, cautioned that "this 'openwashing' trend threatens to undermine the very premise of openness -- the free sharing of knowledge to enable inspection, replication and collective advancement." Organizations that apply the label to their models may be taking very different approaches to openness. [...]

The main reason is that while open source software allows anyone to replicate or modify it, building an A.I. model requires much more than code. Only a handful of companies can fund the computing power and data curation required. That's why some experts say labeling any A.I. as "open source" is at best misleading and at worst a marketing tool. "Even maximally open A.I. systems do not allow open access to the resources necessary to 'democratize' access to A.I., or enable full scrutiny," said David Gray Widder, a postdoctoral fellow at Cornell Tech who has studied use of the "open source" label by A.I. companies.

'Openwashing'

Comments Filter:
  • It helps (Score:5, Informative)

    by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Saturday May 18, 2024 @09:15AM (#64481125) Homepage Journal

    (Accusations of openwashing have previously been aimed at coding projects that used the open source label too loosely.)

    It helps if you know what Open Source means [archive.org]. It means you can see the source.

    If you can get access to the training data and the code that turns it into a model, it's open source regardless of what you're allowed to do with it, or whether you can afford the computer time to build the model from the data. If you can't see the sources, then it's not open source. Not even every definition of Free Software ensures that you will actually be able to use the code in question. That's why there is a GPLv3, with an anti-Tivoization clause; GPLv2 wasn't Free enough. But even the GPLv3 doesn't mandate that you be able to make meaningful use of the code for reasons beyond artificial restrictions, like not owning a supercluster.

    • I drove past the lake in the forest this morning and there was a lovely lady doing open washing. It sure made my day.
    • by Rei ( 128717 )

Almost nobody in the indie AI community cares about whether the training data for the model is open source. We care about the license restrictions on the model. We can fine-tune or further train a foundation model however we want; the question is what we're allowed to do with it.

      A lot of people just ignore the licenses, but that can come back to bite you, and I don't recommend it.

      • I wrote most of this in 2001 (hard to believe that is almost a quarter century ago):
        https://pdfernhout.net/on-fund... [pdfernhout.net]
        "Consider again the self-driving cars mentioned earlier which now cruise some streets in small numbers. The software "intelligence" doing the driving was primarily developed by public money given to universities, which generally own the copyrights and patents as the contractors. Obviously there are related scientific publications, but in practice these fail to do justice to the complexity of

    • That's true for open source but not for "open" in general. In the case of LLMs, I think it's helpful to think along the lines of open science & especially reproducibility, i.e. Can someone take what you've provided & replicate what you've done in their context? I'd cite the Open Science Framework as a good example of openness: https://osf.io/ [osf.io] It also makes collaboration & community building a whole lot easier.
He's simply seen that they're ahead on AI research, and wants access to their tech.

    • by jd ( 1658 ) <imipak AT yahoo DOT com> on Saturday May 18, 2024 @09:49AM (#64481171) Homepage Journal

      Nobody is actually ahead in AI, because they're all solving the wrong problem, as indeed AI researchers have consistently done since the 1960s.

      I'm not the least bit worried about the possibility of superintelligence, not until they actually figure out what intelligence is as opposed to what is convenient to solve.

      As for Musk, he's busy trying to kill all engineering projects in America.

      • Huh? What do you mean figure out what intelligence is?

        noun
        1.
        the ability to acquire and apply knowledge and skills.

There are probably better definitions, but that simple first dictionary result covers it pretty well. We know what intelligence is. We're now on the question of how it works, and we appear to be solving that quite rapidly.
That seems a bit like a circular definition. What does it mean to acquire and apply knowledge and skills, and what are the parameters for success, or even a positive result?

          Basically, if you follow the definitions of knowledge/skills you get right back to perception, information, and synonyms of intelligence. So intelligence (in the human sense) is the capacity to acquire and apply intelligence (in the military sense), but we haven't defined concretely what that is, what components it has, etc.

          • Something being abstract doesn't make it circular. We could pick many tasks to measure by. This isn't an issue. How it works is the problem being solved.
            • by guruevi ( 827432 )

              I still have to see a measurement or task. If it is merely the collection of data, 1950s databases were the first artificial (electronic) intelligence, and by that definition so would any library going back literally the entirety of written history. But just storing a book to a memory (mechanically, electronically etc) does not make someone intelligent.

              • LOL feigning ignorance isn't very compelling. You've seen plenty of tasks and know it. Imagine trying to pretend you haven't seen AI perform tasks in the last year. Were you living in a cave for the past couple years?
                • by guruevi ( 827432 )

                  Dude, I have been involved in "AI" and ML for at least 15 years now. It is not intelligent. It feigns intelligence by regurgitating string sentences from a database.

Oh so now you know what intelligence is. Funny, just a few comments ago you were pretending we couldn't even define it, but now not only can you define it, you can make assertions about what does and does not have it. You then go on to dismiss intelligence as feigning it with no justification. The only thing feigning here is you. You're the one pretending you haven't seen AI perform a task. 1950s databases weren't responding to natural language requests to create new coherent responses derived
                    • by guruevi ( 827432 )

I said it is not intelligent. We know what 'not intelligent' means despite not knowing what intelligence is. Regurgitating data from a database is not intelligent.

                      We do need to apply reduction to distill what it means to be intelligent; that is how you make definitions. We don't just repeat string sentences copied in our brains. Again, you don't know what you are talking about. That is why in most cases people find examinations where students copy/remember stuff not to be a good sample of their intelligence

Of course we know what 'not intelligent' means, because we know what intelligent means. That's why your denial of that fact is humorously contradictory. You've shot yourself in the foot here.

                      I didn't say we need to apply reduction. I pointed out your misapplication of it.

                      You've demonstrated you're the one without a clue here. You've contradicted yourself. Your assertions are akin to stating the earth is flat and sky is brown while also trying to claim we don't know what brown is. You're not making a
                    • by guruevi ( 827432 )

                      Then feel free to provide a non-self-referential definition of what intelligence is, a nobel prize is awaiting you.

                    • Feel free to scroll up for a perfectly fine definition of the word. You not knowing what words mean is your problem. It isn't holding the rest of the species back.
    • by quonset ( 4839537 ) on Saturday May 18, 2024 @11:44AM (#64481279)

      Musk doesn't care about openness. Look at what he's done to shitter. If anyone says anything mean about him they get their account suspended [imgur.com]. When his Nazi-loving supporters get revealed he can't move fast enough to prevent people from seeing it [mashable.com] while at the same time allowing his Nazi-loving supporters to do the same to others.

      As we know from his flailing car company [jalopnik.com], if you criticize him or the company you lose your ability to buy a car [motoringresearch.com].

And finally, there's a reason none of his companies have a PR department. He wants everything to go through him, including any announcements about fake products used to inflate the price of company stock. When Musk talks about openness you can be sure he has no idea what that term means.

      • Re: (Score:2, Troll)

Yes, look at what he's done for Twitter. He opened it back up, at great cost, so that non-government/corporate-approved opinions could be expressed openly again.

Eli's suspension was temporary, something the previous Twitter wouldn't do if you were guilty of wrongthink. Suspending people for doxxing is now your line in the sand for openness? Get real. You don't actually hold this standard.

        You're deluding yourself about Tesla. Those cars are everywhere and there's only going to be more. You
    • So despite the facts you decided to let your emotions guide you today. This one is a write off better luck tomorrow.
  • There's always something wrong with everything.
    There are those who always bring up the wrong-thing.
    Don't you get tired of those "glass half empty" naysayers?
    Or worse, disguise their desire to control by seemingly bringing up issues to be solved?
    • by jd ( 1658 )

      If there's an issue that needs resolving, it's best to acknowledge it. Hiding away, like Microsoft does with their abysmal records on reliability and security, achieves nothing.

      If honesty is a problem, then neither IT nor science seem good professions. Politics and economics might be better.

  • by jd ( 1658 ) <imipak AT yahoo DOT com> on Saturday May 18, 2024 @09:43AM (#64481167) Homepage Journal

    In neural nets, the network software is not the algorithm that is running. The net software is playing the same role as the CPU in a conventional software system. It is merely the platform on which the code is run.

    The topology of the network plus the state of that network (the data) corresponds to an algorithm. That is the actual software that is being run. AI cannot be considered open until this is released.

    But I flat-out guarantee no AI vendor is going to do that.
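The point above — that the network code is merely the platform and the weights are the real "program" — can be pictured with a toy sketch (purely illustrative, not any vendor's code): the same forward pass computes entirely different functions depending only on the numbers fed into it.

```python
import math

def forward(x, weights):
    # One tiny two-layer net. This code is the "platform"; it never
    # changes. Only `weights` — the topology's state — changes.
    w1, b1, w2, b2 = weights
    h = math.tanh(w1 * x + b1)   # hidden unit
    return w2 * h + b2           # output

# Two different "programs" running on the same platform:
identity_like = (1.0, 0.0, 1.0, 0.0)   # roughly passes small inputs through
negator       = (1.0, 0.0, -1.0, 0.0)  # flips the sign instead

print(forward(0.5, identity_like) > 0)  # True
print(forward(0.5, negator) > 0)        # False
```

Releasing only the forward-pass code while withholding the weights is, on this view, like open sourcing a CPU emulator while keeping every binary proprietary.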

    • by Cassini2 ( 956052 ) on Saturday May 18, 2024 @09:58AM (#64481183)
Essentially, there is no loss for the company in open sourcing the algorithm, because it is the goal functions, the training data, and the eventual matrices of numeric weights and parameters that are the important bits.
      • What's being "open sourced" in a model like Llama is just the model architecture + weights. Some people prefer the term "open weights".

        The "algorithm" - what the transformer is actually doing - is defined by the weights, which are derived from the training set plus the pre-training procedure (very tricky - not a turn-key process), and maybe a bunch of post-training (even more tricky) which is where a lot of the final model functionality/behavior comes from.

        Even if a company did provide the training data (wh
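The "open weights" distinction described above can be pictured with a toy checklist (the file names here are hypothetical, not any vendor's actual release layout): publishing architecture and weights lets anyone run and fine-tune the model, but reproducing those weights would also require the withheld data and training recipe.

```python
# Hypothetical release checklist for an "open weights" model.
released = {
    "architecture.py": True,   # model code: typically published
    "weights.bin":     True,   # trained parameters: the "open weights" part
    "train_data/":     False,  # training corpus: usually withheld
    "train_recipe.md": False,  # pre/post-training procedure: usually withheld
}

# Running or fine-tuning only needs the code and the weights.
can_run_and_finetune = released["architecture.py"] and released["weights.bin"]

# Reproducing the weights needs everything (plus enormous compute).
can_reproduce = all(released.values())

print(can_run_and_finetune)  # True
print(can_reproduce)         # False
```

That gap between "can run it" and "can reproduce it" is exactly why some commenters prefer "open weights" over "open source" for releases like Llama.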

    • by ceoyoyo ( 59147 )

      But I flat-out guarantee no AI vendor is going to do that.

      A bunch of them have already. Facebook, for example, although they did have some help from a leaker initially.

We know a lot of what OpenAI does (most of it, actually) is shaped by hand. The actual base math models are open; they're not even OpenAI's product. It is the filtering and keyword matching and other things that OpenAI does (e.g. the way it builds its databases) that is considered 'the algorithm'. Just like Twitter is not 'the algorithm': we all know how databases work, and anyone can apply simple mathematical models to see what *should* be promoted or is viral, the questi

  • by davide marney ( 231845 ) on Saturday May 18, 2024 @09:56AM (#64481179) Journal

I have yet to see the blanket license agreements that will be needed to let AI companies legally create derivative works from training data. I have seen copyright license holders such as Sony issue warnings. If these agreements are still being negotiated, no AI company would ever let their data be open for fear of inviting a suit.

I have yet to see the blanket license agreements that will be needed to let AI companies legally create derivative works from training data.

      Derivative works contain recognizably copied elements. There are many uses which don't meet that standard. Just looking like the thing doesn't suit, either, it has to be obviously directly copied (though possibly manipulated) and not recreated.

      • Derivative works contain recognizably copied elements. Just looking like the thing doesn't suit, either, it has to be obviously directly copied (though possibly manipulated).

I see nothing in copyright law that uses "recognizable" as a criterion. If a book originally written in English is translated into French, it's a derivative work regardless of whether anyone (especially those who don't know French) recognizes it as being translated from the original.

        There are two parts to a derivative work:

        1. Whether t
  • If bags of weights are available to everyone to mess with and use as they please that's good enough in my book. Good luck to anyone seeking to assert any legal restrictions on any bags of weights that do happen to make their way onto the Internet.

  • If you don't know how or why the model comes up with what it comes up with you can't reveal it to anyone else. Perhaps I am not understanding correctly -- wouldn't be the first time. I suppose you can release the initial program but once it has started to do its thing with the training data it becomes pretty much a black box, doesn't it?

  • by Anonymous Coward

    The ulterior motive is it would be very, very convenient for massive "regulation" to be enacted, allowing for ultimate control by the large corporate entities developing these. Wouldn't that be profitable, in many ways. Ugh.

    I don't think their efforts will work. The cat is out of the bag. The world is now changed.

  • Democratic People's Republic of Korea, also known as the North Korean hereditary dictatorship.

