Open Source AI Software

AI Can Clone Open-Source Software In Minutes

ZipNada writes: Two software researchers recently demonstrated how modern AI tools can reproduce entire open-source projects, creating proprietary versions that appear both functional and legally distinct. The partly-satirical demonstration shows how quickly artificial intelligence can blur long-standing boundaries between coding innovation, copyright law, and the open-source principles that underpin much of the modern internet.

In their presentation, Dylan Ayrey, founder of Truffle Security, and Mike Nolan, a software architect with the UN Development Program, introduced a tool they call malus.sh. For a small fee, the service can "recreate any open-source project," generating what its website describes as "legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems." It's a test case in how intellectual property law -- still rooted in 19th-century precedent -- collides with 21st-century automation. Since the US Supreme Court's Baker v. Selden ruling, copyright has been understood to guard expression, not ideas.

That boundary gave rise to clean-room design, a method by which engineers reverse-engineer systems without accessing the original source code. Phoenix Technologies famously used the technique to build its version of the PC BIOS during the 1980s. Ayrey and Nolan's experiment shows how AI can perform a clean-room process in minutes rather than months. But faster doesn't necessarily mean fair. Traditional clean-room efforts required human teams to document and replicate functionality -- a process that demanded both legal oversight and significant labor. By contrast, an AI-mediated "clean room" can be invoked through a few prompts, raising questions about whether such replication still counts as fair use or independent creation.


Comments Filter:
  • by haruchai ( 17472 ) on Wednesday April 01, 2026 @05:09PM (#66072850)

    because it seems there's going to be a lot more IP infringement and it won't be limited to open source

    • Re: (Score:3, Interesting)

      by Anonymous Coward
      From their own website... They offer "Full legal indemnification* *Through our offshore subsidiary in a jurisdiction that doesn't recognize software copyright" So, apparently, problem solved ;-D
      • ChatGPT says that qualifies as sound legal advice, so we're good. Ship it!
        • by drnb ( 2434720 )

          ChatGPT says that qualifies as sound legal advice, so we're good. Ship it!

          I know you're joking, but I really hope that the first court where someone tries that has public cameras. The judge's reaction and response will probably be awesome.

      • by homerbrew ( 10094532 ) on Wednesday April 01, 2026 @06:43PM (#66073034)
        Of course, I doubt I would call it a clean room design, especially if the AI was trained with that open source project. Once it has seen that original code in its training, it's quite difficult to convince me that it didn't rely on that code in any way.
        • by CAIMLAS ( 41445 )

          If you can substantially distinguish how tokenized abstraction is any different from natural learning, I'd buy it. But as it's not, I don't think that's a meaningful argument.

          • Generally if the implementors have seen the original then it's not clean room.

            • Precisely. Even if one session is fed the explicit code and documents it, then the second session generates code ostensibly based on the documentation generated by the first without having been fed the original code explicitly, the AI underlying both sessions was itself trained on the original code, even if a previous version of it, and holds large chunks of it lossy-compressed within its internal weights, to the point that, with the proper prompting in an entirely unrelated third session, we can get it to

              • I'm skeptical this company is doing it properly (or even has their own models), but I think you could do this with two models.

                The documenter is trained on all available data.

                The coder is trained, but without any copyleft code.

                Clean room reverse engineering actually seems like a place where AI will be extremely capable.

                • by flink ( 18449 )

                  The coder is trained, but without any copyleft code.

                  It costs tens of millions of dollars to train a big, competent LLM. GPT-4 cost ~$74M to train, for example. You can hire a team of human devs who have never looked at the source to do a clean-room rewrite of the project for a fraction of what it would cost to develop a "clean" model.

                  That said, I could see a use for a model that was only trained on MIT-licensed or public domain code.

            • So have one AI map out how something works, and have another AI recreate it?

              Of course, if the AI was trained on the original.......

          Of course, I doubt I would call it a clean room design, especially if the AI was trained with that open source project. Once it has seen that original code in its training, it's quite difficult to convince me that it didn't rely on that code in any way.

          An AI session does not necessarily train the AI. Two different AI sessions can be like two different people. The second session won't have any memory of the first. So it is effectively like one person writing a clean spec and a second implementing that spec.

          • by quenda ( 644621 )

            An AI session does not necessarily train the AI. ... The second session won't have any memory of the first. So it is effectively like one person writing a clean spec

            I think he is talking about the actual training of the model, not the inference sessions. LLMs are trained on a LOT of open-source code.
            The big models have read countless terabytes of copyrighted code (including GPL) in training. So if you ever ask AI to write a clone of GNU software, it might be legal, but it can never be truly clean-room.

        • by gweihir ( 88907 )

          It does indeed in no way fulfill the requirements for a "clean room" clone. And, worse, even if it did, there would be no way to prove it. For a proof of a "clean room" reimplementation, you need to demonstrate conclusively that the implementors never came into contact with the original code.

          Also remember that the result has no new copyright or ownership of its own. It either has none at all or retains the original one.

            That may be the case in the US, but not in Europe, where we have interoperability privileges under the EU software copyright directive.
            This is why your steward organisation should be based in Europe or any other place that grants you legal safety.

            To satisfy US requirements the trick is usually to separate research from implementation.

            that is, you have one project that documents functionality, and another independent project that implements the spec.

            Adversarial interoperability of course needs to get strengthened.

          Once it has seen that original code in its training, it's quite difficult to convince me that it didn't rely on that code in any way.

          The problem with this is a question of learning vs copying. From a copyright point of view, virtually all the great artists of history studied under someone else. Should the Sistine Chapel painting be attributed to Domenico Ghirlandaio instead of Michelangelo simply because the latter learned and studied under the former? It will most definitely carry the former's influence, just like everything in life carries influence.

        • This really does not matter. Copyright protects only the expression of software. You are free to clone software with independent code.

        • by allo ( 1728082 )

          If the problem is that the original is in the training data (and memorized) then you should be able to find recognizable parts in the output.
          The point here is more that you get really fast clean-room reverse engineering that produces *different* code with the same functionality.
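A minimal sketch of the memorization check suggested above: flag verbatim overlap between an original source file and generated output by comparing token n-grams. The whitespace tokenizer and window size are illustrative choices, not a legal test for infringement.

```python
def ngrams(tokens, n=8):
    """All length-n windows of a token sequence, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(original: str, generated: str, n: int = 8) -> float:
    """Fraction of the generated file's n-grams that appear verbatim
    in the original. High values suggest memorized passages."""
    orig = ngrams(original.split(), n)
    gen = ngrams(generated.split(), n)
    if not gen:
        return 0.0
    return len(gen & orig) / len(gen)

original = "def add(a, b):\n    return a + b\n" * 4
clone = "def add_numbers(x, y):\n    # sum two values\n    return x + y\n" * 4
verbatim = original

print(overlap_ratio(original, clone))      # 0.0: independently expressed
print(overlap_ratio(original, verbatim))   # 1.0: fully memorized
```

Real memorization detectors normalize identifiers and formatting first; a raw token match like this only catches the most blatant verbatim copying.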

          The next step to make it bulletproof would be to use two distinct models for writing the spec and implementing the spec.
          The software world is changing, because automated code writing is becoming good. It is interesting where things will

      • From their own website... They offer "Full legal indemnification* *Through our offshore subsidiary in a jurisdiction that doesn't recognize software copyright" So, apparently, problem solved ;-D

        I don't think it works that way. If I'm using copyrighted software without a license, the source of that software does not change the fact that I am in violation. It's about use, not where you got it.

    • because it seems there's going to be a lot more IP infringement and it won't be limited to open source

      It'll get real interesting when we start finding out the actual Intellectual in all those Property claims, is AI.

    • Really, someday anyone with a decently powerful desktop will be able to use AI to clone any Microsoft application so that it runs natively on Linux
    • I certainly hope so! At least the lawyers!

      If AI is here to replace human labor, lawyers are very, very ripe for cloning. What they do is mostly just filling in blanks and summarizing documents already. AI is really, really good at that kind of thing. And it would be a lot cheaper to use AI than to hire a lawyer.

    • by AmiMoJo ( 196126 )

      I wonder if it's already happened. I can imagine a law firm creating an LLM based on the writings and court transcripts of a particular judge, and then using it to test out different strategies to see which is likely to be most effective with them.

  • Clean room? (Score:5, Insightful)

    by Pinky's Brain ( 1158667 ) on Wednesday April 01, 2026 @05:18PM (#66072886)

    Even if you use an AI to extract an extremely condensed specification out of the source code, it's hardly clean room if the LLM was pre-trained on the source code anyway.

    • If it really was clean room, they would be able to do replicas of closed source projects just as easily. Or, indeed, just about anything.
      • With open source you can document a lot more than with closed source.

        • Assuming the hype is real, I wonder if it's possible to train these algorithms on machine code, or if they need the semantics and expressiveness of a HLL.

      • by allo ( 1728082 )

        If you write the spec, it can. The point here is that the first step is to convert code into a well-written specification. With closed source you need a human to write the specification (or a model working a lot longer using screenshot tools and whatever; I bet we will see such systems in the future).

    • This "clean room", to draw an analogy to medicine, is like if your "sterile surgical dressings" were made of syphilis viruses knitted together into a fabric.
    • Re:Clean room? (Score:5, Interesting)

      by Waffle Iron ( 339739 ) on Wednesday April 01, 2026 @08:32PM (#66073176)

      Even if you use an AI to extract an extremely condensed specification out of the source code, it's hardly clean room if the LLM was pre-trained on the source code anyway.

      I once worked at a place that had a clean room process to create code compatible with a proprietary product. Anybody who had ever seen the original code or even loaded the original binary into a debugger was not allowed to write any code at all for the cloned product. The clone writers generally worked only off of the specifications and user documentation.

      There were a handful of people who were allowed to debug the original to resolve a few questions about low-level compatibility. The only way they were allowed to communicate with the software writers was through written questions and answers that left a clear paper trail, and the answers had to be as terse as possible (usually just yes or no). Everyone knew that these memos were highly likely to be used as evidence in legal proceedings.
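The written Q&A discipline described above could be sketched as a logged channel that only accepts terse answers. The class and method names here are invented for illustration; a real process would add signatures, timestamps, and legal review.

```python
from dataclasses import dataclass, field

# The only replies the debugging team may give the clone writers.
ALLOWED_ANSWERS = {"yes", "no"}

@dataclass
class CleanRoomChannel:
    """A paper-trail Q&A barrier between the two teams."""
    log: list = field(default_factory=list)

    def ask(self, question: str) -> int:
        """Clone writers submit a written question; returns its log id."""
        self.log.append({"question": question, "answer": None})
        return len(self.log) - 1

    def answer(self, qid: int, answer: str) -> None:
        """Debuggers may reply, but only with a terse permitted answer."""
        if answer.lower() not in ALLOWED_ANSWERS:
            raise ValueError("answers must be terse: yes or no")
        self.log[qid]["answer"] = answer.lower()

channel = CleanRoomChannel()
qid = channel.ask("Does the original return 0 on an empty input buffer?")
channel.answer(qid, "yes")
print(channel.log[qid])  # the full exchange survives for legal review
```

The point of the structure is exactly what the comment describes: every exchange leaves a record terse enough that it cannot smuggle expression across the barrier.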

      I highly doubt that any AI tech bros have ever been this rigorous, and I'd bet that most of these AIs have been trained on the exact same source code that they are cloning.

      • I'd bet that most of these AIs have been trained on the exact same source code that they are cloning.

        That may have even happened indirectly. Consider a computer science textbook discussing some classic OS topic and it offers a snippet of Linux code as a sample implementation. I think such exposure would disqualify a person from working on a clean room implementation team.

        • No, it's been documented that the actual code from open source projects was downloaded and processed. Look it up. There's no clean room anything going on here, the people mentioned in TFA are deluding themselves.
    • It doesn't have to be "clean room." If the new code is distinctly different from the original, it would be extremely difficult to claim copyright infringement.

      It's not illegal to make a Word Processor just because Word is copyrighted.

  • Software Cloning (Score:5, Interesting)

    by silentbozo ( 542534 ) on Wednesday April 01, 2026 @05:23PM (#66072898) Journal

    Can it clone proprietary software and turn it into an open source project?

    If so, then I think the tradeoff is fair.

    • Maybe have it create and publish the source code of 64-bit Windows 7, w/ no allowances for any assembly language? Then port it to all non-x86 CPUs - RISC-V, Arm, and even legacy NT hardware like Alpha and MIPS

      • Maybe have it create and publish the source code of 64-bit Windows 7, w/ no allowances for any assembly language? Then port it to all non-x86 CPUs - RISC-V, Arm, and even legacy NT hardware like Alpha and MIPS

        And PowerPC. Damn, PowerPC never gets any respect. ;-)

    • by Sloppy ( 14984 )

      So begin the Obfuscated Object Code Compiler wars, to keep robots from writing machine language decompilers. The next few years are gonna be a wild ride!

    • Can it clone proprietary software and turn it into an open source project?

      I think the answer is no if you don't have clean access to the proprietary software, e.g., if you decompile or reverse engineer it in violation of a license agreement that you agreed to. That taints the spec, which taints the clean room reimplementation. I think this also applies to leaked software - if you know it's someone's trade secret, but you use it anyway to create a competing product, you can be sued.

      • I think this also applies to leaked software - if you know it's someone's trade secret, but you use it anyway to create a competing product, you can be sued.

        It depends on if you were the person who disclosed or used the trade secret in violation of some sort of non-disclosure or maintenance of IP agreement. However once in the wild, other persons may use the info.

        If a trade secret is disclosed then the IP protection is lost. The disclosure of a trade secret does not have to be intentional or authorized.

    • Can it clone proprietary software and turn it into an open source project?

      If so, then I think the tradeoff is fair.

      There is no tradeoff at all. This process takes Open Source code and turns it into unmaintainable gibberish. This same product also takes Proprietary software and turns it into unmaintainable gibberish.

      Nothing was taken from Open Source, but something was taken from Proprietary.

      You are further from a usable product if you use this on Open Source.

      You are closer to a usable product if you use it on Proprietary software.

  • by Local ID10T ( 790134 ) <ID10T.L.USER@gmail.com> on Wednesday April 01, 2026 @05:24PM (#66072900) Homepage

    "legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems."

    They can claim that it is legally distinct, but until they win the lawsuit and appeals to set a legal precedent, it is not safe to make that assumption.

    • Because it's AI, there's no copyright either. Honestly a fair trade, especially if AI can clone proprietary programs just as easily.

      • by Ksevio ( 865461 )

        That's kind of a weird concept though - Can you remove a license from code by just passing it through an LLM?

    • "legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems."

      They can claim that it is legally distinct, but until they win the lawsuit and appeals to set a legal precedent, it is not safe to make that assumption.

      The above is subject to misinterpretation. The copyright owner must demonstrate it's a derivative and win in court. The owner must prove infringement; the publisher does not need to prove innocence.

  • by Gravis Zero ( 934156 ) on Wednesday April 01, 2026 @05:27PM (#66072912)

    Despite the open source spin, source code is not required to do this, since source code can also be generated from binaries. It shouldn't be shocking by now to learn that, with the addition of AI to "make sense" of the generated code, you can fully automate breaking down an executable into functional source code. As such, even large, sophisticated, and complex programs are also targets.

    The real question is, who wants to deal with a massive amount of AI slop code?

    • Re: (Score:2, Insightful)

      by gweihir ( 88907 )

      While true, legally it makes no difference whether you steal the sources or the binary. It is still stolen. And a clean-room implementation requires the code-writers to never have seen the original in any form. You cannot have an engineer analyze the original and then write a copy. Hence it is immediately plausible that having an AI train on the original, or ingest it in a query, and then write a new version is not a clean-room clone (and only those are legal by default) at all.

      I do agree on the slop.

      • While true, legally it makes no difference whether you steal the sources or the binary. It is still stolen.

        Why would you steal it if you can simply license it?

        And a clean-room implementation requires the code-writers to never have seen the original in any form. You cannot have an engineer analyze the original and then write a copy.

        As the article explains, it's a clean room implementation because you use two different instances of an AI.
        * AI 1 documents how the code works in a human readable descriptions. (i.e. does the reverse engineering)
        * AI 2 constructs an entirely new codebase from the human readable descriptions in the documentation. (i.e. does the forward engineering)

        Since AI 2 has never seen or analyzed the original code/binary and has only ever read the documentation about it...
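The two-session separation described above can be illustrated with plain Python functions standing in for the two AI sessions; the spec format is invented for illustration, and a real documenter would emit prose rather than raw I/O pairs.

```python
def stage1_document(original_fn, sample_inputs):
    """'Documenter': probes the original as a black box and records
    behavior. The spec deliberately contains no source code, only
    observed input/output pairs."""
    return {"behavior": [(x, original_fn(x)) for x in sample_inputs]}

def stage2_implement(spec):
    """'Coder': builds a new implementation from the spec alone,
    never touching the original."""
    table = dict(spec["behavior"])
    # A real reimplementation would generalize; a lookup table keeps
    # the sketch honest about what the spec actually conveys.
    return lambda x: table[x]

original = lambda x: x * x + 1  # stands in for the original project
spec = stage1_document(original, [0, 1, 2, 3])
clone = stage2_implement(spec)
print([clone(x) for x in [0, 1, 2, 3]])  # [1, 2, 5, 10]
```

The thread's objection survives the sketch: the separation only means something if stage 2's "coder" was never trained on the original in the first place.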

        • by gweihir ( 88907 )

          You not only need to do a clean-room, you need to be able to prove it and that set-up does not allow it.

          • You not only need to do a clean-room, you need to be able to prove it and that set-up does not allow it.

            Actually, this setup would be more provable than having two people do it because you can literally record the entire process from start to finish.

            Have two different computers with no hardware in common.
            Computer 1 interprets the program and generates the documentation, saving it to a USB drive.
            You unplug the USB drive and move it over to Computer 2.
            Computer 2 reads the documentation and generates a new code base.
            You can read the documentation and there was no other means of communication.

            If you don't think a

            • by gweihir ( 88907 )

              You cannot do this with two people. Not possible. You need at the very least three (but in practice a lot more), and the middle one is the info barrier. The middle makes sure no details about the implementation leak; the middle team is under oath, likely needs to be more technologically competent than the analysis and implementation teams, and also needs to be legally competent.

              And that is why you cannot automate this process.

  • by unixisc ( 2429386 ) on Wednesday April 01, 2026 @05:30PM (#66072922)

    That way, make that OS at least a stable, working one, instead of the alpha stage that it's been in for 30 years

    • by organgtool ( 966989 ) on Wednesday April 01, 2026 @05:41PM (#66072946)
      How do you expect ReactOS to be stable and working when the purpose is to achieve feature-parity with Windows?
      • Just make sure you are one major release behind Microsoft at all times.

        Windows 95? Garbage. 98, especially SE? Just fine. Windows Me was hot trash, and XP, especially Service Pack 2, just fine. Vista garbage, 7 just fine. 8 was a dumpster fire. 10 was okay. And now 11 is right back where we started.

        Although honestly I don't know if we will ever get a working Windows 12. Microsoft has very very little competition anymore. Basically just Apple and there's a laundry list of reasons why that's a pr
  • by SlashbotAgent ( 6477336 ) on Wednesday April 01, 2026 @05:48PM (#66072954)

    git clone https://github.com/YourStuff/A... [github.com]

    But, when I put my name on it everybody gets all pissy.

    • by gweihir ( 88907 )

      But, when I put my name on it everybody gets all pissy.

      Yeah, no idea why that happens...

  • by PPH ( 736903 ) on Wednesday April 01, 2026 @06:01PM (#66072970)

    No attribution. No copyleft. No problems.

    No copyright either if it's AI generated. So, no "corporate-friendly licensing".

    • Yes, this is my understanding as well. It has been tested in court: AI-generated materials cannot be copyrighted. Cases such as Thaler v. Perlmutter have cemented that AI systems cannot be listed as authors.
  • by williamyf ( 227051 ) on Wednesday April 01, 2026 @06:04PM (#66072972)

    for clean room implementations.

    If the AI model was trained using the OG software project that is being replicated, they are screwed.

    That should be very easy to see: in the discovery phase, just ask for a list of all the software that was used to train the AI model. It's a yes/no answer. If the AI saw the OG software, then there was no clean room; the room was dirty, very, very dirty.

    • Thank you, that's exactly what I was trying to say. If it is cloning anything it was trained on, then it is definitely NOT a clean room design, and therefore infringes on the copyright!
    • by gweihir ( 88907 )

      Indeed. The beauty of this April fools is that it has a high level of credibility ... to the stupid.

    • It seems like this is a problem easily rectified -- someone could prepare a model where any code with a viral license is filtered from the training set.
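A hedged sketch of that filtering idea: scan each file's license header and drop anything matching a copyleft pattern before it enters the training set. The patterns below are a rough heuristic, not a substitute for real license detection.

```python
import re

# Illustrative copyleft markers; a real pipeline would use a proper
# license scanner and the SPDX license list.
COPYLEFT_PATTERNS = [
    re.compile(r"GNU (Affero )?General Public License", re.IGNORECASE),
    re.compile(r"\bGPL(v[23])?\b"),
    re.compile(r"Mozilla Public License", re.IGNORECASE),
]

def is_copyleft(source: str, header_lines: int = 30) -> bool:
    """True if a copyleft marker appears in the file's header."""
    header = "\n".join(source.splitlines()[:header_lines])
    return any(p.search(header) for p in COPYLEFT_PATTERNS)

corpus = {
    "a.c": "/* Licensed under the GNU General Public License v2 */\nint main(){}",
    "b.py": "# MIT License\nprint('hi')",
}
clean = {name: src for name, src in corpus.items() if not is_copyleft(src)}
print(sorted(clean))  # ['b.py']
```

Even a perfect filter only addresses license terms; it does not settle whether training on permissively licensed code satisfies its attribution requirements.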

  • by gurps_npc ( 621217 ) on Wednesday April 01, 2026 @06:05PM (#66072976) Homepage

    If the AI can clone free software, then it should be able to clone non-free software. The real question is whether we should bother copyrighting any software if it can be so easily duplicated.

    Nobody is going to copyright my voice singing - there are so many other better singers.

    If software becomes that easy to create, then it loses its value.

    Hm - perhaps someone should clone all the software we install on items we buy that comes with licenses that prevent repair. We own the hardware and we usually hate the software that comes on smart appliances. A cheap replacement for it may screw with their illegal and unethical attempts to control what you do with stuff you own and they do not.

    • It has access to the open source source. It doesn't have access to the non-free source. Of course, anything it clones by virtue of being trained on what it's cloning IS copyright infringement! This might create the case that breaks the back of AI-generated content. It's all infringing on what it was trained with!
      • by Holi ( 250190 )

        If it is using the source to clone, it is not clean room clone and thus violates copyright.

  • If it's trained on the actual open source software it's cloning, then it isn't a "clean room design", is it?
    • by gweihir ( 88907 )

      Not at all. If somebody actually took this seriously, they would likely be in a world of legal hurt. But remember the date...

  • I guess that the "proprietary code" isn't so proprietary at all.

  • by Princeofcups ( 150855 ) <john@princeofcups.com> on Wednesday April 01, 2026 @06:41PM (#66073032) Homepage

    So it's slow as fuck, with memory leaks, impossible to maintain, lacking comments, nasty race conditions, 10 times bigger than the original, uses 10 times the memory, freezes trying to open files.... you know, the coding stuff.

    Let me know when we can see some head-to-head QA. Hey, maybe we are there. But I've not seen anything more than vague "proofs of concept." I still want to see AI produce microcode for a new undocumented chip/board. Do you read the API to it like a nursery rhyme?

    Or to put it another way, if it relies on samples of code to exploit, how is it going to produce NEW code?

    • You haven't actually seen AI-generated code, have you!

      These days, AI generates code that is readable and has meaningful comments (well, as meaningful as most comments written by human coders anyway). AI tends to be good about properly structuring code to eliminate memory leaks. The code isn't necessarily blazing fast, but it seems to compare with typical human-written code. And if you ask it to improve the performance, it often can do so successfully.

      The AI code you are describing is so 2025.

    • Ignorance is a state of bliss for you.

      AI readily generates code that exceeds Jr level programmer code by a wide margin.

      It can also produce utter garbage.

      But it is in no way equivalent to monkeys on a keyboard. Its success rate is significantly higher than that.

  • For all the open source projects that were turned into commercial versions by introducing proprietary elements, it seems AI can be used to replicate the proprietary components back as open source.

  • There is no reason why billionaires should have a separate set of laws.
  • There's a copy of the old Windows XP leak out there on Github. So an AI trained on the Github repos is liable to have been trained on Windows XP code. And that is ultimately a liability for people who use this tech.
  • by Sean Clifford ( 322444 ) on Wednesday April 01, 2026 @09:57PM (#66073272) Journal

    Well, Hell, *I* can clone OSS in seconds via a pull. Jeebus. AI blah blah blah AI staff cuts blah blah blah paradigm shift....yawn.

  • Only a software person could come up with a name like "malus.sh".
  • "Clean-room" means you have one group of engineers study existing code and create a specification and then another group of engineers takes that specification and writes new code that does what the original code did. This is because copyright protects expression, not ideas, AND that independent creation of the same expression is not infringement either.

    If you have the same person reading the old code and writing the new code, then to whatever extent the expression is similar there is no protection under copyright.

    • by gweihir ( 88907 )

      No, this is definitely not "clean room". An actual clean room clone requires a very competent analysis team that writes a spec. A ton of legal people that verify the spec does not contain descriptions of the original code and that can attest so under oath. And an implementation team that has never seen the original code and only gets said spec. It is a huge and very fragile undertaking. An AI that may have seen the code does not cut it in any way.

      But remember the date. This is a really good April Fools'.

  • If it can clone open source then it should be able to clone closed source applications. Unless it's just taking the existing code and re-formatting it.

    • There is this thing called reverse engineering. I have found AI to be surprisingly helpful at it with binary payloads from various devices. I'm sure it could reverse engineer compiled binaries, too. Whether that's legal is another issue. But when has the law ever stopped AI companies?

  • by gweihir ( 88907 ) on Thursday April 02, 2026 @02:36AM (#66073506)

    And that is what makes this satirical: The result has no new copyright whatsoever, and it only gives the appearance of working. It is also unclear whether it is actually legal to do, or whether it may remain partially or fully under the original copyright and ownership, due to the model probably having been trained on the original OSS code.

    As some people will probably take this seriously, it bears pointing out that this is a technological and legal nightmare. It is a very cool satirical project though.

  • by TrueJim ( 107565 ) on Thursday April 02, 2026 @03:30AM (#66073572) Homepage

    If the clone is AI-generated, I don't think it can be copyrighted, based on [Thaler v. Perlmutter, 2023]. Calling the clone "proprietary" is a slight misstatement. It could maybe be protected as a trade secret, but I don't think it's copyrightable, based on what courts have ruled so far.

  • Clean room design is a legal strategy, but it is not a legal requirement. There are other methods that can be used for creating works not considered to be derivative works.

    Also a reminder: words used in law don't have the same meaning as in everyday language. The law usually narrows the meaning explicitly, or implicitly via case law.

    GNU has used this to their advantage to clone most of the shell runtime utilities, so why shouldn't the same be used to replace GNU licensed code?

"Marriage is low down, but you spend the rest of your life paying for it." -- Baskins

Working...