AI Can Clone Open-Source Software In Minutes
ZipNada writes: Two software researchers recently demonstrated how modern AI tools can reproduce entire open-source projects, creating proprietary versions that appear both functional and legally distinct. The partly-satirical demonstration shows how quickly artificial intelligence can blur long-standing boundaries between coding innovation, copyright law, and the open-source principles that underpin much of the modern internet.
In their presentation, Dylan Ayrey, founder of Truffle Security, and Mike Nolan, a software architect with the UN Development Program, introduced a tool they call malus.sh. For a small fee, the service can "recreate any open-source project," generating what its website describes as "legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems." It's a test case in how intellectual property law -- still rooted in 19th-century precedent -- collides with 21st-century automation. Since the US Supreme Court's Baker v. Selden ruling, copyright has been understood to guard expression, not ideas.
That boundary gave rise to clean-room design, a method by which engineers reverse-engineer systems without accessing the original source code. Phoenix Technologies famously used the technique to build its version of the PC BIOS during the 1980s. Ayrey and Nolan's experiment shows how AI can perform a clean-room process in minutes rather than months. But faster doesn't necessarily mean fair. Traditional clean-room efforts required human teams to document and replicate functionality -- a process that demanded both legal oversight and significant labor. By contrast, an AI-mediated "clean room" can be invoked through a few prompts, raising questions about whether such replication still counts as fair use or independent creation.
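The two-phase workflow the presenters describe can be sketched in a few lines of Python. Everything below is illustrative, not taken from malus.sh: the `derive_spec` helper and the toy `gcd` source are hypothetical, and a real pipeline would hand the spec to a second, isolated model session rather than a print statement.

```python
# Hypothetical sketch of an AI-mediated "clean room": phase 1 derives a
# behavioral spec from the original source; phase 2 (a separate session
# that never sees the source) would implement from the spec alone.
import ast

def derive_spec(source: str) -> list[str]:
    """Phase 1: reduce source code to a functional spec (names,
    signatures, docstrings) without copying expressive content."""
    spec = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            args = ", ".join(a.arg for a in node.args.args)
            doc = ast.get_docstring(node) or "no description"
            spec.append(f"{node.name}({args}): {doc}")
    return spec

original = '''
def gcd(a, b):
    "Return the greatest common divisor of a and b."
    while b:
        a, b = b, a % b
    return a
'''

spec = derive_spec(original)
# Only the spec crosses the information barrier to phase 2.
print(spec[0])
```

Whether a spec extracted by a model that was itself trained on the original code counts as "clean" is exactly the question the thread below argues about.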
Can AI clone lawyers & judges? (Score:5, Funny)
because it seems there's going to be a lot more IP infringement and it won't be limited to open source
Re: (Score:3, Interesting)
Re: Can AI clone lawyers & judges? (Score:2)
Re: (Score:2)
ChatGPT says that qualifies as sound legal advice, so we're good. Ship it!
I know you're joking, but I really hope that the first court where someone tries that has public cameras. The judge's reaction and response will probably be awesome.
Re: Can AI clone lawyers & judges? (Score:2)
Re:Can AI clone lawyers & judges? (Score:5, Insightful)
Re: (Score:2)
If you can substantially distinguish how tokenized abstraction is any different from natural learning, I'd buy it. But since it isn't, I don't think that's a meaningful argument.
Re: Can AI clone lawyers & judges? (Score:5, Informative)
Generally if the implementors have seen the original then it's not clean room.
Re: (Score:2)
Precisely. Even if one session is fed the explicit code and documents it, and a second session then generates code ostensibly based only on the documentation produced by the first, without ever being fed the original code explicitly, the AI underlying both sessions was itself trained on the original code (even if a previous version of it) and holds large chunks of it, lossily compressed, within its internal weights, to the point that, with the proper prompting in an entirely unrelated third session, we can get it to reproduce recognizable pieces of the original verbatim.
Re: Can AI clone lawyers & judges? (Score:2)
I'm skeptical this company is doing it properly (or even has their own models), but I think you could do this with two models.
The documenter is trained on all available data.
The coder is trained but without any copy left code.
Clean room reverse engineering actually seems like a place where AI will be extremely capable.
Re: (Score:2)
The coder is trained but without any copy left code.
It costs tens of millions of dollars to train a big, competent LLM. GPT-4 cost ~$74M to train, for example. You can hire a team of human devs who have never looked at the source to do a clean-room rewrite of the project for a fraction of what it would cost to develop a "clean" model.
That said, I could see a use for a model that was only trained on MIT-licensed or public domain code.
Re: (Score:2)
So have one AI map out how something works, and have another AI recreate it?
Of course, if the AI was trained on the original...
No memory of first AI session, like clean room (Score:2)
Of course, I doubt I would call it a clean room design, especially if the AI was trained on that open source project. Once it has seen that original code in its training, it's quite difficult to convince me that it didn't rely on that code in any way.
An AI session does not necessarily train the AI. Two different AI sessions can be like two different people. The second session won't have any memory of the first. So it is effectively like one person writing a clean spec and a second implementing that spec.
Re: (Score:3)
An AI session does not necessarily train the AI. ... The second session won't have any memory of the first. So it is effectively like one person writing a clean spec
I think he is talking about the actual training of the model, not the inference sessions. LLMs are trained on a LOT of open-source code.
The big models have read countless terabytes of copyrighted code (including GPL) in training. So if you ever ask an AI to write a clone of GNU software, it might be legal, but it can never be truly clean-room.
Re: (Score:3)
It does indeed in no way fulfill the requirements for a "clean room" clone. And worse, even if it did, there would be no way to prove it. For a proof of a "clean room" reimplementation, you need to demonstrate conclusively that the implementors never came into contact with the original code.
Also remember that the result has no new copyright or ownership of its own. It either has none at all or retains the original one.
Re: (Score:2)
That may be the case in the US but not in Europe. There we have interop privileges in the EU software copyright directive.
This is why your steward organisation should be based in Europe, or any other place that grants you legal safety.
To satisfy US requirements, the trick is usually to separate research from implementation.
That is, you have one project that documents functionality, and another, independent project that implements the spec.
Adversarial interoperability of course needs to be strengthened.
Re: (Score:2)
Interop allows reverse engineering of data, not of code.
Re: (Score:2)
Once it has seen that original code in it's training, it's quite difficult to convince me that it didn't rely on that code in any way.
The problem with this is the question of learning vs. copying. From a copyright point of view, virtually all the great artists of history studied under someone else. Should the Sistine Chapel ceiling be attributed to Domenico Ghirlandaio instead of Michelangelo, simply because the latter learned and studied under the former? It will most definitely carry the former's influence, just as everything in life carries influences.
Re: (Score:2)
This really does not matter. Copyright protects only the expression of software. You are free to clone software with independent code.
Re: (Score:2)
If the problem is that the original is in the training data (and memorized) then you should be able to find recognizable parts in the output.
The point here is more, that you have really fast clean-room reverse engineering, that produces *different* code with the same functionality.
The next step to make it bullet proof would be to use two distinct models for writing spec and implementing spec.
The software world is changing, because automated code writing is becoming good. It will be interesting to see where things go.
Its about use, not where you got it (Score:3)
From their own website... They offer "Full legal indemnification* *Through our offshore subsidiary in a jurisdiction that doesn't recognize software copyright" So, apparently, problem solved ;-D
I don't think it works that way. If I'm using copyrighted software without a license, the source of that software does not change the fact that I am in violation. It's about use, not where you got it.
Re: (Score:2)
because it seems there's going to be a lot more IP infringement and it won't be limited to open source
It'll get real interesting when we start finding out that the actual Intellectual in all those Property claims is AI.
Re: Can AI clone lawyers & judges? (Score:2)
Re: (Score:2)
I certainly hope so! At least the lawyers!
If AI is here to replace human labor, lawyers are very, very ripe for cloning. What they do is mostly just filling in blanks and summarizing documents already. AI is really, really good at that kind of thing. And it would be a lot cheaper to use AI than to hire a lawyer.
Re: (Score:2)
I wonder if it's already happened. I can imagine a law firm creating an LLM based on the writings and court transcripts of a particular judge, and then using it to test out different strategies to see which is likely to be most effective with them.
Clean room? (Score:5, Insightful)
Even if you use an AI to extract an extremely condensed specification out of the source code, it's hardly clean room if the LLM was pre-trained on the source code anyway.
Re: (Score:3)
Re: Clean room? (Score:2)
With open source you can document a lot more than with closed source.
Re: (Score:2)
Assuming the hype is real, I wonder if it's possible to train these algorithms on machine code, or if they need the semantics and expressiveness of a HLL.
Re: (Score:2)
If you write the spec, it can. The point here is that the first step is to convert code into a well-written specification. With closed source, you need a human to write the specification (or a model working a lot longer, using screenshot tools and whatever; I bet we will see such systems in the future).
Re:Clean room? (Score:5, Interesting)
Even if you use an AI to extract an extremely condensed specification out of the source code, it's hardly clean room if the LLM was pre-trained on the source code anyway.
I once worked at a place that had a clean room process to create code compatible with a proprietary product. Anybody who had ever seen the original code or even loaded the original binary into a debugger was not allowed to write any code at all for the cloned product. The clone writers generally worked only off of the specifications and user documentation.
There were a handful of people who were allowed to debug the original to resolve a few questions about low-level compatibility. The only way they were allowed to communicate with the software writers was through written questions and answers that left a clear paper trail, and the answers had to be as terse as possible (usually just yes or no). Everyone knew that these memos were highly likely to be used as evidence in legal proceedings.
I highly doubt that any AI tech bros have ever been this rigorous, and I'd bet that most of these AIs have been trained on the exact same source code that they are cloning.
CS textbooks that offer snippets of Linux code (Score:2)
I'd bet that most of these AIs have been trained on the exact same source code that they are cloning.
That may have even happened indirectly. Consider a computer science textbook discussing some classic OS topic and it offers a snippet of Linux code as a sample implementation. I think such exposure would disqualify a person from working on a clean room implementation team.
Re: (Score:2)
Re: (Score:2)
It doesn't have to be "clean room." If the new code is distinctly different from the original, it would be extremely difficult to claim copyright infringement.
It's not illegal to make a Word Processor just because Word is copyrighted.
Re: (Score:3)
It's a question because the AI was trained on the original. Clean Room Design [wikipedia.org] is a well-known engineering technique. It's not a problem when cloning closed source if the AI wasn't trained on that closed source code.
Re: (Score:3)
It's not a problem when cloning closed source if the AI wasn't trained on that closed source code.
Seems like a good test then. If it can do that with closed source, there is no logical reason to assume it could not do the same with open source it had never seen, whether it actually had or not. Spin up another and do it again like that just to be sure if you must, but it's probably a done deal at that point.
Re: (Score:2)
Re: (Score:2)
My point is that all the big models are pretrained on any decent open source project to begin with. That other LLM too.
Software Cloning (Score:5, Interesting)
Can it clone proprietary software and turn it into an open source project?
If so, then I think the tradeoff is fair.
Re: (Score:2)
Maybe have it create and publish the source code of 64-bit Windows 7, w/ no allowances for any assembly language? Then port it to all non-x86 CPUs - RISC-V, Arm, and even legacy NT hardware like Alpha and MIPS
PowerPC (Score:2)
Maybe have it create and publish the source code of 64-bit Windows 7, w/ no allowances for any assembly language? Then port it to all non-x86 CPUs - RISC-V, Arm, and even legacy NT hardware like Alpha and MIPS
And PowerPC. Damn, PowerPC never gets any respect. ;-)
Re: (Score:2)
So begin the Obfuscated Object Code Compiler wars, to keep robots from writing machine language decompilers. The next few years are gonna be a wild ride!
Re: Software Cloning (Score:2)
Can it clone proprietary software and turn it into an open source project?
I think the answer is no if you don't have clean access to the proprietary software, e.g., if you decompile or reverse engineer it in violation of a license agreement that you agreed to. That taints the spec, which taints the clean room reimplementation. I think this also applies to leaked software - if you know it's someone's trade secret, but you use it anyway to create a competing product, you can be sued.
Trade secrets can be lost by leaks (Score:2)
I think this also applies to leaked software - if you know it's someone's trade secret, but you use it anyway to create a competing product, you can be sued.
It depends on whether you were the person who disclosed or used the trade secret in violation of some sort of non-disclosure or IP-maintenance agreement. However, once it is in the wild, other persons may use the info.
If a trade secret is disclosed then the IP protection is lost. The disclosure of a trade secret does not have to be intentional or authorized.
Re: (Score:2)
Can it clone proprietary software and turn it into an open source project?
If so, then I think the tradeoff is fair.
There is no tradeoff at all. This process takes Open Source code and turns it into unmaintainable gibberish. This same product also takes Proprietary software and turns it into unmaintainable gibberish.
Nothing was taken from Open Source, but something was taken from Proprietary.
You are further from a usable product if you use this on Open Source.
You are closer to a usable product if you use it on Proprietary software.
Not tested in court... (Score:5, Informative)
"legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems."
They can claim that it is legally distinct, but until they win the lawsuit, and the appeals that set a legal precedent, it is not safe to make that assumption.
Re: (Score:2)
Because it's AI, there's no copyright either. Honestly a fair trade, especially if AI can clone proprietary programs just as easily.
Re: (Score:2)
That's kind of a weird concept though - Can you remove a license from code by just passing it through an LLM?
Owner must prove its a derivative (Score:3)
"legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems."
They can claim that it is legally distinct, but until they win the lawsuit, and the appeals that set a legal precedent, it is not safe to make that assumption.
The above is subject to misinterpretation. The copyright owner must demonstrate it's a derivative and win in court. The owner must prove guilt; the publisher does not need to prove innocence.
Not limited to open source. (Score:4, Interesting)
Despite the open-source spin, source code is not required to do this, since source code can also be generated from binaries. It shouldn't be shocking by now to learn that you can fully automate breaking an executable down into functional source code, with AI added to "make sense" of the generated code. As such, even large, sophisticated, and complex programs are targets.
The real question is, who wants to deal with a massive amount of AI slop code?
Re: (Score:2, Insightful)
While true, legally it makes no difference whether you steal the sources or the binary. It is still stolen. And a clean-room implementation requires the code writers never to have seen the original in any form. You cannot have an engineer analyze the original and then write a copy. Hence it is immediately plausible that having an AI train on the original, or ingest it in a query, and then write a new version is not a clean-room clone (and only those are legal by default) at all.
I do agree on the slop.
Re: (Score:2)
While true, legally it makes no difference whether you steal the sources or the binary. It is still stolen.
Why would you steal it if you can simply license it?
And a clean-room implementation requires the code-writers to never have seen the original in any form. You cannot have an engineer analyze the original and then write a copy.
As the article explains, it's a clean room implementation because you use two different instances of an AI.
* AI 1 documents how the code works in a human readable descriptions. (i.e. does the reverse engineering)
* AI 2 constructs an entirely new codebase from the human readable descriptions in the documentation. (i.e. does the forward engineering)
Since AI 2 has never seen or analyzed the original code/binary and has only ever read the documentation about it, the claim is that this mirrors the traditional clean-room split.
Re: (Score:2)
You not only need to do a clean-room, you need to be able to prove it and that set-up does not allow it.
You sure about that? (Score:2)
You not only need to do a clean-room, you need to be able to prove it and that set-up does not allow it.
Actually, this setup would be more provable than having two people do it because you can literally record the entire process from start to finish.
Have two different computers with no hardware in common.
Computer 1 interprets the program and generates the documentation, saving it to a USB drive.
You unplug the USB drive and move it over to Computer 2.
Computer 2 reads the documentation and generates a new code base.
You can read the documentation and there was no other means of communication.
If you don't think a fully recorded, auditable process like that is provable, I'm not sure what would convince you.
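The provability argument above could be backed by something as simple as hashing every artifact that crosses the information barrier. This is my own illustrative sketch, not anything malus.sh claims to do; the idea is that a tamper-evident log lets you later show that only the spec, never the original code, reached the second machine:

```python
# Hypothetical audit trail for a clean-room handoff: record a SHA-256
# digest of each artifact that crosses the information barrier, so the
# transfer can be verified after the fact.
import hashlib
import json

def log_transfer(label: str, payload: bytes, trail: list) -> str:
    digest = hashlib.sha256(payload).hexdigest()
    trail.append({"artifact": label, "sha256": digest})
    return digest

trail = []
spec = b"gcd(a, b): returns the greatest common divisor of a and b"
log_transfer("spec-v1", spec, trail)

# The trail (not the spec itself) is what you'd preserve as evidence.
print(json.dumps(trail, indent=2))
```

Whether a court would accept such a log as proof of separation is, of course, an open question; traditional clean rooms relied on human attestations for exactly this reason.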
Re: (Score:2)
You cannot do this with two people. Not possible. You need at the very least three (but in practice a lot more), and the middle one is the information barrier. The middle team makes sure no details about the implementation leak; it is under oath, likely needs to be more technologically competent than both the analysis and the implementation teams, and also needs to be legally competent.
And that is why you cannot automate this process.
Please start w/ ReactOS (Score:3)
That way, that OS might at least become a stable, working one, instead of staying in the alpha stage it's been in for 30 years.
Re:Please start w/ ReactOS (Score:5, Insightful)
Re: (Score:2)
Windows 95? Garbage. 98, especially SE? Just fine. Windows Millennium was hot trash, and XP, especially Service Pack 2, just fine. Vista garbage, 7 just fine. 8 was a dumpster fire. 10 was okay. And now 11 is right back where we started.
Although honestly, I don't know if we will ever get a working Windows 12. Microsoft has very, very little competition anymore. Basically just Apple, and there's a laundry list of reasons why that's a problem.
I Always Could Do That. (Score:4, Funny)
git clone https://github.com/YourStuff/A... [github.com]
But, when I put my name on it everybody gets all pissy.
Re: (Score:2)
But, when I put my name on it everybody gets all pissy.
Yeah, no idea why that happens...
No (Score:3)
No attribution. No copyleft. No problems.
No copyright either if it's AI generated. So, no "corporate-friendly licensing".
Re: (Score:2)
Never have seen OG Source Code is a pre-requisite (Score:5, Insightful)
for clean room implementations.
If the AI model was trained using the OG software project that is being replicated, they are screwed.
That should be very easy to see: in the discovery phase, just ask for a list of all the software that was used to train the AI model. It's a yes/no answer: if the AI saw the OG software, then there was no clean room. The room was dirty. Very, very dirty.
Re: (Score:2)
Re: (Score:2)
Indeed. The beauty of this April Fools' joke is that it has a high level of credibility ... to the stupid.
Re: (Score:2)
It seems like this is a problem easily rectified -- someone could prepare a model where any code with a viral license is filtered from the training set.
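A crude version of the filtering this commenter proposes might just scan file headers for copyleft license markers before adding them to a training corpus. This is a naive sketch of my own; real license detection (full-text matching against the SPDX license list, scancode-style tooling) is far more involved:

```python
# Naive copyleft filter for a training corpus, as proposed above:
# drop any file whose header mentions a known copyleft license marker.
COPYLEFT_MARKERS = (
    "GNU General Public License",
    "GNU Lesser General Public License",
    "GNU Affero",
    "SPDX-License-Identifier: GPL",
    "SPDX-License-Identifier: LGPL",
    "SPDX-License-Identifier: AGPL",
)

def is_copyleft(source: str) -> bool:
    header = source[:4000]  # license headers sit near the top of a file
    return any(marker in header for marker in COPYLEFT_MARKERS)

files = {
    "a.c": "/* SPDX-License-Identifier: GPL-2.0 */ int main(){}",
    "b.c": "/* SPDX-License-Identifier: MIT */ int main(){}",
}
kept = [name for name, src in files.items() if not is_copyleft(src)]
print(kept)  # -> ['b.c']
```

Note that this only catches files that declare their license; copied GPL code with the header stripped would sail straight through, which is part of why "trained only on permissive code" is hard to guarantee.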
Clone propriety software too? (Score:3)
If the AI can clone free software, then it should be able to clone non-free software. The real question is whether we should bother copyrighting any software if it can be so easily duplicated.
Nobody is going to copyright my voice singing - there are so many other better singers.
If software becomes that easy to create, then it loses its value.
Hm - perhaps someone should clone all the software we install on items we buy that comes with licenses that prevent repair. We own the hardware and we usually hate the software that comes on smart appliances. A cheap replacement for it may screw with their illegal and unethical attempts to control what you do with stuff you own and they do not.
Re: (Score:2)
Re: (Score:2)
If it is using the source to clone, it is not clean room clone and thus violates copyright.
Huh? (Score:2)
Re: (Score:2)
Not at all. If somebody actually took this seriously, they would likely be in a world of legal hurt. But remember the date...
Well since AI created code can't be copyrighted... (Score:2)
I guess that the "proprietary code" isn't so proprietary at all.
Is it really? (Score:3)
So it's slow as fuck, with memory leaks, impossible to maintain, lacking comments, nasty race conditions, 10 times bigger than the original, uses 10 times the memory, freezes trying to open files... you know, the coding stuff.
Let me know when we can see some head-to-head QA. Hey, maybe we are there. But I've not seen anything more than vague "proofs of concept." I still want to see AI produce microcode for a new, undocumented chip/board. Do you read the API to it like a nursery rhyme?
Or to put it another way, if it relies on samples of code to exploit, how is it going to produce NEW code?
Re: (Score:2)
You haven't actually seen AI-generated code, have you?
These days, AI generates code that is readable and has meaningful comments (well, as meaningful as most comments written by human coders anyway). AI tends to be good about properly structuring code to eliminate memory leaks. The code isn't necessarily blazing fast, but it seems to compare with typical human-written code. And if you ask it to improve the performance, it often can do so successfully.
The AI code you are describing is so 2025.
Re: Is it really? (Score:2)
Ignorance is a state of bliss for you.
AI readily generates code that exceeds Jr level programmer code by a wide margin.
It can also produce utter garbage.
But it is in no way equivalent to monkeys on a keyboard. Its success rate is significantly higher than that.
On the flip side, AI can make properietary OSS (Score:2)
For all the open source projects that were turned into commercial versions by introducing proprietary elements, it seems AI can be used to replicate the proprietary components back as open source.
AI Lawlessness must stop! (Score:2)
Github (Score:2)
Well... (Score:3)
Well, Hell, *I* can clone OSS in seconds via a pull. Jeebus. AI blah blah blah AI staff cuts blah blah blah paradigm shift....yawn.
Only a software person would name it that (Score:2)
Re: (Score:2)
If they write it on April 1st? Sure.
obviously not clean-room (Score:2)
"Clean-room" means you have one group of engineers study existing code and create a specification, and then another group of engineers takes that specification and writes new code that does what the original code did. This works because copyright protects expression, not ideas, and because independent creation of the same expression is not infringement either.
If you have the same person reading the old code and writing the new code, then to whatever extent the expression is similar, there is no protection under copyright's independent-creation doctrine.
Re: (Score:2)
No, this is definitely not "clean room". An actual clean room clone requires a very competent analysis team that writes a spec. A ton of legal people that verify the spec does not contain descriptions of the original code and that can attest so under oath. And an implementation team that has never seen the original code and only gets said spec. It is a huge and very fragile undertaking. An AI that may have seen the code does not cut it in any way.
But remember the date. This is a really good April Fools' joke.
So can it clone Word or Excel? (Score:2)
If it can clone open source then it should be able to clone closed source applications. Unless it's just taking the existing code and re-formatting it.
Re: So can it clone Word or Excel? (Score:2)
There is this thing called reverse engineering. I have found AI to be surprisingly helpful at it with binary payloads from various devices. I'm sure it could reverse engineer compiled binaries, too. Whether that's legal is another issue. But when has the law ever stopped AI companies?
Some small problems (Score:3)
And that is what makes this satirical: the result has no new copyright whatsoever, and it only gives the appearance of working. It is also unclear whether it is actually legal to do, or whether it may remain partially or fully under the original copyright and ownership, due to the model probably having been trained on the original OSS code.
As some people will probably take this seriously, it bears pointing out that this is a technological and legal nightmare. It is a very cool satirical project though.
AI generated = not copyrightable (Score:3)
If the clone is AI-generated, I don't think it can be copyrighted, based on [Thaler v. Perlmutter, 2023]. Calling the clone "proprietary" is a slight misstatement. It could maybe be protected as a trade secret, but I don't think it's copyrightable, based on what courts have ruled so far.
Clean room is a legal strategy, not a requirement (Score:2)
Clean room design is a legal strategy, but it is not a legal requirement. There are other methods that can be used for creating works not considered to be derivative works.
Also a reminder, words used in law don't have the same meaning in language. The law usually narrows the meaning explicitly, or implicitly via case law.
GNU has used this to their advantage to clone most of the shell runtime utilities, so why shouldn't the same be used to replace GNU licensed code?
Re: (Score:2)
We're talking about what's technologically possible. AI can easily ignore the license and do whatever it likes
Re: (Score:3)
Does this mean if an AI ingests any GPL-related code at all, and we can somehow prove it, that all code generated by it MUST be licensed GPL? Afaik, any use of (non-L)GPL code in your project requires you to open the code you add to it as GPL as well. The logical conclusion would be that any AI ingesting a piece of GPL code and generating (vibing or otherwise) code, without providing proper references so you can prove no contamination, means that code MUST be GPL'd when products of that code are publicly distributed.
Re: (Score:2)
Afaik, any use of (non-L)GPL code in your project requires you to open the code you add to it as GPL as well
This is only true if the resulting project is distributed. Software-as-a-Service that contains GPL'd code is not required to be GPL'd.
Re:really? (Score:5, Interesting)
If a computer program ingests code (whether GPL or not) and then outputs some code, the big question is whether or not the resulting code is a derived work.
If it's not a derived work, then the license of the original code is irrelevant, and it doesn't matter if it's GPLed, fully proprietary, or somewhere in between. The license has no say in the matter, because nobody ever needs to agree to the license; whatever they're doing is legal under copyright law so they already had all the permission they needed, without ever needing the additional rights granted by a license.
If it is a derived work, then that's copyright infringement unless the person who does it has permission. And the only way to get permission (i.e. cause copyright infringement to have not happened) is to agree to the license. So yes, the output would have to be GPLed.
But I don't think we really know whether or not robots reading code and then writing code from what they "learned," are creating derived works. Ask again in a few years, after a few court cases. This is hard. Rational people can disagree and come up with pretty good arguments no matter what side they're on. We'll see what the courts decide.
I think the most interesting case for determining it, won't involve a GPLed input. It'll be if Anthropic sues this project [github.com], since they will have contributed arguments to both sides. They'll have to argue "it is a derived work" in court, but to all their customers, they have and will continue to preach "it's not a derived work."
Re: (Score:2)
If a computer program ingests code (whether GPL or not) and then outputs some code, the big question is whether or not the resulting code is a derived work.
Not really, that is what a compiler does. It is a derivative.
To legally clone something you generally don't go near the original code. You do a clean room design where someone writes a spec defining functionality and technical details like file formats. Then a second person writes new code that implements that spec. This second person is not involved in the research of the first person, nor connected to the original source code in any way.
Re: (Score:2)
That's generally how it's being done. The robot reads the code and writes specs. Then another robot reads the specs and writes code. If courts still accept the traditional clean room defense (and why wouldn't they?) then they're probably going to say it isn't a derived work.
It looks like the big catch, the actual source of uncertainty, is that the instance of the robot that reads the specs and writes code may have seen the original code as part of its training data. That'll be enough to keep it from being a genuinely clean room.
Re: (Score:3)
Re: really? (Score:2)
It seems to me if one AI writes documentation and then another that never saw the code in question writes code it'd be legit.
How is that illegal?
Re: really? (Score:2)
The training data used to build the second AI probably includes every scrap of source code available, both legally and illegally and may well include several copies and versions of the open source program it is later asked to build 'just from the specs provided by the first AI'.