> All code must be compatible with GPL-2.0-only How can you guarantee that will ...

philipov · 2026-04-10T20:21:08 1775852468

You take responsibility. That means if the AI messes up, you get punished. No pushing blame onto the stupid computer. If you're not comfortable with that, don't use the AI.

sarchertech · 2026-04-10T20:25:30 1775852730

There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

The whole “use it, but if it behaves as expected, it’s your fault” stance is ridiculous.

philipov · 2026-04-10T20:28:44 1775852924

If you think it's an unacceptable risk to use a tool you can't trust when your own head is on the line, you're right, and you shouldn't use it. You don't have to guarantee anything. You just have to accept punishment.

sarchertech · 2026-04-10T20:38:49 1775853529

That’s just it, though: it’s not just your head. The liability could very likely also fall on the Linux foundation.

You can’t say “you can do this thing that we know will cause problems that you have no way to mitigate, but if it does we’re not liable”. The infringement was a foreseeable consequence of the policy.

philipov · 2026-04-10T21:59:43 1775858383

This policy effectively punts on the question of what tools were used to create the contribution, and states that regardless of how the code was made, only humans may be considered authors.

From the foundation's point of view, humans are just as capable of submitting infringing code as AI is. If your argument is sound, then how can Linux accept contributors at all?

EDIT: To answer my own question:

    Instead of a signed legal contract, a DCO is an affirmation that a certain person confirms that it is (s)he who holds legal liability for the act of sending of the code, that makes it easier to shift liability to the sender of the code in the case of any legal litigation, which serves as a deterrent of sending any code that can cause legal issues.

This is how the Foundation protects itself, and the policy is that a contribution must have a human as the person who will accept the liability if the foundation comes under fire. The effectiveness of this policy (or not) doesn't depend on how the code was created.

sarchertech · 2026-04-11T00:57:13 1775869033

Anyone distributing copyrighted material can be liable; that DCO isn’t going to stop anyone.

If that worked, any corporation that wanted to use code it legally couldn’t use could just take a fork from someone who assumed responsibility, and worst case they’d have to stop using it if someone found out.

testing22321 · 2026-04-11T03:20:36 1775877636

> liability could very likely also fall on the Linux foundation.

It’s just the same as if I copy-paste proprietary code into the kernel and lie about it being GPL.

Is the Linux foundation liable there?

sarchertech · 2026-04-11T18:39:45 1775932785

Maybe. DCOs haven’t been tested. But you can at least say that the person who did this committed fraud and that you had no reasonable way to know they would do that.

LLMs can and do regurgitate code without the user’s knowledge. That’s the problem: the user has no way to mitigate it. You’re telling contributors “use this thing that has a random chance of creating infringing code”. You should have foreseen that would result in infringing code making its way into the kernel.

testing22321 · 2026-04-11T20:13:55 1775938435

If someone sent you some code and said “it’s all good bro, you can put it in the kernel with your name on it”, would you?

If you don’t feel comfortable about where some code has come from, don’t sign your name.

The fact LLMs exist and can generate code doesn’t change how you would behave and sign your name to guarantee something.

sarchertech · 2026-04-12T10:48:11 1775990891

Are you being purposely obtuse?

testing22321 · 2026-04-12T20:02:36 1776024156

Not at all.

Linus and the rules have always been very clear. If you don’t know where code came from, don’t submit it.

sarchertech · 2026-04-13T20:15:14 1776111314

That’s like a speed limit sign that says “whatever speed you think is reasonable” but in small print “as long as that doesn’t exceed 45mph”.

Yes, it’s technically correct, but it won’t hold up in court, and it’s a ridiculous statement.

What Linus’ statement is actually saying is: we want to benefit from AI tooling, but we don’t want to accept any liability.

empath75 · 2026-04-10T22:55:08 1775861708

The only lawsuits so far have been over training on open source software. You're inventing a liability problem that essentially does not exist.

sarchertech · 2026-04-11T00:55:51 1775868951

OpenAI and Anthropic added an indemnity clause to their enterprise contracts specifically to cover this scenario because companies wouldn’t adopt otherwise.

streetfighter64 · 2026-04-10T21:06:00 1775855160

Yeah, but that's not a useful thing to do because not everybody thinks about that or considers it a problem. If somebody's careless and contributes copyrighted code, that's a problem for linux too, not only the author.

For comparison, you wouldn't say, "you're free to use a pair of dice to decide what material to build the bridge out of, as long as you take responsibility if it falls down", because then of course somebody would be careless enough to build a bridge that falls down.

Preventing the problem from the beginning is better than ensuring you have somebody to blame for the problem when it happens.

philipov · 2026-04-10T22:22:22 1775859742

It was already necessary to solve the problem of humans contributing infringing code. It was solved by having contributors assume liability with a DCO. The policy being discussed today asserts that, because AI may not be held legally liable for its contributions, AI may not sign a DCO. A human signature is required. This puts the situation back to what it was with human contributors. What you are proposing goes beyond maintaining the status quo.

sarchertech · 2026-04-11T01:01:18 1775869278

It’s not solved. It hasn’t been tested in court to my knowledge and in my opinion is unlikely to hold up to serious challenge. You can be held liable for just distributing copyrighted code even if the whole “the Linux foundation doesn’t own anything” holds up.

jcelerier · 2026-04-11T09:32:21 1775899941

> Preventing the problem from the beginning is better than ensuring you have somebody to blame for the problem when it happens.

that's assuming that the problems and incentives are the same for everyone. Someone whose uncle happens to own a bridge repair company would absolutely be incentivized to say

> "you're free to use a pair of dice to decide what material to build the bridge out of, as long as you take responsibility if it falls down"

streetfighter64 · 2026-04-18T10:35:59 1776508559

Sorry, what's that got to do with anything? Who's the uncle supposed to be here in your analogy? The copyright owner? And so your hypothesis is that somebody's pushing AI in order to sneak copyrighted code into linux in order to sue them later? Seems very far fetched, and besides, why would I care about their incentives? Why would the linux foundation be interested in allowing that to happen?

adikso · 2026-04-10T21:42:32 1775857352

Their position is probably that LLM technology itself does not require training on code with incompatible licenses, and they probably also tend to avoid engaging in the philosophical debate over whether LLM-generated output is a derivative copy or an original creation (like how humans produce similar code without copying after being exposed to code). I think that even if they view it as derivative, they're being pragmatic - they don't want to block LLM use across the board, since in principle you can train on properly licensed, GPL-compatible data.

SV_BubbleTime · 2026-04-11T20:09:30 1775938170

>There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

I guess we’ll need to reevaluate what copyrights mean when derivatives grow on trees?

benatkin · 2026-04-11T06:04:45 1775887485

If they merge it in despite it having the model version in the commit, then they're arguably taking a position on it too - that it's fine to use code from an AI that was trained like that.

newsoftheday · 2026-04-10T20:45:45 1775853945

> That means if the AI messes up

I'm not talking about maintainability or reliability. I'm talking about legal culpability.

tmp10423288442 · 2026-04-10T20:19:36 1775852376

Wait for court cases I suppose - not really Linus Torvalds' job to guess how they'll rule on the copyright of mere training. Presumably having your AI actually consult codebases with incompatible licenses at runtime is more risky.

XYen0n · 2026-04-11T08:35:31 1775896531

Even human developers are unlikely to have only ever seen GPL-2.0-only code.

tmalsburg2 · 2026-04-11T10:18:24 1775902704

Humans will not regurgitate longer segments of code verbatim. Even if we wanted to, we couldn’t do it because our memory doesn’t work that way. An LLM, on the other hand, can totally do that, and there’s nothing you can do to prevent it.

johanyc · 2026-04-11T14:35:46 1775918146

LLMs can, but do they? Is there any evidence that they spit out a piece of code verbatim without being explicitly prompted to do so? In NYT v. OpenAI, for example, NYT intentionally crafted prompts to circumvent OpenAI's guardrails and make it show NYT articles.

Luker88 · 2026-04-11T08:20:17 1775895617

NIT: All AI code satisfies the GPL license.

Anything generated by an AI is public domain. You can include public domain in your GPL code.

I would urge some stronger requirement with the help of a lawyer. You only need a comment like "completely coded by AI, but 100% reviewed by me" to make that code's license worthless.

The only AI-generated parts that are copyrightable are the ones modified by a human.

I am afraid that this "waters down" the actual licensed code.

...We should start opening issues on "100% vibecoded" projects for relicensing to public domain to raise some awareness to the issue.

manquer · 2026-04-11T14:58:10 1775919490

> Anything new generated by an AI is public domain[1]

Language models do generate, character for character, existing code on which they were trained. The training corpus usually contains code which is only source-available but not FOSS-licensed.

Generated does not automatically mean novel or new, which is the bar needed for IP.

[1] Even this has not been definitively ruled on in courts or codified in IP law and treaties yet.