Linux lays down the law on AI coding

The open-source community’s long-running identity crisis over artificial intelligence has received a much-needed dose of pragmatism. This week, the Linux kernel community finally adopted a formal, project-wide policy that explicitly allows AI-assisted code contributions, provided developers adhere to strict new disclosure requirements. Under the new rules, AI agents cannot use the legally binding “Signed-off-by” tag and must instead be credited with a new “Assisted-by” tag for transparency. Crucially, the policy places legal responsibility for every line of AI-generated code, along with any resulting defects or security issues, squarely on the shoulders of the human who submits it.
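In practice, these tags live in the trailer section of a kernel patch’s commit message. The sketch below is a hypothetical illustration of how the disclosure might look (the contributor name, email, and tool name are invented for the example; the exact tag value format is an assumption, as the policy text quoted here only names the tags themselves):

```
mm/slub: fix off-by-one in freelist sizing

<patch description here>

Signed-off-by: Jane Developer <jane@example.com>
Assisted-by: <name of AI coding tool>
```

The human’s Signed-off-by remains the legally meaningful attestation under the Developer Certificate of Origin; the Assisted-by line is purely informational, telling reviewers that a tool contributed to the change.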

The move follows a turbulent few months in the open-source world, concluding a contentious argument that came to a head in January when Intel’s Dave Hansen and Oracle’s Lorenzo Stoakes clashed over how forcefully the kernel should police AI tooling. Linus Torvalds, in his customary forthright style, eventually shut the disagreement down, dismissing talk of outright bans as “pointless posturing.”

Torvalds’ viewpoint, which serves as the intellectual foundation of the new policy, is straightforward: AI is just another tool. Bad actors contributing garbage code are unlikely to read the documentation anyway, so the kernel should focus on holding human developers accountable rather than attempting to control the software they run on their local workstations. It’s a pragmatic, realistic approach, especially given the fear that has gripped other corners of the open-source ecosystem.

Until now, major projects have taken very different approaches to the AI question. Over the last two years, the Gentoo Linux distribution and the venerable NetBSD operating system have both moved to ban AI-generated submissions outright. NetBSD maintainers notably classified LLM output as legally “tainted” due to the ambiguous copyright status of the models’ training material.

The Developer Certificate of Origin (DCO) is at the heart of this anxiety. As Red Hat noted in a thorough review late last year, the DCO requires contributors to legally attest that they have the right to submit their code. Because LLMs are trained on vast datasets of open-source code, much of it under licenses with strict conditions such as the GNU General Public License, developers using Copilot or ChatGPT cannot truly vouch for the provenance of what they submit. Red Hat warned that this could inadvertently breach open-source licensing and undermine the DCO framework entirely.

Aside from the legal issues, project maintainers have struggled to keep up with sheer volume. The open-source world is currently submerged in what the community has dubbed “AI slop.” The creator of cURL had to cancel bug bounties after being inundated with hallucinated reports, whiteboard tool tldraw began automatically rejecting external PRs in self-defense, and projects like Node.js and OCaml have seen sprawling, >10,000-line AI-generated changes provoke existential arguments among maintainers.

The cultural friction over undisclosed AI code has proven even more explosive. Late last year, Sasha Levin, an NVIDIA engineer and kernel maintainer, drew widespread criticism after it emerged that he had submitted a patch to kernel 6.15 written entirely by an LLM, changelog included, without disclosing it. The code worked, but it introduced a performance regression that slipped through review and testing. The community was outraged by the idea of developers putting their names on sophisticated code they didn’t write, and Torvalds conceded that the patch had not been properly evaluated, in part because it was never flagged as AI-generated.

The Linux kernel isn’t the only community dealing with the consequences of unreported AI assistance. In gaming, the famed (and still very much alive) Doom modding community fractured last year when Christoph “Graf Zahl” Oelckers, the long-time lead developer of the mega-popular GZDoom source port, was found to be quietly shipping AI-generated patches. When community members chastised him for the lack of transparency, Oelckers responded dismissively, telling them to “feel free to fork the project.” The community called his bluff: the vast majority of GZDoom contributors migrated to a new fork, the UZDoom source port.

The GZDoom incident and the Sasha Levin outcry demonstrate why the Linux kernel’s new policy matters. Most of the developer community is less angry about the use of AI than about the dishonesty surrounding it. By requiring an Assisted-by tag and enforcing strict human accountability, the kernel attempts to take the emotion out of the debate. Torvalds and the maintainers are accepting reality: developers will use AI tools to code faster, and banning them would be about as effective as banning a particular brand of keyboard.

In short: if the code is good, it’s good. If the kernel breaks because of hallucinated AI slop, the human who pressed “submit” will have to answer to Linus Torvalds. That’s about the strongest deterrent the open-source world has to offer.
