---
layout: post
title: "AI Clones and the Explosion of Libre Software"
date: 2026-04-07
---

What happens when the economic cost of recreating software drops from hundreds of
thousands of dollars to less than a hundred? Now that AI can clone software cheaply, Cloudflare
rewrote Next.js, Anthropic built a C compiler from scratch, a library maintainer
relicensed his project through an AI rewrite, and Claude Code's source leaked and
got cloned overnight. Each of these is a different problem, but together they
paint a picture that I think the Free Software Foundation should be paying more
attention to.

## Cloudflare vs. Next.js

Cloudflare published
[Vinext](https://blog.cloudflare.com/vinext/), a drop-in
Next.js replacement built on Vite, written by one engineer using AI in one
week. Vercel's CEO called it [intellectually
dishonest](https://x.com/rauchg/status/1893752804211134700). Security
researchers found dozens of vulnerabilities within days.

I think this was the precursor to the [SaaSpocalypse of February
2026](https://www.cnbc.com/2026/02/04/software-stocks-plunge-us-ai-disruption.html),
when ~$285 billion was wiped from software stocks in a single day. People lost
faith in software companies without realizing that it's not just about the code.
It's also about the data, the ecosystem, the users, and the stickiness of human and business habits.

The [IETF](https://datatracker.ietf.org/doc/html/rfc7282) has this notion of
"rough consensus and running code." Running code is the part that
AIs can only approximate through iteration: build, test, fix, repeat. If code works
one time, it might be a coincidence. When it runs a thousand times in a hundred
different environments with different library versions, that makes
software robust. That's why Next.js exists in the first place. It's
pervasive, everybody uses it, and that gives it resilience. Even if Cloudflare
can copy all of Next.js into a new codebase, that doesn't mean anybody is going
to use it, because the running code that matters is the code that has survived
contact with the real world.

## Claude's C Compiler

On the other end of the spectrum, Nicholas Carlini at Anthropic published
["Building a C compiler in 2,000
sessions"](https://www.anthropic.com/engineering/building-c-compiler): 16
parallel Claude agents wrote a [100,000-line C compiler in
Rust](https://github.com/anthropics/claudes-c-compiler) that can compile a
bootable Linux kernel, QEMU, FFmpeg, SQLite, Postgres, Redis, and Doom. It
passes 99% of GCC's torture tests. Along the way, the process surfaced real bugs
in existing code through delta debugging, with GCC as the oracle.
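The delta-debugging loop behind that last claim is worth unpacking. The sketch below is not Anthropic's actual harness, just the classic `ddmin` algorithm: given a failing input (say, a C file the new compiler miscompiles while GCC handles it correctly), repeatedly drop chunks of it and keep any reduction for which the oracle still reports a failure, until no single chunk can be removed.

```python
def ddmin(still_fails, items):
    """Minimal ddmin sketch: shrink `items` to a small subset for which
    `still_fails` (the oracle, e.g. "the two compilers disagree") is True."""
    n = 2  # number of chunks to split into
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        for i in range(len(subsets)):
            # Try removing one chunk at a time, keeping the complement.
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if still_fails(complement):
                items, n = complement, max(n - 1, 2)
                break
        else:
            if n >= len(items):
                break  # nothing removable at the finest granularity
            n = min(n * 2, len(items))  # split finer and retry
    return items

# Toy oracle: the "bug" reproduces whenever both of these lines are present.
lines = ["int x=1;", "float f;", "int y=0;", "goto l;", "l: x/=y;"]
minimal = ddmin(lambda ls: "goto l;" in ls and "l: x/=y;" in ls, lines)
# minimal is ["goto l;", "l: x/=y;"]
```

In the compiler setting, `items` would be the lines of a failing C program and the oracle would compile and run it under both compilers, reporting True when their outputs diverge.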

## The Chardet Problem

Maybe inspired by what Claude did with the compiler, the maintainer of
[chardet](https://github.com/chardet/chardet) --- the Python character encoding
detection library --- used Claude to [rewrite its code from
scratch](https://simonwillison.net/2026/Mar/5/chardet/) in five days, passing
the same test suite, and relicensing from LGPL to MIT.
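For context on what the library actually does: given raw bytes of unknown provenance, it guesses the character encoding. Here is a toy sketch of the idea (an illustration only, not chardet's actual algorithm, which layers statistical per-encoding models on top of checks like these):

```python
def sniff_encoding(data: bytes) -> str:
    """Toy chardet-style sniffing: byte-order marks first, then a strict
    UTF-8 decode attempt, then a permissive single-byte fallback."""
    for bom, name in [(b"\xef\xbb\xbf", "utf-8-sig"),
                      (b"\xff\xfe", "utf-16-le"),
                      (b"\xfe\xff", "utf-16-be")]:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
        return "utf-8" if any(b > 0x7F for b in data) else "ascii"
    except UnicodeDecodeError:
        # Real detectors score byte-frequency models per encoding here.
        return "latin-1"

sniff_encoding("café".encode("utf-8"))  # "utf-8"
sniff_encoding(b"plain ascii")          # "ascii"
```

Getting from this toy to something that passes chardet's full test suite across dozens of encodings is the part that used to take months, and took Claude five days.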

The thing is, this maintainer is not the original author. He has been
maintaining the library for 13 years, and from the very start of his tenure, he
saw demand from downstream adopters for a version without the
[LGPL](https://www.gnu.org/licenses/lgpl-3.0.en.html) license. The LGPL
(Lesser GPL) is a weaker form of copyleft: it requires that modifications to the
library itself stay LGPL, but allows proprietary software to link against it.
It's not the strictest license out there, but it was enough to cause problems.
Guido van Rossum considered chardet for inclusion in the Python standard library
in 2015, but the LGPL was incompatible with the PSF's licensing requirements.
The Apache Software Foundation flagged it too.

It's worth distinguishing this from the [Affero
GPL](https://www.gnu.org/licenses/agpl-3.0.en.html), which is the real nuclear
option in copyleft licensing. The AGPL extends the [four
freedoms](https://www.gnu.org/philosophy/free-sw.en.html) to server-side
software: if you modify the code and serve a system with it, the people you
serve should also be able to see the source code. That's the license Google and
Microsoft won't touch with a ten-foot pole, because it would require sharing their
entire codebase. The LGPL is milder, but even mild copyleft creates friction
when organizations have blanket policies against any GPL-family license.

So the chardet maintainer has been getting requests for years to switch to a
permissive license. He filed an issue trying to do so more than a decade ago.

The rewrite is interesting because it is genuinely new code: not a reproduction,
but a whole new take that uses some ML models internally. It passes the same
tests, has the same name (probably not the best move in terms of optics), but
the implementation is completely different. The original author [objected to the
relicensing](https://www.theregister.com/2026/03/06/ai_kills_software_licensing/),
and the term "slopfork" was coined for AI rewrites that shed licensing
obligations.

There is a caveat: because chardet has been open source, it was probably in
the training data. So Claude might have known what it was rewriting. This is not
a completely clean room implementation. The code itself tells a different story:
analyses of its complexity and function names make it clear that it really has
nothing to do with the original. Also, "clean room" is a defense against one
specific legal argument; something can infringe copyright regardless of the
"clean room" test, similar to how falling outside the Howey test doesn't mean
your product is not a security. IANAL, but since the new chardet behaves the
same while the implementation is so different, it's more in line with [Oracle v.
Google](https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_Inc.), where
the Supreme Court ruled that Google's reimplementation of the Java API was fair use.

## Claude Code and Claw Code

This "clean room rewrite" argument was also made by the person who built [Claw
Code](https://cybernews.com/tech/claude-code-leak-spawns-fastest-github-repo/),
a clone of Claude Code. When Anthropic [accidentally shipped a source
map](https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai) in an
npm package, exposing ~512,000 lines of TypeScript, this developer took the
leaked JavaScript, rewrote it in Python, then rewrote the Python in Rust,
claiming that the Rust version is not a copy of the JavaScript implementation.
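Why a stray source map is so damaging: modern bundlers can embed the entire original source tree in the `.map` file via the `sourcesContent` field, so anyone who finds the file gets the pre-minification code back verbatim. A sketch with hypothetical file names and contents (not the actual leaked material):

```python
import json

# Hypothetical source map, shaped like the version-3 maps bundlers emit.
# When `sourcesContent` is included, the original code ships inside it.
source_map = json.dumps({
    "version": 3,
    "file": "cli.min.js",
    "sources": ["src/agent.ts", "src/prompt.ts"],
    "sourcesContent": [
        "export const run = () => { /* original TypeScript */ };",
        "export const SYSTEM_PROMPT = '...';",
    ],
    "mappings": "AAAA",
})

def extract_sources(map_text: str) -> dict:
    """Recover original file paths and their contents from a source map."""
    m = json.loads(map_text)
    return dict(zip(m.get("sources", []), m.get("sourcesContent") or []))

recovered = extract_sources(source_map)
# recovered maps "src/agent.ts" and "src/prompt.ts" to their original code
```

No reverse engineering is involved at all: the original sources are sitting in a JSON file next to the minified bundle, one `npm install` away.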

But this time the argument completely misses the point. Copying the code in the
first place already disqualifies it as a clean room effort, regardless of how
many intermediate languages you bounce it through.

The key questions, though, are whether this person should have the freedom to do this, whether it's a good thing for society, and whether the law will adapt its concept of "software."

## The Dream of Free Software

Cloudflare showed that a single engineer can
rewrite Next.js in a week. The chardet developer showed that a maintainer can
reimplement their own library in five days. We can get a new C compiler for a few thousand dollars. Kyle Daigle, GitHub's COO, stated on X that there were 1 billion commits on GitHub in 2025, and that the current rate is 275 million per month (roughly a 3x increase in four months).

A lot of people on the internet are cloning complex codebases.

I think this is a dream for free software. Free software in its original sense,
with all four freedoms: the freedom to run the software as you wish, the freedom
to inspect and see how it works, the freedom to modify it, and the freedom to
distribute what you make.

What we are seeing is an explosion of these freedoms. The ability to inspect a
piece of source code and rewrite it, or launch a new version, is not something
that has been a given in the past 15 years. The last serious effort to build a
browser from scratch was [Ladybird](https://ladybird.org/), and before that,
probably Chromium forking from WebKit. Source code was hard to make. It was hard
to understand. It was easy to distribute precisely because it was so hard to
make: when someone published a library, everybody was like "oh, thanks, I
definitely wanted this but couldn't find the time or the money to do it myself."
You'd take it even if it wasn't exactly what you wanted, because it was much
better than nothing (and spending hours implementing it).

Now with AI, we enter a version of the world where all of this changes. If
anything, we have the problem of not enough attention to review what the
LLMs are generating. We can inspect how software works, but we'll be
using our agents to do that, not our own eyes.

The Free Software Foundation seems to be taking a hostile attitude
against AI-generated source code. That's understandable, because the flood of
low-quality contributions is making life harder for open source maintainers.

But they're missing the point: copyleft licenses, and the use of copyright to enforce freedom, were a hack (a beautiful, brilliant hack), but they were never the goal. Where we are heading, de facto, is software that is free as in freedom and free as in beer: libre software.
