A statistical study of PRs opened on openclaw/openclaw

PRs will require sender reputation

PR spam today looks like email spam in the early 2000s.

When I first looked at the OpenClaw data, the pattern reminded me of email. In 2000, the ILOVEYOU worm reportedly infected 45 million computers in 24 hours because the cost of sending email approached zero and people trusted the platform. Inboxes filled with far more mail than before, and some of it was malicious. The same two conditions, near-zero cost and default trust in the platform, apply to PRs today.

The first fixes are similar: blocklists to manage volume, and confidence-based filters and reputation infrastructure to catch bad actors. Today, whether your email reaches its recipient's inbox comes down to two things: who you are, and your sending history.

Contributors on OpenClaw are already being filtered by their reputation: 8.2% merge rate for first-timers, 10.3% for contributors with 2-5 PRs, 18.6% for 5+.
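For anyone who wants to run the same cut on another repository, here is a minimal sketch of how merge rate by contributor experience could be computed. The function names, the `(author, merged)` input shape, and the bucket boundaries are all assumptions for illustration (the buckets above overlap at 5, so this sketch treats the last one as "more than 5"); it is not the exact method behind the OpenClaw numbers.

```python
from collections import defaultdict


def bucket(pr_count):
    """Map an author's total PR count to an experience bucket.

    Boundaries are an assumption made for this sketch.
    """
    if pr_count <= 1:
        return "first-timer"
    if pr_count <= 5:
        return "2-5 PRs"
    return "more than 5 PRs"


def merge_rate_by_bucket(prs):
    """prs: iterable of (author, merged) pairs, one per PR.

    Returns {bucket name: fraction of that bucket's PRs that were merged}.
    """
    prs = list(prs)

    # Count how many PRs each author opened in total.
    per_author = defaultdict(int)
    for author, _ in prs:
        per_author[author] += 1

    # Tally opened and merged PRs per experience bucket.
    totals = defaultdict(int)
    merged = defaultdict(int)
    for author, was_merged in prs:
        name = bucket(per_author[author])
        totals[name] += 1
        merged[name] += int(was_merged)

    return {name: merged[name] / totals[name] for name in totals}


# Made-up sample data, not taken from the OpenClaw dataset.
sample = (
    [("alice", False), ("carol", True)]   # two first-timers
    + [("bob", True), ("bob", False)]     # one author with 2 PRs
    + [("dave", True)] * 3 + [("dave", False)] * 3  # one author with 6 PRs
)
rates = merge_rate_by_bucket(sample)
```

Bucketing by an author's total PR count is the simplest choice. Bucketing each PR by how many PRs the author had opened before it would be a stricter reading of "reputation", and it may give different numbers.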

Mitchell Hashimoto created and maintains Ghostty, one of the most popular open source terminal emulators. As the project gained momentum, people submitted such a high volume of AI-generated PR slop that he needed to limit AI-generated contributions.

A week later, he released a solution: Vouch, a trust management system for open source contributors. Unvouched users can't contribute, and bad actors get explicitly flagged. While Vouch is project-specific for now, Mitchell's vision is for trust decisions to eventually ripple across projects that share similar values. Vouch is the open source equivalent of a sender reputation score. (Worth noting: while Vouch was working well for Ghostty, Mitchell decided to take Ghostty off GitHub.)

More contributors won't help if they all think the same way

Linus's law, coined by Eric S. Raymond and named after Linus Torvalds, is a famous line: "Given enough eyeballs, all bugs are shallow."

Having more eyes on the same problem brings diverse perspectives. Different people use software differently, encounter different bugs, and approach fixes in novel ways.

That rule might not hold when everyone converges on Claude, Codex, Cursor, Devin, and similar tools. In OpenClaw:

4 contributors submitted PRs with the exact title "feat(web-search): add SearXNG as a search provider." They were 4 of 10+ people who independently tried to add the same feature.
6 people independently fixed the same Brave Search locale bug. 2 submitted PRs with identical titles 94 minutes apart.
5 people independently found the same timeout deadlock in the agent runner.
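
Duplicates like these are easy to surface mechanically. Below is a rough sketch, assuming PRs are available as `(title, author, opened_at)` tuples; the function names and the normalization rule (lowercase, collapsed whitespace) are hypothetical choices for illustration, not how the OpenClaw analysis was actually done.

```python
from collections import defaultdict
from datetime import datetime


def normalize(title):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def duplicate_titles(prs):
    """prs: iterable of (title, author, opened_at) tuples.

    Returns {normalized title: [(opened_at, author), ...] sorted by time}
    for every title used by at least two distinct authors.
    """
    groups = defaultdict(list)
    for title, author, opened_at in prs:
        groups[normalize(title)].append((opened_at, author))
    return {
        title: sorted(entries)
        for title, entries in groups.items()
        if len({author for _, author in entries}) > 1
    }


def minutes_between_first_two(entries):
    """Gap in minutes between the two earliest PRs in a duplicate group."""
    (first, _), (second, _) = entries[0], entries[1]
    return (second - first).total_seconds() / 60


# Made-up sample data: titles, authors, and timestamps are invented.
sample = [
    ("fix: example bug", "user-a", datetime(2026, 1, 1, 10, 0)),
    ("Fix:  example bug", "user-b", datetime(2026, 1, 1, 11, 34)),
    ("docs: fix typo", "user-c", datetime(2026, 1, 1, 12, 0)),
]
dups = duplicate_titles(sample)
```

Exact-title matching only catches the most obvious convergence. PRs that fix the same bug under different titles would need a comparison of the changed files or the diff itself.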

There are more eyes on OpenClaw than ever, but their perspectives are also being filtered by AI coding agents. If most contributors use the same AI coding agents with the same prompts, then their contributions will resemble each other as well.

The promise and advantage of open source has been diversity of thought. Linus's law only holds if the underlying thinking remains diverse too. A contributor who really studies a codebase will prompt differently than one who doesn't.

What's actually getting merged

In the OpenClaw PR data, features have a 9% merge rate, while refactors merge at 35%.

Refactors, which require a deep understanding of the existing codebase, merge at nearly 4x the rate of novel features (35% versus 9%). It's the common adage these days: the thinking matters a lot more than the typing. The data backs it up.

For example, the way claude-mem maps Claude Code's hook-captured tool stream into its own resumable Agent SDK observer session is a non-obvious architectural choice that requires a deep understanding of both systems. A software developer who understood this decision would be able to distill it into a checklist, which would become the prompt that makes the agent's output significantly better. An agent prompted to "build a memory system" wouldn't be able to achieve that on its own.

Until 200 years ago, the people who designed buildings also constructed them. They were known as master builders. As construction advanced, that role split into two crafts: architecture and construction. The analogy to software isn't clean, because architects still need to know how buildings stand up. But it points at something real: the contributions that survive review are increasingly the ones an agent can't do alone. They are the calls that require deep understanding of an existing system, not novel construction.

So, what's next?

OpenClaw went from nothing to a real-world Jarvis in a few short months. One person, along with a strong community, was able to build at a pace that wasn't possible a year ago. That's pretty special.

The open source community can build faster than ever. That speed introduces problems that need better primitives for identity, reputation, and contribution validation, and those primitives will get built. Open source has solved harder problems before.
