[ FIG. 01 / FULL AI AUTHORSHIP SHARE ]
Agent-written PR share
Monthly share of opened PRs with evidence of full end-to-end AI authorship.
[Chart: x-axis month, Feb 2025 to Apr 2026; y-axis % of opened PRs; series: monthly share, monthly opened PRs.]
[ FIG. 02 / REVERTS PER 1,000 MERGED PRs ]
Revert-rate leaderboard
Lower is better. The dashed line is the human baseline for the same merge window.
[Chart: reverts per 1,000 merged PRs by author; legend: below human, human baseline, above human.]
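The "reverts per 1,000 merged PRs" metric can be sketched as below. This is a minimal illustration, assuming the rate is simply reverted PRs divided by merged PRs, scaled by 1,000; the function names and the sample counts are hypothetical and do not come from the figure.

```python
def reverts_per_1000(reverts: int, merged_prs: int) -> float:
    """Revert rate scaled to 1,000 merged PRs (assumed definition)."""
    if merged_prs <= 0:
        raise ValueError("merged_prs must be positive")
    return 1000 * reverts / merged_prs


def versus_baseline(rate: float, human_baseline: float) -> str:
    """Label a rate the way the chart legend does, relative to the human baseline."""
    if rate < human_baseline:
        return "below human"
    if rate > human_baseline:
        return "above human"
    return "at human baseline"


# Hypothetical counts, for illustration only.
agent_rate = reverts_per_1000(reverts=3, merged_prs=1200)
human_rate = reverts_per_1000(reverts=8, merged_prs=2000)
print(agent_rate, human_rate, versus_baseline(agent_rate, human_rate))
```

Scaling by a fixed denominator keeps authors with very different PR volumes comparable, which is why the baseline is drawn from the same merge window.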
[ FIG. 03 / MEDIAN LOC, SAME DEVELOPERS ]
PR size, same developers
Median lines changed for developers who opened both AI-authored and human-written PRs.
[Chart: median LOC per PR by author; series: AI-assisted, human; same developers, April 2026.]
[ FIG. 04 / REVERT RATE BY PR SIZE ]
Reverts by PR size
Same-developer revert rates grouped by number of files changed.
[Chart: reverts per 1,000 merged PRs by files changed (1 file, 2-3 files, 4-10 files, 10+ files); legend: AI worse, AI better, human; same developers, April 2026.]
[ FIG. 05 / FILE CHURN BY AUTHOR AND PR SIZE ]
Seven-day file churn
Share of touched files that were edited again within a week, bucketed by PR size.

author       1 file   2-3 files   4-10 files   10+ files   mean
Codex         5.6%      6.0%        5.1%         5.9%       5.7%
Claude        8.2%      8.2%        8.3%         7.7%       8.1%
Cursor BG    10.3%      6.7%        8.5%         9.6%       8.8%
human         9.9%      9.8%       10.5%         9.8%      10.0%
Devin        14.0%     13.6%       12.9%        13.3%      13.5%

Columns are files changed per PR. Cells are shaded below or above the human mean of 10.0%.
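The churn definition in the caption (share of touched files edited again within a week) can be sketched as follows. This is a minimal illustration under stated assumptions: the window starts at the PR's merge time, a re-edit counts only if it falls strictly after the merge and no later than seven days afterwards, and the input shapes (`touched`, `later_edits`) are hypothetical. The source does not specify any of these details.

```python
from datetime import datetime, timedelta


def seven_day_churn(touched, later_edits, window=timedelta(days=7)):
    """Fraction of touched files that were edited again within `window`.

    touched:     dict mapping file path -> merge time of the PR that touched it
    later_edits: dict mapping file path -> list of later edit times
    Both shapes are assumptions made for this sketch.
    """
    if not touched:
        return 0.0
    churned = 0
    for path, merged_at in touched.items():
        edits = later_edits.get(path, [])
        if any(merged_at < t <= merged_at + window for t in edits):
            churned += 1
    return churned / len(touched)


# Hypothetical data: four files touched by one PR merged on 1 April 2026.
merged = datetime(2026, 4, 1)
touched = {"a.py": merged, "b.py": merged, "c.py": merged, "d.py": merged}
later_edits = {
    "a.py": [merged + timedelta(days=3)],   # inside the window: counts
    "b.py": [merged + timedelta(days=10)],  # outside the window: ignored
}
print(seven_day_churn(touched, later_edits))
```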
[ FIG. 06 / GREPTILE FLAGS PER 10K MERGED LOC ]
Review findings by severity
Greptile issue rates normalized by merged lines of code.
[Chart: rates per 10k merged LOC by severity; reference line: human baseline.]
[ FIG. 07 / REVIEW CYCLES BY PR SIZE ]
Review cycles by LOC
Mean number of review cycles to merge, grouped by pull-request size.
[Chart: mean review cycles by LOC in PR (< 10, 10-49, 50-199, 200-499, 500-999, 1000+); cross-population, all authors.]
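Grouping PRs into the LOC buckets named on the axis and averaging review cycles per bucket can be sketched as below. The bucket edges are read from the axis labels; the `(loc, cycles)` record shape and the sample values are hypothetical, and nothing here reproduces the figure's data.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean

# Lower edges of every bucket after the first, taken from the axis labels.
EDGES = [10, 50, 200, 500, 1000]
LABELS = ["< 10", "10-49", "50-199", "200-499", "500-999", "1000+"]


def loc_bucket(loc: int) -> str:
    """Map a PR's changed-line count to its axis bucket."""
    return LABELS[bisect_right(EDGES, loc)]


def mean_cycles_by_bucket(prs):
    """prs: iterable of (loc, review_cycles) pairs (assumed record shape)."""
    grouped = defaultdict(list)
    for loc, cycles in prs:
        grouped[loc_bucket(loc)].append(cycles)
    # Keep the axis order and skip buckets with no PRs.
    return {label: mean(grouped[label]) for label in LABELS if label in grouped}


# Hypothetical PRs, for illustration only.
sample = [(5, 1), (9, 2), (10, 3), (1500, 4)]
print(mean_cycles_by_bucket(sample))
```

`bisect_right` places a value equal to an edge into the bucket that starts at that edge, so a 10-line PR lands in "10-49" rather than "< 10".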
[ FIG. 08 / MEAN REVIEW CYCLES BY AUTHOR ]
Cycles by author
A tight axis makes the spread in review cycles visible without exaggerating the table values.
[Chart: mean review cycles to merge by author; axis range 2.0-2.6.]
[ FIG. 09 / FAILURE-PATTERN HEATMAP ]
Failure fingerprints
Agent issue rates divided by the human rate, normalized per LOC. Values above 1.0x mean that class of issue appears more often than in human PRs.

category       issue                          Claude   Codex   Devin   Cursor BG
security       sql injection                  1.50x    1.25x   0.70x   1.70x
security       auth bypass                    1.50x    1.00x   0.50x   1.67x
security       IDOR / missing tenant check    1.75x    0.88x   0.69x   1.31x
security       secret in logs                 1.34x    1.34x   0.94x   1.65x
correctness    n+1 query                      1.27x    0.64x   0.45x   3.45x
correctness    regression / breaks existing   1.25x    1.34x   0.89x   2.37x
correctness    off-by-one                     1.64x    0.55x   0.64x   2.27x
correctness    timezone / date bug            1.48x    0.90x   0.66x   2.09x
correctness    env var / config bug           1.45x    1.35x   1.35x   0.95x
housekeeping   test missing                   0.96x    1.13x   0.93x   2.37x
housekeeping   dead code                      1.14x    0.99x   0.78x   2.05x
housekeeping   stale comment / wrong doc      1.69x    0.38x   0.88x   0.69x

Columns are agents. Cells are shaded below or above the human rate; 1.0x = human rate.
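The ratio described in the caption (agent issue rate divided by the human rate, each normalized per LOC) can be sketched as follows. This is an illustration only: the function name, argument names, and sample counts are hypothetical, and the source does not state how issues or LOC were counted.

```python
def relative_issue_rate(agent_issues: int, agent_loc: int,
                        human_issues: int, human_loc: int) -> float:
    """(agent_issues / agent_loc) / (human_issues / human_loc).

    Written in cross-multiplied form, which is algebraically the same ratio
    but avoids dividing two very small per-line rates.
    """
    if agent_loc <= 0 or human_loc <= 0:
        raise ValueError("LOC totals must be positive")
    if human_issues <= 0:
        raise ValueError("human baseline needs at least one issue")
    return (agent_issues * human_loc) / (human_issues * agent_loc)


# Hypothetical counts: 30 flagged issues in 100k agent-written lines
# against 40 flagged issues in 200k human-written lines.
ratio = relative_issue_rate(30, 100_000, 40, 200_000)
print(f"{ratio:.2f}x")  # above 1.0x means more frequent than in human PRs
```

Normalizing both sides per line of code before dividing matters because agents and humans merge very different volumes of code; a raw issue count would mostly reflect volume.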