AI Impact Brief
The AI Amplification Effect
What two years of DORA data and five industry reports reveal
PRs merged
delivery stability
Same teams. Same year.
AI is accelerating individual output while organizational outcomes stay flat. The variable that determines which side you land on? The quality of your engineering practices.
The Warning Shot
DORA's 2024 report was the first to measure AI's impact on delivery. The results were not what anyone expected.
When DORA's Accelerate State of DevOps Report landed in 2024, it carried the first large-scale data on AI adoption in software teams. The headline finding: AI adoption correlated with worse delivery outcomes, not better.
The pool of high-performing teams (those with the strongest throughput and stability) shrank from 31% to 22%. Meanwhile, the lowest-performing group grew from 17% to 25%. For every 25% increase in AI adoption, the data predicted a 1.5% drop in delivery throughput and a 7.2% drop in delivery stability. The gains showed up only in individual and perceptual measures: documentation quality (+7.5%), perceived code quality (+3.4%), code review speed (+3.1%), and individual productivity (+2.1%). None of them is a delivery outcome.
The mechanism was batch size. As Laura Tacho observed: "AI introduces risk not because of garbage code, but because batch size seems to increase." More code per change means more risk per deployment. And 39% of developers reported low or no trust in AI-generated code.
If AI is making individual developers faster but teams less stable, what exactly is it accelerating?
The Amplification Thesis
DORA dedicated its entire 2025 report to AI's impact. The throughput story reversed. The stability story didn't.
A year later, DORA didn't just update the data—they dedicated the entire 2025 edition to AI's impact on software teams, titling it State of AI-Assisted Software Development. That shift in focus tells you where the industry's center of gravity has moved.
The throughput story reversed: task completion up 21%, PRs merged up 98%. But stability stayed negative—bug rates up 9%, PR size up 154%, review time up 91%. The net organizational impact? Flat.
"AI is an amplifier, not a fixer."
— DORA State of AI-Assisted Software Development, 2025
Teams with strong engineering practices saw AI multiply their effectiveness. Teams with weak practices saw AI multiply their dysfunction. The variable wasn't AI adoption—it was the quality of what AI was amplifying.
The 2025 report also introduced a fifth metric—Rework Rate—measuring the percentage of unplanned deployments to fix user-facing bugs. It captures exactly the hidden cost that throughput metrics miss: shipping faster without shipping better.
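To make the definition concrete, here is a minimal sketch of how a team might compute Rework Rate from its own deployment log. The `Deployment` record and its field names are illustrative assumptions, not a DORA schema; how you label a deployment as unplanned or as a user-facing fix depends on your own release tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One production deployment. Field names are hypothetical."""
    id: str
    planned: bool                # False for hotfixes and other unplanned releases
    fixes_user_facing_bug: bool  # True if the release exists to fix a user-facing bug


def rework_rate(deployments: list[Deployment]) -> float:
    """Percentage of deployments that were unplanned fixes for user-facing bugs."""
    if not deployments:
        return 0.0
    rework = sum(
        1 for d in deployments
        if not d.planned and d.fixes_user_facing_bug
    )
    return 100.0 * rework / len(deployments)


deployments = [
    Deployment("d1", planned=True, fixes_user_facing_bug=False),
    Deployment("d2", planned=False, fixes_user_facing_bug=True),   # rework
    Deployment("d3", planned=True, fixes_user_facing_bug=True),    # planned fix, not rework
    Deployment("d4", planned=False, fixes_user_facing_bug=True),   # rework
    Deployment("d5", planned=True, fixes_user_facing_bug=False),
]

print(f"Rework rate: {rework_rate(deployments):.1f}%")  # Rework rate: 40.0%
```

Tracked next to deployment frequency, a number like this shows whether added throughput is being spent on new work or on fixing what just shipped.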
Most engineering leaders can tell you their deployment frequency. How many can tell you whether their practices are strong enough for AI to amplify—or weak enough for it to expose?
The Do-More-With-Less Question
If you're running a leaner team, this data isn't abstract. It's your operating reality.
There's a version of "flat output" that's actually a win: same delivery at lower cost. If AI lets a team of 30 do what used to take 40, that's a legitimate efficiency gain. McKinsey reports 10–20% cost reductions in software engineering across their survey. Some organizations are genuinely doing more with less.
But the data suggests most aren't banking those gains. Atlassian found that AI saves developers roughly 10 hours per week, and that organizational friction costs them roughly 10 hours per week in return. The savings aren't landing as efficiency. They're being absorbed by the overhead of working in systems that weren't designed for this pace: longer reviews, bigger PRs, more rework, more context-switching.
For the CTO who's been asked to maintain velocity with a smaller team, this distinction matters. The question isn't whether AI can fill the gap. It's whether your codebase, your testing discipline, and your team's habits can handle the concentration of work that comes with fewer people moving faster. A lean team with strong practices will pull ahead. A lean team amplifying weak practices will hit a wall—and it will look like a people problem when it's actually a structural one.
Fewer engineers means every practice gap costs more. The margin for dysfunction shrinks exactly when the pressure to perform increases.
The Industry Converges
Five major 2025 reports independently arrived at the same conclusion.
Stack Overflow 2025: 46% of developers distrust AI output, up from 31%. "Almost right" code is the #1 frustration.
Atlassian DevEx 2025: AI saves ~10h/week. Org friction loses ~10h/week. Coding is only 16% of dev time.
GetDX Q4 2025: 22% of merged code is AI-authored. "Adoption doesn't equal impact."
GitHub Octoverse 2025: 80% of new devs use Copilot in first week. Volume ≠ quality.
McKinsey State of AI 2025: Only 6% of organizations qualify as "AI high performers." The rest see 10–20% cost reductions but can't translate them into sustained delivery improvement.
Six independent research teams (DORA plus the five above), hundreds of thousands of data points, the same conclusion: AI is widening the gap between teams that were already strong and teams that weren't. The question for engineering leaders isn't whether to adopt AI. It's whether your engineering practices can survive the amplification.
What It All Adds Up To
The pattern across every report points to the same variable.
Every report tells the same story: AI is accelerating individual output while organizational outcomes stay flat or degrade. The variable that determines which side you land on is the quality of your engineering practices—the structural health of your codebase, your testing discipline, your review processes, your knowledge distribution.
These aren't just technical metrics. They're cultural habits—the daily decisions that compound into how a team actually builds software. Culture eats strategy for breakfast, and right now AI is stress-testing engineering culture at every organization simultaneously.
The data is clear on the problem. What's missing is a way to see where your team's habits actually stand—before AI makes the answer obvious the hard way.
Free 1-hour DX coaching session
A focused session for engineering leaders who want to understand what's actually happening in their codebase. Live walkthrough of what DX Coach reveals about your team's practices, habits, and readiness for AI-assisted development.
- Complete data tables from DORA 2024 and 2025
- Side-by-side comparison of 5 adjacent reports
- DX Coach capability mapping to industry findings
The data: report by report
DORA 2024 — AI Impact on Delivery Metrics
| Metric | Impact per 25% AI Adoption Increase | Direction |
|---|---|---|
| Delivery throughput | -1.5% | Worse |
| Delivery stability | -7.2% | Worse |
| Documentation quality | +7.5% | Better |
| Perceived code quality | +3.4% | Better |
| Code review speed | +3.1% | Better |
| Time on valuable work | -2.6% | Worse |
| Individual productivity | +2.1% | Better |
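As a rough illustration of how to read this table, the sketch below scales the per-25% estimates to other adoption increases. Linear scaling is an assumption made here for illustration only; DORA reports the estimate at a 25% increase, and the function name `projected_change` is hypothetical.

```python
# Estimated percent change per 25% increase in AI adoption (DORA 2024 table above).
EFFECT_PER_25_PCT = {
    "Delivery throughput": -1.5,
    "Delivery stability": -7.2,
    "Documentation quality": 7.5,
    "Perceived code quality": 3.4,
    "Code review speed": 3.1,
    "Time on valuable work": -2.6,
    "Individual productivity": 2.1,
}


def projected_change(adoption_increase_pct: float) -> dict[str, float]:
    """Scale each per-25% effect linearly (an assumption, not a DORA claim)."""
    scale = adoption_increase_pct / 25.0
    return {
        metric: round(effect * scale, 2)
        for metric, effect in EFFECT_PER_25_PCT.items()
    }


for metric, change in projected_change(50).items():
    print(f"{metric}: {change:+.1f}%")
```

Under that assumption, a 50% increase in adoption would pair a 15.0% gain in documentation quality with a 14.4% drop in delivery stability, which is the tension the rest of this brief is about.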
DORA 2025 — The Amplification Effect
| Metric | AI-Assisted Change | Direction |
|---|---|---|
| Task completion | +21% | Better |
| PRs merged | +98% | Better |
| Bug rates | +9% | Worse |
| PR size | +154% | Worse |
| Review time | +91% | Worse |
| Net organizational impact | Flat | Neutral |
Adjacent Reports — 2025 Convergence
| Report | Sample | Key Finding |
|---|---|---|
| Stack Overflow 2025 | 65K+ devs | 46% distrust AI output (up from 31%). "Almost right" is the #1 frustration. |
| Atlassian DevEx 2025 | 3,500 devs | AI saves ~10h/week. Org friction loses ~10h/week. Net: zero. Coding = 16% of dev time. |
| GetDX Q4 2025 | 135K devs | 91% adoption. 22% of merged code is AI-authored. 3.6h/week saved. "Adoption ≠ impact." |
| GitHub Octoverse 2025 | Platform | +25% commits YoY. +23% PRs merged. 80% of new devs use Copilot in first week. |
| McKinsey State of AI 2025 | Enterprise | Only 6% are "AI high performers." 10–20% cost reductions in software engineering. |
What this means for your codebase
Each industry finding maps to a structural signal in your code. DX Coach measures these signals deterministically—straight from the codebase—so you get an objective layer that complements your surveys, dashboards, and team feedback.
| Industry Finding | Source | DX Coach Dimension | What We Detect |
|---|---|---|---|
| Batch size inflating risk | DORA 2024/25 | Code Health | Complexity hotspots, structural complexity scoring |
| Convention drift accelerating | DORA 2025, SO 2025 | Code Health | Convention variance detection, pattern drift scoring |
| Bug rates increasing (+9%) | DORA 2025 | Change Safety | Test quality assertions, change-readiness analysis |
| Review time ballooning (+91%) | DORA 2025 | Change Safety | Code churn hotspots, coupling analysis |
| "Almost right" AI output | SO 2025 | Code Health | Intent duplication, code duplication scanning |
| Knowledge concentration risk | DORA 2025 | Team Resilience | Bus factor analysis, knowledge concentration scoring |
| AI-generated insecure patterns | SO 2025, McKinsey | Ops Readiness | Security posture scanning, injection detection |
| Rework rate (new DORA metric) | DORA 2025 | Change Safety | Churn analysis, rapid re-edit detection |
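To show what a codebase-derived signal can look like, here is a minimal sketch of a knowledge concentration check built from commit history. It is not DX Coach's actual method: the input shape (one `(file path, author)` pair per commit touching a file), the function name, and the 80% threshold are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def knowledge_concentration(
    commits: list[tuple[str, str]],
    threshold: float = 0.8,
) -> dict[str, tuple[str, float]]:
    """Return files where one author made at least `threshold` of the commits.

    `commits` holds one (file_path, author) pair per commit touching a file.
    The result maps each flagged file to (dominant_author, share_of_commits).
    """
    authors_by_file: dict[str, Counter] = defaultdict(Counter)
    for path, author in commits:
        authors_by_file[path][author] += 1

    flagged = {}
    for path, counts in authors_by_file.items():
        top_author, top_count = counts.most_common(1)[0]
        share = top_count / sum(counts.values())
        if share >= threshold:
            flagged[path] = (top_author, share)
    return flagged


# Hypothetical history: one author dominates billing, the API is shared.
commits = (
    [("billing/invoice.py", "ana")] * 4
    + [("billing/invoice.py", "ben")]
    + [("api/routes.py", "ana"), ("api/routes.py", "ben"), ("api/routes.py", "chi")]
)

for path, (author, share) in knowledge_concentration(commits).items():
    print(f"{path}: {author} owns {share:.0%} of commits")
# billing/invoice.py: ana owns 80% of commits
```

Commit counts are a crude proxy for knowledge, but the output is reproducible from the repository alone, which is the property the table above relies on.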
Measure your baseline before AI amplifies it
DORA's amplification thesis has a clear implication: you need to know the quality of your engineering practices before AI scales them. If your codebase has complexity hotspots, convention drift, knowledge silos, or empty tests, AI will multiply all of it.
DX Coach measures exactly the layer DORA identifies as the determining variable. It reads your codebase's structural fingerprints against a catalog of engineering standards—complexity, duplication, test quality, coupling, security posture, documentation freshness, knowledge concentration—and tells you where to invest before AI amplifies what's already there.
No surveys. No dashboards that only show you what you want to see. Deterministic signals from the artifact itself, translated into practices from an open, DORA-aligned playbook.