Your team is writing more code than it ever has. Developers feel faster, and the adoption dashboards agree. Yet the amount of work reaching production hasn't risen as much as the input has, and the gap between the two keeps widening. AI reduced the cost of producing code while everything downstream kept running at the same speed it always had. The system didn't get faster end-to-end. It got lopsided, and the first place the imbalance shows up is in code review.
The constraint moved downstream
For a decade, the binding constraint on most engineering teams was writing the code. Generative AI removed that constraint almost overnight. A developer with an AI assistant produces more code, in more pull requests, than the same developer did a year ago.
What didn't change is the capacity to review that code. Reviewers are human, they read at the speed they always have, and there are no more hours in the day. So the work piles up at the review stage. The constraint moved from producing code to verifying it, and verification is now where your delivery system stalls.
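A rough way to see why the queue grows is to model review as a stage with fixed weekly capacity. The sketch below is illustrative only: the function name and the weekly figures are hypothetical, not taken from any report, and it assumes review capacity stays flat while the number of opened pull requests rises.

```python
def review_backlog(opened_per_week: int, reviewed_per_week: int, weeks: int) -> list[int]:
    """Track how many pull requests are waiting for review at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        # New work arrives, reviewers clear what they can, the rest carries over.
        backlog = max(0, backlog + opened_per_week - reviewed_per_week)
        history.append(backlog)
    return history


# Hypothetical numbers: reviewers can handle 40 pull requests a week.
before_ai = review_backlog(opened_per_week=40, reviewed_per_week=40, weeks=4)
after_ai = review_backlog(opened_per_week=60, reviewed_per_week=40, weeks=4)

print(before_ai)  # [0, 0, 0, 0]
print(after_ai)   # [20, 40, 60, 80]
```

Under these assumptions the backlog never stabilizes. It grows by the same amount every week for as long as arrivals exceed review capacity.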
Executives miss this when they look at the top of the funnel, see accelerated developer activity, and assume the gains propagated through the rest of the software development lifecycle. They didn't. In several places the gains created new problems that absorb the velocity at the top and never reach production. Code review is the most visible of them.
What 8 million pull requests show
In its 2026 Software Engineering Benchmarks Report, LinearB analyzed more than 8 million pull requests across thousands of engineering teams. Two findings explain the stall.
First, AI-assisted pull requests are 2.6 times larger than human-written ones. A reviewer who used to open a focused, self-contained change now opens a sprawling one with more surface area, less context, and a higher rate of subtle issues to catch. Second, agentic AI pull requests sit idle 5.3 times longer before a reviewer picks them up. An agent generates the code in seconds, and then it waits, because the humans who have to approve it are already underwater.
Put the two together and the picture is clear. Your team produces more code than ever, and a smaller share of it moves through review and into production. Supply is racing ahead of the system's ability to process it.
Further, the 2025 DORA report found that AI simultaneously increases software delivery throughput and delivery instability, and its central conclusion is that AI amplifies the system you already have. Pour more code into a review process that was already strained, and it buckles instead of improving.
Why your activity metrics hide the problem
Adoption rates, token consumption, and seats activated keep climbing the whole time, which is what makes them dangerous. They measure how much your team is doing, not what the system is shipping. A developer who logs into Copilot every morning produces usage telemetry, while pull requests queue up unreviewed behind it.
The question your executives are asking is a system question. They want to know whether the engineering organization got faster end-to-end, with the same or better quality, against the goals the business cares about. Activity metrics can't answer that, because a team can post record adoption numbers while throughput out the bottom of the system stays flat or declines.
Where the bottleneck becomes measurable
Two metrics expose this, and they work best as a pair. Cycle time measures how fast work moves. Change failure rate measures whether that speed holds up.
Cycle time is the metric that most commonly surfaces AI bottlenecks. It measures how long work takes to move from start to finish, which makes it your read on iteration rate, and it breaks into four parts, each of which you can measure on its own.
- Coding time: how long a developer spends writing the code
- Pickup time: how long a pull request waits before someone starts reviewing it
- Review time: how long the review itself takes
- Deploy time: how long merged code waits before it reaches production
When AI accelerates coding time and overall cycle time fails to decrease, the absorbed velocity is hiding in pickup time and review time. Decomposing cycle time turns a vague sense that review feels slow into a specific number you can act on.
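As a minimal sketch of that decomposition, the Python below splits one pull request's cycle time into the four stages and reports which one absorbed the most time. The `PullRequest` fields, the stage boundaries (for example, counting coding time from first commit to pull request open), and the sample timestamps are all assumptions made for illustration. Your Git provider or metrics tool may define the boundaries differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PullRequest:
    # Hypothetical timestamp fields; real names depend on your tooling.
    first_commit_at: datetime
    opened_at: datetime
    first_review_at: datetime
    merged_at: datetime
    deployed_at: datetime


def decompose_cycle_time(pr: PullRequest) -> dict[str, timedelta]:
    """Split one pull request's cycle time into coding, pickup, review, and deploy."""
    return {
        "coding": pr.opened_at - pr.first_commit_at,
        "pickup": pr.first_review_at - pr.opened_at,
        "review": pr.merged_at - pr.first_review_at,
        "deploy": pr.deployed_at - pr.merged_at,
    }


def slowest_stage(stages: dict[str, timedelta]) -> str:
    """Return the name of the stage that took the longest."""
    return max(stages, key=stages.get)


# Made-up example: fast coding, long wait for a reviewer.
pr = PullRequest(
    first_commit_at=datetime(2026, 1, 5, 9, 0),
    opened_at=datetime(2026, 1, 5, 11, 0),        # 2 hours of coding
    first_review_at=datetime(2026, 1, 6, 15, 0),  # 28 hours waiting for pickup
    merged_at=datetime(2026, 1, 6, 21, 0),        # 6 hours in review
    deployed_at=datetime(2026, 1, 7, 1, 0),       # 4 hours to deploy
)

stages = decompose_cycle_time(pr)
total = sum(stages.values(), timedelta())

print(total)                  # 1 day, 16:00:00
print(slowest_stage(stages))  # pickup
```

In this made-up case, coding accounts for 2 of the 40 hours, so making authoring faster would barely move the total. The 28 hours of pickup time is where the number can actually change.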
Change failure rate deserves equal weight, and it's the cleanest way to open a conversation about software quality. It measures how often a change leads to a failure that needs remediation, such as a rollback, hotfix, or patch. The goal isn't to push iteration rate at the expense of stability. You want to improve cycle time while keeping change failure rate consistent, and holding both in view is what separates a team that's genuinely faster from one that's quietly accumulating risk.
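One hedged way to put the pairing into practice is to compute change failure rate from deployment records and check it alongside cycle time. The sketch below assumes a hypothetical `Deployment` record with a `needed_remediation` flag, and a tolerance of one percentage point for what counts as "consistent". Both are illustrative choices, not a standard.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    change_id: str
    # True if the change later needed a rollback, hotfix, or patch.
    needed_remediation: bool


def change_failure_rate(deployments: list[Deployment]) -> float:
    """Share of deployed changes that led to a failure needing remediation."""
    if not deployments:
        return 0.0
    failures = sum(d.needed_remediation for d in deployments)
    return failures / len(deployments)


def faster_without_more_risk(
    cycle_before_hours: float,
    cycle_after_hours: float,
    cfr_before: float,
    cfr_after: float,
    cfr_tolerance: float = 0.01,  # assumed tolerance, tune to your own baseline
) -> bool:
    """True only if cycle time improved and change failure rate held steady."""
    return cycle_after_hours < cycle_before_hours and cfr_after <= cfr_before + cfr_tolerance


# Made-up data: 20 deployments, 3 of which needed remediation.
deployments = [Deployment(f"change-{i}", needed_remediation=(i < 3)) for i in range(20)]
rate = change_failure_rate(deployments)

print(f"{rate:.0%}")                                    # 15%
print(faster_without_more_risk(40, 30, 0.15, 0.15))     # True
print(faster_without_more_risk(40, 30, 0.15, 0.25))     # False
```

The second check fails even though cycle time dropped, which is the case the pairing is meant to catch: speed that was bought with instability.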
Together, cycle time and change failure rate form the efficiency pillar of APEX, LinearB's measurement framework for engineering productivity in the AI era. Efficiency is where the bottleneck shift becomes operational, and it's where most teams start, because these two metrics give you a fast read on system health without waiting on a survey program or a planning overhaul.
Closing the gap with automation
A metric is useful only when you've got a plan to act on what it shows. When pickup time and review time are the constraint, the fix is to give the review process the same kind of support AI gave the authoring side.
Not every pull request needs the same scrutiny. A documentation change or a dependency bump doesn't carry the same risk as a change to authentication logic, and treating them identically is what clogs the queue. Workflow automation lets you encode that judgment, routing each change to the reviewer with the most context, fast-tracking the low-risk ones, and reserving human review for the changes where it matters. LinearB AI code reviews handle the routine checks on top of that, flagging the mechanical issues automatically so a human only weighs in when there's a real decision to make.
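To make the idea of encoding that judgment concrete, here is a generic sketch of risk-based routing in Python. It is not LinearB's configuration format or API. The path patterns, lane names, and `PullRequest` shape are hypothetical, and a real rule set would reflect your own repository layout.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical path patterns; adjust to your own repository layout.
LOW_RISK_PATTERNS = ["docs/*", "*.md", "package-lock.json", "requirements.txt"]
HIGH_RISK_PATTERNS = ["src/auth/*"]


@dataclass
class PullRequest:
    title: str
    changed_files: list[str]


def matches_any(path: str, patterns: list[str]) -> bool:
    """True if the path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def route(pr: PullRequest) -> str:
    """Pick a review lane based on the riskiest file the pull request touches."""
    files = pr.changed_files
    # Any touch to a high-risk area sends the whole change to senior review.
    if any(matches_any(f, HIGH_RISK_PATTERNS) for f in files):
        return "senior-review"
    # Fast-track only when every changed file is low risk.
    if files and all(matches_any(f, LOW_RISK_PATTERNS) for f in files):
        return "fast-track"
    return "standard-review"


print(route(PullRequest("Fix typo in guide", ["docs/guide/intro.md"])))            # fast-track
print(route(PullRequest("Bump dependencies", ["requirements.txt"])))               # fast-track
print(route(PullRequest("Rotate session keys", ["src/auth/session.py"])))          # senior-review
print(route(PullRequest("Docs plus auth", ["README.md", "src/auth/login.py"])))    # senior-review
print(route(PullRequest("Invoice rounding", ["src/billing/invoice.py"])))          # standard-review
```

The design choice worth noting is that the high-risk check runs first, so a pull request that mixes a documentation edit with an authentication change is never fast-tracked.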
This is how teams unwind the bottleneck in practice. Yum! Brands automated 35.8 percent of its pull requests this way, which returned 321 developer hours a month to engineering. Syngenta combined the same workflow discipline with cycle time measurement and cut its cycle time by 81 percent.
The takeaway for engineering leaders
AI exposes the constraints downstream of writing code, and it has moved fast enough that the strain often shows up before the measurement model catches it. If your current metrics feel like they're struggling to answer the questions your executives are asking, that's the gap to close.
The move is to measure whether generation speed actually survives the trip to production, rather than treating that speed as the result on its own. Code review is the first place to look. Decompose your cycle time, watch change failure rate alongside it, find where the velocity gets absorbed, and put automation on the stage holding the system back.
We built LinearB AI code reviews and workflow automation around a pattern the data keeps showing us, that code review is the most common bottleneck in the AI-driven SDLC. They handle routine checks and review routing automatically, so your reviewers can spend their judgment where it counts and your cycle time reflects the work that actually ships. Book a demo to see how it works on your pipeline.




