DORA 2024: AI Is Speeding Up Delivery and Hurting Stability at the Same Time

The DORA 2024 report has a finding I cannot stop thinking about. AI adoption is positively associated with software delivery throughput, and it is negatively associated with delivery stability. Both signals are in the same dataset, pointing in opposite directions. Teams that adopted AI coding tools shipped more code. They also broke more things. The report does not frame this as a contradiction. I think it is the most honest finding in the whole document, and I think most organizations are reading only the first half of it.

I want to sit with what that pairing actually means before I try to explain it. The throughput number is the one in the press releases. "AI makes developers faster." That is real. A developer who used to spend an hour writing a data transformation function can now get a working draft in five minutes, spend ten minutes reading it carefully, and ship something in fifteen. That is not a small gain. At team scale, across hundreds of pull requests a month, the acceleration is measurable and it shows up in the DORA data. I do not think anyone should dismiss that finding.

The stability number is the one that should make engineering leaders uncomfortable. Change failure rate, the share of deployments that cause production incidents or require rollback, went up for organizations that adopted AI development tools without corresponding investment in test automation, version control discipline, and fast feedback infrastructure. The DORA 2024 report is explicit about the mechanism: AI accelerates development and exposes weaknesses in the surrounding system. The weakness was always there. AI makes the code arrive faster, which means the weakness arrives faster too.
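To make the pairing concrete, here is a minimal sketch of how change failure rate, as defined above, can rise while throughput also rises. The `Deployment` record, its field names, and every number are hypothetical and chosen only for illustration; none of them come from the DORA dataset.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    id: str
    caused_incident: bool = False
    rolled_back: bool = False


def change_failure_rate(deployments):
    """Share of deployments that caused an incident or required a rollback."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d.caused_incident or d.rolled_back)
    return failed / len(deployments)


# Made-up month before AI adoption: 20 deployments, 2 of them failed.
before = [Deployment(f"b{i}") for i in range(18)]
before += [
    Deployment("b18", caused_incident=True),
    Deployment("b19", rolled_back=True),
]

# Made-up month after AI adoption: 40 deployments, 8 of them failed.
after = [Deployment(f"a{i}") for i in range(32)]
after += [Deployment(f"a{i}", caused_incident=True) for i in range(32, 40)]

print(len(before), change_failure_rate(before))
print(len(after), change_failure_rate(after))
```

In this invented scenario the team ships twice as many deployments and the share that fail also doubles, so both halves of the finding hold at once. The absolute number of failed deployments quadruples, which is what the on-call rotation actually feels.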

This is where Cohen and Levinthal's (1990) concept of absorptive capacity becomes genuinely useful as an analytical frame, not just a citation. Absorptive capacity is an organization's ability to recognize the value of external knowledge, assimilate it, and apply it to produce outcomes. Cohen and Levinthal argued that prior related knowledge is what makes absorption possible. You cannot absorb knowledge if you have no prior framework to connect it to. The same principle applies to AI tools. An organization that has already built strong automated testing practices, that already runs code through a CI/CD pipeline that catches regressions before they deploy, that already treats observability as a first-class concern, has the prior related knowledge to absorb AI code generation safely. The generated code flows through a filter the organization already built. The organization with no automated tests, with manual deployment processes, with no way to know something broke until a customer calls, has no prior structure to connect AI generation to. The code still arrives fast. Nothing filters it.

The DORA 2024 report found that 90 percent of organizations have adopted at least one platform engineering practice, and there is a direct correlation between platform quality and the ability to extract value from AI development tools. That finding follows directly from the absorptive capacity logic. High-quality platform engineering is what builds the filter. It is the accumulated organizational prior knowledge that determines whether AI-generated code improves delivery or destabilizes it. The 90 percent adoption figure sounds encouraging until you notice that adoption and quality are different things. An organization can have a platform that technically exists and is also a patchwork of undocumented scripts that nobody fully understands. That platform does not create absorptive capacity. It creates the appearance of process.

Feldman and Pentland's (2003) routine dynamics framework offers another useful lens here. They distinguished between the ostensive dimension of a routine (the documented, official version of how something is done) and the performative dimension (what people actually do in practice). In software engineering, the ostensive process might say "all code is reviewed before merging and all tests must pass." The performative reality might be that reviews are cursory because the team is under deadline pressure and tests are skipped when someone needs to deploy a hotfix. AI tools do not interact with the ostensive process. They interact with the performative one. If the performative reality of a team's deployment process is "we ship when it feels ready," AI acceleration means things that feel ready arrive faster and the instability follows them.

The finding from DORA 2024 that organizations with poor platform quality see AI as a burden rather than an enabler is the sentence I come back to most. A burden. Not neutral. Not a modest disappointment. The experience for teams without the supporting infrastructure is that AI tools make their jobs harder, not easier, because the volume of code to review, debug, and clean up increases without the tools to manage that volume safely. The AI is generating code faster than the team can absorb it.

What the DORA data points toward, though it does not say it this way, is that the value of AI coding tools is almost entirely a function of the organizational infrastructure they are deployed into. The tools are capable. The code assistants on the market in 2025 and 2026 can produce working, reasonable code for a wide range of tasks. The variance in outcomes is not variance in the tools. It is variance in the organizational conditions that determine whether that code gets caught, reviewed, tested, and shipped safely, or whether it goes straight to production and sits there until something fails.

I find this worrying as an IS researcher for a reason that is not primarily about software delivery. The organizations most aggressively buying AI coding tools right now are often the ones that feel farthest behind on AI adoption. They feel pressure to show AI usage, to demonstrate AI value, to compete with organizations that have been talking about AI productivity gains for two years. That pressure creates exactly the wrong incentive. It pushes organizations to adopt the visible tool (the AI assistant) before they invest in the invisible infrastructure (the platform quality) that determines whether the tool creates value or creates instability. The DORA data is telling us that sequence matters enormously. I am not sure the market is listening. And I think that gap between sequence and pressure is something IS research has not yet studied carefully enough.

---
claims_checked:
- "DORA 2024: AI adoption improves deployment throughput but hurts delivery stability": "https://dora.dev/research/2024/dora-report/"
- "DORA 2024: High-quality platform engineering correlates with higher AI value realization": "https://dora.dev/research/2024/dora-report/"
- "DORA 2024: Organizations with poor platform quality see AI as a burden, not an enabler": "https://dora.dev/research/2024/dora-report/"
- "DORA 2024: 90% of organizations adopted at least one platform engineering practice": "https://dora.dev/research/2024/dora-report/"
claims_unverified:
- "Cohen and Levinthal (1990) absorptive capacity: drawn from IS theory background, not re-fetched this session"
- "Feldman and Pentland (2003) routine dynamics: drawn from IS theory background, not re-fetched this session"
sources_used:
- "https://dora.dev/research/2024/dora-report/"
word_count: 1060