Published on 8 December 2025
The High Cost of AI-Generated Infrastructure
The outages of late 2025 represent a significant correction in how we build infrastructure. In a matter of weeks, systems responsible for 20% of global traffic failed. These incidents were not caused by cyberattacks or hardware faults. They were the result of engineers automating processes they no longer fully control or understand.
Cloudflare experienced a network-wide failure on November 18. Azure Front Door went down globally on October 29. AWS lost DNS services in the us-east-1 region on October 19. While SREs worked to restore these services, corporate leadership continued to tout the adoption of generative AI. Microsoft and Google executives have recently highlighted that 20–30% of their codebases are now AI-generated.
They frame this metric as efficiency. However, those responsible for system stability see it as rapidly accumulating technical debt. The infrastructure failures of 2025 happened because engineering rigor was replaced by high-velocity generation. We are discovering that Large Language Models (LLMs) do not understand the necessity of safety mechanisms.
Automation Without Judgment
Cloudflare
On November 18, a database permission change triggered a service failure. Cloudflare’s Bot Management module automatically generates a configuration file every five minutes using a pattern-based approach.
The permission change caused a database query to return duplicate metadata. The generation logic lacked the context to identify the duplication error and built a configuration file that exceeded hard-coded size limits. The system entered a loop where it pushed the invalid file, failed, reverted, and attempted the same operation five minutes later.
Cloudflare had established size limits for safety. The automation responsible for generating the config disregarded the system’s logic, producing output that violated those constraints. We automated the complexity but failed to automate the judgment required to manage it.
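The missing judgment can be sketched in code. Below is a minimal Python illustration (the names, limits, and structure are hypothetical; Cloudflare's actual pipeline is not public) of a generator that treats its own database input as suspect and fails closed, keeping the last known-good configuration instead of propagating an invalid one:

```python
MAX_FEATURES = 200  # hypothetical hard cap, analogous to a config size limit


def build_config(rows):
    """Build a config from database rows, rejecting bad input.

    The query result is treated as untrusted: duplicate metadata or an
    oversized result raises instead of producing an invalid file.
    """
    names = [r["feature"] for r in rows]
    if len(names) != len(set(names)):
        raise ValueError("duplicate feature metadata in query result")
    if len(names) > MAX_FEATURES:
        raise ValueError(f"config exceeds limit of {MAX_FEATURES} features")
    return {"features": sorted(names)}


def safe_publish(rows, current_config):
    """Only replace the live config if the new one validates; fail closed."""
    try:
        return build_config(rows)
    except ValueError:
        return current_config  # keep serving the last known-good config
```

The point is not the specific checks but their placement: validation happens before the push, so a bad query result degrades into a stale-but-working config rather than a crash loop.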
Azure and AWS
Azure Front Door faced a similar issue on October 29. A configuration change reached the global control plane without a kill switch or a staggered deployment check to intercept it. The update propagated immediately, forcing thousands of edge nodes into an unhealthy state that required a manual rollback.
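The guard that was absent here is a standard staged-rollout pattern: push to a canary ring first, check health, and halt before global propagation. A simplified sketch (ring names and callbacks are illustrative, not Azure's actual deployment machinery):

```python
def staged_rollout(change, rings, apply, healthy):
    """Deploy a change ring by ring, halting at the first unhealthy ring.

    `rings` is an ordered list of node groups (canary first), `apply`
    pushes the change to one ring, and `healthy` reports whether that
    ring still serves traffic. Returns the rings actually updated.
    """
    deployed = []
    for ring in rings:
        apply(change, ring)
        if not healthy(ring):
            return deployed  # kill switch: stop before wider propagation
        deployed.append(ring)
    return deployed
```

With this shape, a bad configuration damages one canary ring and stops, instead of forcing thousands of edge nodes into an unhealthy state at once.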
Earlier that month, AWS DNS failed due to a race condition between two “DNS Enactor” processes. This type of concurrency bug is often caught during human review. Automated generation logic tends to miss these issues because it prioritizes “happy-path” execution. The oversight resulted in a 15-hour service degradation for DynamoDB users in us-east-1.
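Races between two writers are conventionally closed with serialization or optimistic concurrency. The internals of the DNS Enactor are not public, so the following is only a toy Python sketch of the general fix: a generation check that rejects a stale writer instead of letting it clobber a newer plan (the lost-update race):

```python
class PlanStore:
    """Toy record store where writers must prove they saw the latest version.

    A compare-and-swap on a generation counter rejects a stale writer's
    plan rather than letting it overwrite a newer one.
    """

    def __init__(self):
        self.generation = 0
        self.plan = None

    def read(self):
        return self.generation, self.plan

    def write(self, expected_generation, plan):
        if expected_generation != self.generation:
            return False  # another writer applied a newer plan first
        self.generation += 1
        self.plan = plan
        return True
```

This is exactly the kind of non-happy-path reasoning a reviewer supplies: the bug only appears when two writers interleave, which basic tests of either writer alone will never exercise.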
The “Vibe Coding” Problem
The industry has drifted toward a methodology often called “Vibe Coding.” This involves prompting an LLM until the code appears correct and passes basic tests.
While this approach seems productive, the resulting code often lacks systemic intent. Research from SonarSource identifies this as the “Engineering Productivity Paradox.” Developers are merging more pull requests and completing more tasks, but code review times have increased significantly. Reviewing AI-generated code is often more difficult than writing it from scratch because reviewers must check for hallucinations—errors that look syntactically correct but fail under specific conditions.
A recent study indicates that 67% of developers now spend more time debugging AI-generated code than they previously spent writing code by hand. The workload has shifted from creation to cleanup.
The Economic Incentives
The root cause is not just the tools, but the incentives surrounding them.
Engineers are adopting these tools to meet increasing demands. Since the layoff cycles began in 2022, companies have reduced headcount while maintaining or increasing feature velocity targets. The message to the workforce is that speed is the primary metric.
Mid-level engineers are incentivized to use coding assistants to close tickets quickly. Stopping to refactor or question the generated logic lowers velocity metrics, which can be risky in the current employment climate.
This creates a negative feedback loop:
- Executives demand efficiency to justify AI investment.
- Engineers generate boilerplate code to meet velocity targets.
- Senior engineers, who often provide necessary oversight, are reduced due to cost.
- The remaining team lacks the context to effectively review the output.
- System stability degrades.
We are eroding institutional knowledge. As Corey Quinn noted regarding the AWS outage, the senior staff who understand the deep history of these systems are often the first to leave, replaced by tools that can write syntax but cannot reason about failure domains.
The Untrusted Input Problem
The fundamental error in the Cloudflare and Azure incidents was that the systems trusted the internal generator.
In security, user input is treated as “untrusted.” It is sanitized and validated under the assumption that it could be malicious. However, we often treat AI-generated code and configuration as “trusted” simply because it originates from an internal pipeline.
AI-generated code is probabilistic and prone to edge-case failures. It should be treated with the same scrutiny as external input. We need “Policy as Code” guardrails to validate the semantic validity of every artifact generated by automation, not just its syntax.
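One way to frame such a guardrail is a policy table of semantic predicates that every generated artifact must pass before it is applied. A minimal Python sketch (field names and checks are hypothetical; real deployments would use a policy engine rather than inline lambdas):

```python
import json


def validate_artifact(raw: str, policy: dict):
    """Validate a generated artifact semantically, not just syntactically.

    `policy` maps field names to predicates; every field must be present
    and pass its check. Returns (ok, errors); a failing artifact is never
    applied.
    """
    try:
        doc = json.loads(raw)  # syntax gate
    except json.JSONDecodeError as e:
        return False, [f"invalid JSON: {e}"]
    errors = [
        f"field '{name}' failed policy"
        for name, check in policy.items()
        if name not in doc or not check(doc[name])
    ]
    return (not errors, errors)
```

The syntax gate alone would have accepted both the oversized Cloudflare file and the bad Azure change; it is the semantic layer that encodes the operator's intent.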
Reliability Over Volume
The lesson from late 2025 is that "lines of code" is a poor metric for productivity.
Executives citing "30% AI-generated code" are essentially measuring bloat and increased attack surface. Real engineering productivity is measured by reliable value delivery. A senior engineer who removes code to prevent an outage contributes more value than an engineer generating thousands of lines of unmaintainable code.
The internet relies on stability and institutional memory. Replacing that memory with autocomplete without adequate verification mechanisms ensures that these outages will become a baseline occurrence rather than an anomaly.