AI Wipes Coder’s Database: A Horror Story

AI Coding Assistants: A Double-Edged Sword for Software Development

The promise of artificial intelligence revolutionising software development is undeniable, with tools like Anthropic’s Claude Code empowering engineers to build and deploy applications at unprecedented speeds. However, as the technology matures, a growing number of cautionary tales are emerging, highlighting the significant risks associated with an over-reliance on these powerful AI assistants. Engineers are finding that while AI can accelerate workflows, it also introduces a new set of challenges, from subtle code errors to catastrophic system failures.

The Perils of Automation: When AI Goes Rogue

A stark example of these risks came to light when engineer Alexey Grigorev used Claude Code, a popular Anthropic tool that helps developers write and run code, to update a new website. At first everything seemed normal, until he realised the system had begun destroying the site’s live environment: the network, services and, most critically, the database holding years of course data. The root cause was a small setup mistake on Grigorev’s new laptop that confused the automation about what was “real” and what was safe to delete, so it erased the actual production system instead of just cleaning up duplicates.

While Grigorev was eventually able to restore his data with assistance from AWS support, he reflected on the experience, admitting to having “over-relied on the AI agent.” By allowing the AI to execute changes end-to-end without sufficient human oversight, he had bypassed crucial safety checks that would have prevented the accidental deletion. “AI assistants are great and saving a lot of time,” Grigorev commented, “But I hope people learn from mistakes I made and incorporate the safeguards into their workflow.”

Anthropic’s Claude Code does offer configurable safety settings, allowing users to dictate how often the AI should seek confirmation before taking action and to prohibit certain actions without explicit permission. However, some developers opt for greater autonomy from the AI to maximise time savings.
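The idea of confirming some actions and prohibiting others can be illustrated with a minimal sketch. This is not Claude Code’s actual configuration or API; the `ActionPolicy` class, the action names and the `confirm` callback are all hypothetical, chosen only to show how a deny list and an ask-first list might gate what an agent is allowed to run.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class ActionPolicy:
    """Hypothetical policy; the action names are illustrative only."""
    # Actions the agent may never run, whatever the human says.
    denied: Set[str] = field(default_factory=lambda: {"drop_database"})
    # Actions that need an explicit human "yes" before they run.
    needs_confirmation: Set[str] = field(
        default_factory=lambda: {"delete_service", "delete_network"}
    )


def decide(policy: ActionPolicy, action: str) -> str:
    """Return "deny", "ask" or "allow" for a named action."""
    if action in policy.denied:
        return "deny"
    if action in policy.needs_confirmation:
        return "ask"
    return "allow"


def run_action(
    policy: ActionPolicy,
    action: str,
    execute: Callable[[], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Run `execute` only if the policy (and, where needed, a human) permits it."""
    decision = decide(policy, action)
    if decision == "deny":
        return False
    if decision == "ask" and not confirm(action):
        return False
    execute()
    return True
```

In this sketch, granting the agent more autonomy amounts to moving actions off the ask-first list, which is where the time savings and the risk both come from.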

Amazon’s AI-Assisted Outages: A Wake-Up Call

The potential for AI-assisted code to cause widespread disruption is not confined to isolated incidents. Last week, Amazon convened a high-level meeting to investigate a series of outages that plagued its website and app. Reports from various publications indicated that at least one of these significant system failures was linked to AI-assisted changes.

An Amazon spokesperson described the meeting as a “regular weekly operations meeting” and stated that only one incident involved AI, attributing it to “user error” unrelated to the AI itself. Internal Amazon documents, however, suggested a more concerning picture. Documents viewed by CNBC and the Financial Times originally cited “Gen-AI assisted changes” as a contributing factor to a “trend of incidents.” This reference was reportedly removed from the document before the meeting. Furthermore, a December outage at Amazon Web Services was linked to engineers allowing Amazon’s Kiro AI coding tool to make changes, an event Amazon has since categorised as “user error.”

The Productivity Paradox: Hype vs. Reality

The enthusiasm surrounding AI-assisted software development has reached a fever pitch, with companies eager to replicate the dramatic productivity gains reported by AI labs. This has led to increased pressure on engineers to adopt AI tools, often without adequate safeguards in place. However, for large enterprises with complex legacy systems, the promised efficiency gains are proving more elusive, and the poor quality of some AI-generated code could become a significant Achilles’ heel.

Engineers across the industry report a growing trend of over-reliance on AI assistants. One anonymous Amazon engineer noted that developers are becoming so dependent on AI that they are “essentially stopping reviewing the code altogether.” The shift recasts skilled developers as reviewers, with the AI handling the bulk of the implementation, even as the reviewing itself is reportedly being skipped. While this can lead to faster feature delivery, it also generates “production noise”: code that is pushed out quickly but may be unnecessary or inadequately tested, potentially impacting critical systems.

David Loker, VP of AI at CodeRabbit, points out that the consequences of flawed AI-generated code aren’t always immediately apparent. He cited an instance where an AI assistant produced seemingly valid code that was based on faulty assumptions about the underlying system. This code, which might have passed a cursory review, would have caused the production database to crash.

The Burden of AI-Generated Code: A Growing “Correction Tax”

The ability of AI to lower the technical barrier for certain software development tasks has also led to the outsourcing of responsibilities traditionally handled by senior engineers to junior or less technical staff. This approach often backfires, as low-quality output generates more work than it saves. A London-based engineer at an enterprise software company, speaking anonymously, described the output as “fairly bad quality, broke often, and ended up being more of a burden.” The time saved by using cheaper labour is negated by the need for highly paid senior engineers to fix the resulting issues.

Data suggests that the task of reviewing and rectifying AI-assisted work is disproportionately falling on more experienced engineers. While senior engineers possess the skills to identify subtle errors that junior developers might miss, enabling faster deployment, they are also incurring a growing “correction tax.” A July 2025 Fastly survey indicated that senior engineers deploy nearly 2.5 times more AI-generated code than their junior counterparts, primarily due to their superior ability to catch mistakes. However, nearly 30% of seniors reported that fixing AI output consumed most of the time they had saved, compared to 17% of junior developers. Junior developers, lacking the full understanding of technical debt and latent vulnerabilities introduced by AI, may perceive greater productivity gains in the short term.

Measuring Success: Beyond Throughput

The current metrics for evaluating AI coding capabilities are also coming under scrutiny. A study by METR, an AI evaluation organisation, found that half of AI coding solutions that passed a prominent industry test – itself graded by an AI model – would have been rejected by human reviewers for inadequate quality. Toby Ord, Senior Researcher at the Oxford Martin AI Governance Initiative, believes that current estimates of AI coding ability are “indeed overstating things, and perhaps by a significant factor.”

Loker also raises concerns about how companies measure the “success” of AI coding. While throughput increases are easily quantifiable, the downstream consequences are not. Traditional metrics like features shipped and code committed may appear strong with AI involvement, but they fail to capture the time spent debugging, rolling back changes, or cleaning up technical debt. This can lead to a misleading impression of overall code health.
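A small worked example shows how the gap between headline throughput and real output could be made visible. It is only a sketch: the `SprintStats` fields and the figures used are invented for illustration and do not come from Loker, CodeRabbit or any survey cited here.

```python
from dataclasses import dataclass


@dataclass
class SprintStats:
    """Hypothetical per-sprint figures; all field names are illustrative."""
    features_shipped: int
    build_hours: float      # time spent writing and shipping features
    debug_hours: float      # time spent debugging what was shipped
    rollback_hours: float   # time spent rolling changes back
    cleanup_hours: float    # time spent paying down technical debt


def raw_throughput(stats: SprintStats) -> float:
    """Features per hour, counting only the time spent building."""
    if stats.build_hours <= 0:
        raise ValueError("build_hours must be positive")
    return stats.features_shipped / stats.build_hours


def adjusted_throughput(stats: SprintStats) -> float:
    """Features per hour once downstream rework is counted as well."""
    total_hours = (
        stats.build_hours
        + stats.debug_hours
        + stats.rollback_hours
        + stats.cleanup_hours
    )
    if total_hours <= 0:
        raise ValueError("total hours must be positive")
    return stats.features_shipped / total_hours


# Invented numbers: 12 features built in 40 hours, plus 40 hours of rework.
example = SprintStats(
    features_shipped=12,
    build_hours=40.0,
    debug_hours=15.0,
    rollback_hours=5.0,
    cleanup_hours=20.0,
)
```

With these invented figures the raw metric is 12 / 40 = 0.3 features per hour, while the adjusted one is 12 / 80 = 0.15, half the headline number.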

Companies deploying AI at scale risk accumulating significant technical debt: code that works in the short term but becomes increasingly costly to maintain. Loker estimates that the rate at which technical debt is being generated using AI is “three to four times what it was previously.” As organisations like AWS and Nvidia grapple with vast amounts of legacy code, giving an AI enough context to understand and modify that code becomes more challenging, increasing the potential for introducing new problems.