Five independent studies converge on an uncomfortable truth about AI-assisted development.
The Data We Can't Ignore
CodeRabbit just published something that should change how we think about AI-assisted development.
Their research team analyzed 470 pull requests across open-source GitHub repositories—320 AI-assisted and 150 human-only. The findings: AI-assisted PRs contain 2.74× more cross-site scripting vulnerabilities, 1.91× more insecure direct object references, and 1.88× more improper password handling issues.
The overall numbers tell the same story. AI-assisted code averages 10.83 issues per PR compared to 6.45 for human-only code. That's not a minor increase. That's 68% more defects per pull request shipping through your pipeline.
What distinguishes this research from typical "AI is dangerous" hand-wringing is that CodeRabbit categorized the types of issues it found and examined why AI code has these specific weaknesses.
Specifically, the security gaps cluster around validation and authorization—exactly the areas that require understanding context outside the immediate code being written. For example:
• XSS prevention requires knowing your output context
• Object reference security requires understanding your authorization model
• Password handling requires knowing your security policy
The underlying problem is visibility: AI coding assistants see the code they're generating, but they don't see your security architecture.
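To make "output context" concrete, here's a minimal sketch (the snippet and variable names are illustrative, not taken from the CodeRabbit dataset). The same escaping call that keeps a payload inert in an HTML body does nothing in an unquoted attribute, and only the surrounding template, which the assistant never sees, tells you which situation you're in.

```python
import html

# Illustrative only: the same user input needs different handling depending
# on where it lands in the page -- the "output context".
user_input = "x onmouseover=alert(1)"  # contains no <, >, & or quotes to escape

# HTML body context: escaping is enough; the payload renders as plain text.
body = f"<p>{html.escape(user_input)}</p>"
# -> <p>x onmouseover=alert(1)</p>

# Unquoted attribute context: html.escape() changes nothing here, and the
# browser parses onmouseover=alert(1) as a brand-new event-handler attribute.
vulnerable = f"<div title={html.escape(user_input)}>profile</div>"
# -> <div title=x onmouseover=alert(1)>profile</div>   (XSS)

# Context-aware fix: quote the attribute so the payload stays inside it
# (html.escape also encodes any embedded quotes by default).
fixed = f'<div title="{html.escape(user_input)}">profile</div>'
# -> <div title="x onmouseover=alert(1)">profile</div>  (inert)
```

A reviewer or scanner that knows the template context can tell the second and third cases apart; a code generator that only sees the line it's writing cannot.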
AI Is an Army of Talented Junior Developers
OX Security's research team captured the "why" behind these issues when they recently described AI coding assistants as operating like an "army of juniors."
Their analysis identified 10 distinct anti-patterns in AI-generated code—patterns that look correct in isolation but create problems in production systems. These aren't random bugs. They're systematic gaps in how AI approaches code generation.
Consider what happens when you ask a junior developer to implement authentication. They'll produce code that authenticates users. It will probably work. But will it handle session fixation? Will it implement proper CSRF protection? Will it follow your organization's specific security policies around token rotation?
A junior developer writes code that solves the immediate problem. A senior developer writes code that solves the immediate problem while anticipating the security implications, edge cases, and integration points they can't see in the ticket.
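Here's a minimal, framework-agnostic sketch of that gap; the SESSIONS store and the login signature are hypothetical, chosen only to keep the example self-contained. The version a junior (or an assistant) writes keeps the caller's existing session ID; the senior-reviewed version rotates it at login, closing the session fixation hole.

```python
import secrets

# Hypothetical in-memory session store; framework plumbing omitted.
SESSIONS: dict[str, dict] = {}

def login(request_session_id: str, user_id: str, password_ok: bool) -> str:
    """Authenticate and return the session ID the client should use next."""
    if not password_ok:
        raise PermissionError("invalid credentials")

    # "Works on the first try" version: reuse the pre-login session ID.
    #   SESSIONS[request_session_id] = {"user_id": user_id}
    #   return request_session_id
    # Session fixation: an attacker who planted that ID now owns the session.

    # Senior review: rotate the session identifier whenever privilege changes.
    SESSIONS.pop(request_session_id, None)       # discard the pre-login session
    new_session_id = secrets.token_urlsafe(32)   # unguessable, freshly issued
    SESSIONS[new_session_id] = {"user_id": user_id}
    return new_session_id
```

Both versions authenticate the user. Only one of them is correct inside a real threat model, and nothing in the ticket says which.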
AI coding assistants are very good junior developers. They're fast. They're tireless. They produce syntactically correct code that often works on the first try. They've seen millions of code examples and can pattern-match their way to solutions that would take humans hours to research.
But they share the junior developer's blind spot: they optimize for the problem they can see, not the problems that emerge from context they can't access.
This isn't a criticism of AI capabilities. It's a recognition that code security isn't just about writing correct code—it's about writing code that's correct within a specific security context. And that context lives in your architecture, your policies, your threat model, and your deployment environment.
The "army of juniors" framing matters because it suggests the right response isn't to stop using AI assistants. It's to provide the senior-level review that junior code has always required.
Five Studies, One Pattern
The CodeRabbit and OX Security findings would be concerning enough on their own. But they're not isolated data points. Three additional industry studies paint the same picture from different angles.
Checkmarx's Future of AppSec 2026 report surveyed 900 security and development professionals. The headline finding: 34% of code in enterprise repositories is now AI-generated. But the more troubling data point is behavioral—81% of organizations admit to knowingly shipping vulnerable code. Not accidentally. Knowingly.
When a third of your codebase comes from AI assistants that produce measurably more vulnerabilities, and your organization has normalized shipping known vulnerabilities, the compounding math gets uncomfortable quickly.
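A back-of-envelope calculation makes the point. Treat the per-PR issue rates as a proxy for defect density and assume the 34% AI share applies uniformly across pull requests (both simplifications):

```python
# Back-of-envelope only: per-PR issue rates used as a proxy for defect
# density, with the 34% AI share assumed to apply uniformly.
human_rate = 6.45   # issues per PR, human-only (CodeRabbit)
ai_rate = 10.83     # issues per PR, AI-assisted (CodeRabbit)
ai_share = 0.34     # share of enterprise code that is AI-generated (Checkmarx)

blended = (1 - ai_share) * human_rate + ai_share * ai_rate
print(f"{blended:.2f} issues per PR, {blended / human_rate - 1:.0%} above the human-only baseline")
# -> 7.94 issues per PR, 23% above baseline -- before factoring in the 81% of
#    organizations that already ship known-vulnerable code knowingly.
```

Even under generous assumptions, the whole codebase gets measurably worse, and the trend line points up as the AI share grows.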
Black Duck's Global DevSecOps Report 2025 adds the velocity dimension. Organizations using AI coding assistants report 88% confidence that they're managing AI-related security risks effectively. Yet the same survey reveals that only 18% have established policies governing AI code usage—a gap between confidence and governance that suggests organizations are flying blind.
The Black Duck data also quantifies the alert fatigue problem: 71% of security alerts are noise. When nearly three-quarters of your security signals don't require action, teams learn to ignore signals entirely. Now add AI-generated code that produces more vulnerabilities, and you have more real problems hiding in a larger haystack of false positives.
Veracode's State of Software Security 2025 tracks the remediation side of the equation. Their finding: mean time to remediate has reached 252 days—up 47% over five years. Half of organizations now carry "critical" security debt, meaning vulnerabilities so severe they require immediate remediation but haven't been addressed.
The Veracode data reveals another pattern relevant to AI-generated code: 70% of critical security debt originates from third-party code, not first-party development. AI coding assistants pull heavily from training data that includes open-source patterns, library usage examples, and Stack Overflow solutions. If third-party code is already your biggest source of critical debt, AI assistants are essentially scaling your most problematic code source.
The convergence across these studies isn't coincidental. They're measuring different aspects of the same phenomenon: AI has dramatically accelerated code generation without a corresponding acceleration in security review and remediation.
The Velocity Gap Problem
The productivity gains from AI coding assistants are real. Developers ship faster and time-to-market compresses.
But security review hasn't kept pace with generation speed.
When a developer can generate 200 lines of functional code in 5 minutes instead of 2 hours, the bottleneck shifts entirely to review. Your AppSec team that was already underwater reviewing human-written code is now reviewing 3-5× the volume—and that volume contains, per the CodeRabbit data, 68% more issues per unit of code.
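Put rough numbers on it, treating the volume increase and the per-unit issue increase as independent multipliers (an approximation):

```python
# Rough illustration: review volume multiplier x issues-per-unit multiplier.
volume_low, volume_high = 3, 5   # 3-5x the code volume reaching review
issue_multiplier = 1.68          # 68% more issues per unit of code (CodeRabbit)

print(f"{volume_low * issue_multiplier:.1f}x to {volume_high * issue_multiplier:.1f}x more findings per review cycle")
# -> 5.0x to 8.4x, landing on an AppSec team whose capacity hasn't changed
```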
This is where Black Duck's velocity data becomes relevant. 81% of organizations report that security processes slow down development. When security is already perceived as a brake on velocity, adding AI-accelerated code generation doesn't create pressure to speed up security review. It creates pressure to skip it.
The Veracode remediation data shows what happens next. That 252-day mean time to remediate isn't a process failure—it's a capacity failure. Organizations don't take 8+ months to fix critical vulnerabilities because they enjoy risk. They take 8+ months because the queue is too long and the resources are too scarce.
AI coding assistants add code to the front of that queue faster than organizations can process the back of it. The result is predictable: security debt compounds.
Checkmarx found that 98% of organizations experienced at least one AppSec-related breach in the past year. The correlation isn't hard to trace. When velocity exceeds security capacity, vulnerabilities ship. When vulnerabilities ship, breaches happen.
The velocity gap isn't a technology problem. It's a systems problem. You can't solve it by telling developers to slow down—the competitive pressure for speed is real and legitimate. You can't solve it by hiring more AppSec engineers—the talent market is too constrained. And you can't solve it by hoping AI-generated code gets more secure on its own—the training data includes the same vulnerable patterns that created today's security debt.
Security Automation That Matches AI Velocity
The solution isn't more humans. It's automated remediation that operates at AI generation speed.
Think about what the CodeRabbit data actually reveals. AI-generated code has specific, predictable vulnerability patterns. XSS issues. Insecure object references. Improper password handling. These aren't exotic, novel vulnerabilities. They're well-understood patterns with well-understood fixes.
When vulnerability patterns are predictable and fixes are known, remediation is automatable.
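To show the shape of that idea, here's a deliberately tiny, hypothetical sketch; it is not Pixee's engine or any vendor's actual rule set. One rule maps a common improper-password-handling finding (comparing a secret with ==, which leaks timing information) to a known drop-in fix. Production tools work on ASTs, data flow, and full repository context rather than single-line regexes.

```python
import re

# Hypothetical "predictable pattern -> known fix" rule.
RULE = {
    "id": "timing-unsafe-secret-compare",
    # Comparing a password hash, token, or secret with == leaks timing info.
    "pattern": re.compile(r"(\w*(?:password|secret|token)\w*)\s*==\s*(\w+)"),
    "replacement": r"hmac.compare_digest(\1, \2)",
    "note": "also ensure `import hmac` is present",
}

def propose_fixes(source: str) -> list[tuple[str, str]]:
    """Return (original line, suggested line) pairs for every rule match."""
    fixes = []
    for line in source.splitlines():
        if RULE["pattern"].search(line):
            fixes.append((line.strip(), RULE["pattern"].sub(RULE["replacement"], line).strip()))
    return fixes

ai_generated_snippet = """
def check_login(stored_password_hash, supplied_hash):
    if stored_password_hash == supplied_hash:
        return "session-granted"
    return None
"""

for before, after in propose_fixes(ai_generated_snippet):
    print(f"- {before}\n+ {after}  # {RULE['note']}")
# - if stored_password_hash == supplied_hash:
# + if hmac.compare_digest(stored_password_hash, supplied_hash):  # also ensure `import hmac` is present
```

The point isn't the regex. It's that when the pattern is predictable and the fix is mechanical, a machine can propose the patch at the same speed the code was generated, and a human only needs to approve it.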
The market's trajectory in late 2025 alone confirms this: ServiceNow's $7.75B acquisition of Armis, OpenAI's $3B acquisition of Windsurf, Checkmarx's acquisition of Tromzo. The strategic bet across all these deals: security automation that matches development velocity is now table stakes.
The specific capability these acquisitions are targeting is automated remediation—not just finding vulnerabilities, but generating fixes that actually get merged. Because finding more problems faster doesn't help if your remediation capacity is still the bottleneck.
The impact is measurable. When automated fixes achieve 76%+ merge rates (as Pixee delivers), they're actually solving the problem, not just adding to the backlog. When automation eliminates 80% of false positives before they reach human reviewers, AppSec teams can focus on the findings that genuinely require human judgment.
Automation handles the predictable patterns; humans handle the edge cases. That's not replacing human judgment; it's focusing human judgment on the problems that actually require it.
For AI-generated code specifically, automated remediation becomes even more valuable. If AI code produces more issues per PR but those issues follow predictable patterns, automation can apply the "senior developer review" that AI output requires at scale, without adding headcount.
The Bottom Line
Five independent studies point to the same conclusion: AI-generated code ships with more security vulnerabilities, organizations lack governance infrastructure for AI code, and remediation capacity hasn't scaled to match generation velocity.
None of this means AI coding assistants are net negative. They're not. The productivity benefits are real and the adoption curve is not reversing.
But it does mean the security implications are real and require a deliberate response. An army of juniors can be enormously productive with appropriate senior oversight; in the same way, AI-generated code can ship securely when it gets automated oversight of its own.
Sources: CodeRabbit State of AI vs Human Code Generation Report, OX Security "Army of Juniors" Research, Checkmarx Future of AppSec 2026 Report, Veracode State of Software Security 2025, Black Duck Global DevSecOps Report 2025



