Design-Stage Security for AI Coding Agents

The AI is going to write the code. Make sure it knows what not to write.

Pixee Foresight reviews your PRDs before code exists, extracts the security promises the design makes, turns them into tickets, and checks every PR for drift from what was promised.

42%
of new code is now AI-generated or assisted.
SonarSource, State of Code 2026

87%
of AI-agent code ships with a vulnerability.
Endor Labs / CMU, 2026

50%
of security defects are design flaws no scanner sees.
IEEE Center for Secure Design

Scanners see code after it exists. Foresight reviews the intent before the code is written.

THE LOSING GAME

You can't scan your way out of a design decision.

Despite thirty years of scanning for injection flaws like SQL injection, we've barely made a dent:

INJECTION · OWASP A03
274,228
occurrences across 32,078 CVEs

Decide it at the design stage, and you close the whole class.

Source: OWASP Top 10:2021, A03 Injection.

Catch it at the design stage, and the downstream cost collapses.

30×
cheaper to prevent a flaw than to fix it

A flaw caught while it's still a sentence in the spec is a one-line edit. In production, the same flaw has to be found, triaged, fixed, and merged, all before someone exploits it.

Relative cost to fix by stage — software cost-of-defect research (Boehm; NIST)

How a design flaw becomes a shipped vulnerability.

It bleeds forward from the spec, through the build, into what ships, getting harder to see at every step.

01 · AT THE SPEC

The flaw is in the words, before any code exists

"Access will be appropriately restricted" isn't a promise; it's a hidden decision no one consciously made. The agents building from it inherit the gap.

02 · INTO THE BUILD

The intent never reaches whoever builds it

A ticket says what to build, not what to protect. Whoever builds it, human or agent, never sees the security intent that lived only in the design.

03 · AT THE PR

The shipped code quietly breaks the promise

By the time it's a pull request, the flaw is already shipping, and nothing downstream is checking it against what was intended.

The kind of thing no human review keeps up with.

The Foresight Home dashboard: every PRD scored by risk, with drift, shadow features, and broken promises flagged across the product
Every design, scored and watched: risk, drift, and broken promises across all your PRDs.
"These are the kind of things you can never track as a human, because you could never keep up with all the PRs scattered across your organization. We now have the ability to."— Arshan Dabirsiaghi, Pixee

How Foresight catches it before it ships.

One model of your security promises, working from the spec to the pull request.

STEP 01

Read the design

Foresight ingests your PRDs, design docs, and tickets as they're written.

Confluence · Notion · Docs · Jira · Linear

STEP 02

Extract the promises

It pulls the security promises the design makes — explicit and implicit — and the misuse cases teams could never write at scale.

STEP 03

Propagate to tickets

It pushes those promises into your tickets, so engineering sees them at planning, not after launch.

DevRev · Linear · Jira

STEP 04

Flag the drift

When a PR lands, Foresight checks the shipped code against the design's promises and flags what broke — like a single-use token a refactor quietly let you reuse:

Design: magic-link tokens are single-use
Shipped PR: token accepted on repeated requests
Foresight: ⚑ drift (broken promise, tied back to the design line)
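To make that example concrete, here is a minimal sketch of what the single-use promise and its drifted version might look like in code. It is illustrative only: the class `MagicLinkTokens`, the function `redeem_after_refactor`, and the in-memory store are hypothetical names invented for this sketch, not part of Foresight or any customer codebase.

```python
import secrets
import time
from typing import Optional


class MagicLinkTokens:
    """Hypothetical in-memory store for single-use magic-link tokens."""

    def __init__(self, ttl_seconds: float = 600.0):
        self._ttl = ttl_seconds
        self._pending = {}  # token -> (email, expiry on the monotonic clock)

    def issue(self, email: str) -> str:
        token = secrets.token_urlsafe(32)
        self._pending[token] = (email, time.monotonic() + self._ttl)
        return token

    def redeem(self, token: str) -> Optional[str]:
        # pop() consumes the token on first use, so a replay finds nothing.
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        email, expires_at = entry
        if time.monotonic() > expires_at:
            return None
        return email


def redeem_after_refactor(store: MagicLinkTokens, token: str) -> Optional[str]:
    """The drifted version: looks the token up but never consumes it."""
    entry = store._pending.get(token)
    return entry[0] if entry else None
```

Both functions return the right email on the first request, so a functional test of the happy path passes either way. Only the design line "tokens are single-use" says the second request must fail, which is why the check has to be made against the promise rather than against a code pattern.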

Every finding ties to a spec line and a threat, logged as an auditable record — not another alert.

Prevent what you can. Remediate what you can't.

VulnOps: Triage & Fix handles the code you have. Foresight secures the design. One context graph powers both.

The Pixee platform as an infinity loop: Foresight (proactive · the design) on the left and Triage & Fix (reactive · the code you have) on the right, each feeding one context graph at the center.
VulnOps results — 76% fix-merge rate · up to 95% fewer false positives

The average org carries 865,398 open alerts. Foresight helps stop design-stage risk from becoming tomorrow's backlog.

Sources: Pixee VulnOps customer data · OX Security AppSec Benchmark, 2026.

It runs on your model: Azure OpenAI, OpenAI, or Anthropic. Your tenancy, your data boundary, no lock-in.

The Pixee Context Graph as four stacked layers: Raw Context (what exists), Process Context (what should happen), Kinetic Context (what's actually exploitable), and Human Feedback Context (what your team trusts).

The Context Graph

A private, per-customer model of your codebase, scanners, conventions, history, and architecture. Never shared, never used to train shared models. It's why Triage & Fix reads exploitability right for your deployment, and why Foresight knows what each design is supposed to protect.

Security doesn't start at the pull request. It starts at the design.

Enterprises already running Pixee
MoneyGram · Oracle · Olympus · NTT Data · Nippon Steel · HCL · DeltaStream · Stirling PDF

Ninety seconds, PRD to PR.

The Foresight product reading a design: key findings, misuse cases, and the security promises it extracted
PRD ingest → threat model → promises into tickets → shipped code checked as drift

The objections you're already forming.

Won't this just flood my team with another queue of AI "findings"?

That's the failure mode we built against. The design stage is a handful of decisions, not a thousand code patterns — and every finding ties to a spec line you confirm or dismiss in one click. An audit trail, not another alert stream.

Couldn't I just ask Claude to threat-model my PRD?

For one PRD, yes — and you should. But a prompt only reviews the spec you remembered to paste in; Foresight reads every spec as it's written, tracks the promises into your tickets, and checks shipped code at PR time. A threat model you run once versus one that runs as a system.

Aren't most of these just bugs a scanner would catch?

No — that's the point. Half of security defects are design flaws, not bugs (IEEE): a decision like data exported with no encryption, or tokens that should be single-use but aren't. No pattern, no CVE — the code does exactly what the design told it to.

How does it fit what we already run — and our model?

Foresight reads whatever describes intent — Jira, Linear, Confluence, design docs — and runs on your own Azure OpenAI, OpenAI, or Anthropic instance, no lock-in. It doesn't replace your scanners or in-editor assistants; it's the design-stage layer they can't reach.

How is this different from Triage & Fix?

Triage & Fix is reactive: it cuts your backlog to what's exploitable and ships fixes developers actually merge (76% fix-merge rate). Foresight is proactive: it secures the design so the backlog never forms. One platform, one context graph; most teams run both.

The next decade of security gets won upstream, at the speed of agents.

See Foresight read one of your designs.

Bring a real spec. We'll show you the promises it makes, the threats it misses, and where last sprint's code already drifted.

Book a walkthrough →