Design-Stage Security for AI Coding Agents

The AI is going to write the code. Make sure it knows what not to write.

Foresight follows your security intent from design through implementation, so that what you intend is what you ship.

See Foresight read your spec →

42%

of new code is now AI-generated or assisted.

SonarSource, State of Code 2026

62%

of AI-generated code is insecure or incorrect.

BaxBench (ETH Zürich), 2025

50%

of security defects are design flaws no scanner sees.

IEEE Center for Secure Design

Scanners see code after it exists. Foresight reviews the intent before the code is written.

THE LOSING GAME

You can't scan your way out of a design decision.

Scanners can't catch an IDOR. It's a valid request that returns someone else's data, and the tool has no idea who that data belongs to. That's broken access control — the #1 risk in the OWASP Top 10:

BROKEN ACCESS CONTROL · OWASP A01

318,487

occurrences in tested applications, more than any other category, plus 19,013 mapped CVEs

Decide it at the design stage, and you close the whole class.

Source: OWASP Top 10:2021, A01 Broken Access Control.
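To make the IDOR case concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the Invoice record, the in-memory INVOICES store, and both lookup functions are assumptions, not part of Foresight or of any real codebase. The first lookup is the kind of request a scanner sees as valid; the second adds the ownership rule that has to come from the design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a database.
INVOICES = {
    1: Invoice(invoice_id=1, owner_id=100, total_cents=5000),
    2: Invoice(invoice_id=2, owner_id=200, total_cents=9900),
}


class Forbidden(Exception):
    """Raised when the caller does not own the requested record."""


def get_invoice_insecure(current_user_id: int, invoice_id: int) -> Invoice:
    # IDOR: a well-formed lookup with no ownership check.
    # Nothing in this code says who the invoice belongs to,
    # so there is no pattern here for a scanner to match.
    return INVOICES[invoice_id]


def get_invoice_checked(current_user_id: int, invoice_id: int) -> Invoice:
    # The design decision made explicit: only the owner may read the invoice.
    invoice = INVOICES[invoice_id]
    if invoice.owner_id != current_user_id:
        raise Forbidden(f"user {current_user_id} does not own invoice {invoice_id}")
    return invoice
```

In this sketch, user 100 asking for invoice 2 gets another customer's data from the first function and a Forbidden error from the second. The difference between the two is a rule about ownership, which only the spec can supply.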

Catch it at the design, and the downstream cost collapses.

30×

cheaper to prevent a flaw than to fix it

A flaw caught while it's still a sentence in the spec is a one-line edit. In production it requires finding, triaging, fixing, and merging — before it gets exploited.

Relative cost to fix by stage — software cost-of-defect research (Boehm; NIST)

How a design flaw becomes a shipped vulnerability.

It bleeds forward from the spec, through the build, into what ships, getting harder to see at every step.

01 · AT THE SPEC

The flaw is in the words, before any code exists

"Access will be appropriately restricted" isn't a promise; it's a hidden decision no one consciously made. The agents building from it inherit the gap.

02 · INTO THE BUILD

The intent never reaches whoever builds it

A ticket says what to build, not what to protect. Whoever builds it, human or agent, never sees the security intent that lived only in the design.

03 · AT THE PR

The shipped code quietly breaks the promise

By the time it's a pull request, the flaw is already shipping, and nothing downstream is checking it against what was intended.

The kind of thing no human review keeps up with.

The Foresight Home dashboard: every PRD scored by risk, with drift, shadow features, and broken promises flagged across the product.

Every design, scored and watched: risk, drift, and broken promises across all your PRDs.

"These are the kind of things you can never track as a human, because you could never keep up with all the PRs scattered across your organization. We now have the ability to."

Arshan Dabirsiaghi, Co-founder & CTO, Pixee

How Foresight catches it before it ships.

One model of your security promises, working from the spec to the pull request.

STEP 01

Read the design

Foresight ingests your PRDs, design docs, and tickets as they're written.

Confluence · Notion · Docs · Jira · Linear

STEP 02

Extract the promises

It pulls the security promises the design makes — explicit and implicit — and the misuse cases teams could never write at scale.

STEP 03

Propagate to tickets

It pushes those promises into your tickets, so engineering sees them at planning, not after launch.

DevRev · Linear · Jira

STEP 04

Flag the drift

When a PR lands, Foresight checks the shipped code against the design's promises and flags what broke — with a direct line back to the spec that promised it.

Example finding — a promise that drifted

Design: magic-link tokens are single-use

Shipped PR: token accepted on repeated requests

Foresight: ⚑ drift, a broken promise tied back to the design line

Every finding ties to a spec line and a threat, logged as an auditable record — not another alert.
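The single-use promise in the example finding can be sketched in code. This is a hypothetical illustration in Python of what honoring that promise might look like, not Foresight's implementation and not code from any real product. The MagicLinkTokens class, its method names, and the 15-minute default lifetime are all assumptions.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class MagicLinkTokens:
    """Hypothetical in-memory token store that enforces single use."""

    def __init__(self, ttl_seconds: float = 900.0) -> None:
        self._ttl = ttl_seconds
        # token -> (email, expiry on the monotonic clock)
        self._pending: Dict[str, Tuple[str, float]] = {}

    def issue(self, email: str) -> str:
        # Generate an unguessable, URL-safe token for the login link.
        token = secrets.token_urlsafe(32)
        self._pending[token] = (email, time.monotonic() + self._ttl)
        return token

    def redeem(self, token: str) -> Optional[str]:
        # pop() removes the token on first use, so a repeated
        # request with the same token finds nothing and is rejected.
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        email, expires_at = entry
        if time.monotonic() > expires_at:
            return None
        return email
```

The drifted version in the finding would be a redeem step that looks the token up without removing it. That code is valid and passes a pattern-based scan, yet it breaks the design's promise, because the same token keeps working on repeated requests.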

See Foresight read your spec →

Prevent what you can. Remediate what you can't.

VulnOps: Triage & Fix handles the code you have. Foresight secures the design. One context graph powers both.

The Pixee platform as an infinity loop: Foresight (proactive · the design) on the left and Triage & Fix (reactive · the code you have) on the right, each feeding one context graph at the center.

VulnOps results

76%

fix-merge rate

95%

fewer false positives

The average org carries 569,354 open alerts. Foresight helps stop design-stage risk from becoming tomorrow's backlog.

It runs on your model — Azure OpenAI, OpenAI, or Anthropic. Your tenancy, your data boundary, no lock-in.

The Pixee Context Graph as four stacked layers: Raw Context (what exists), Process Context (what should happen), Kinetic Context (what's actually exploitable), and Human Feedback Context (what your team trusts).

The Context Graph

A private, per-customer model of your codebase, scanners, conventions, history, and architecture. Never shared, never used to train shared models. It's why Triage & Fix reads exploitability right for your deployment, and why Foresight knows what each design is supposed to protect.

Security doesn't start at the pull request. It starts at the design.

Enterprises already running Pixee

The objections you're already forming.

Won't this just flood my team with another queue of AI "findings"?

That's the failure mode we built against. The design stage is a handful of decisions, not a thousand code patterns — and every finding ties to a spec line you confirm or dismiss in one click. An audit trail, not another alert stream.

Couldn't I just ask Claude to threat-model my PRD?

For one PRD, yes — and you should. But a prompt only reviews the spec you remembered to paste in; Foresight reads every spec as it's written, tracks the promises into your tickets, and checks shipped code at PR time. A threat model you run once versus one that runs as a system.

Aren't most of these just bugs a scanner would catch?

No — that's the point. Half of security defects are design flaws, not bugs (IEEE): a decision like data exported with no encryption, or tokens that should be single-use but aren't. No pattern, no CVE — the code does exactly what the design told it to.

How does it fit what we already run — and our model?

Foresight reads whatever describes intent — Jira, Linear, Confluence, design docs — and runs on your own Azure OpenAI, OpenAI, or Anthropic instance, no lock-in. It doesn't replace your scanners or in-editor assistants; it's the design-stage layer they can't reach.

How is this different from Triage & Fix?

Triage & Fix is reactive: it cuts your backlog to what's exploitable and ships fixes developers merge (76% fix-merge rate). Foresight is proactive: it secures the design so the backlog never forms. One platform, one context graph; most teams run both.

The next decade of security gets won upstream, at the speed of agents.

See Foresight read one of your designs.

Bring a real spec. We'll show you the promises it makes, the threats it misses, and where last sprint's code already drifted.

No credit card · ~30-minute working session · Your spec stays confidential

Book a walkthrough →