SLSA Cleared the Malware. Scanners Missed the Zero-Day. OpenAI Named What Comes Next.

May 12, 2026

Big Picture

Last week, Oracle quietly admitted AI necessitated changes to how it releases patches. This week, three more layers of AppSec instrumentation failed. Then OpenAI launched Daybreak, its response to Mythos in some sense.

The frontier labs continue to push into security in ways that are both promising and noisy. This week OpenAI announced Daybreak, which aims to bring security in more natively and "change the way software is built and defended." The press release does not make clear exactly what that means in practice. What we can say is that clients and prospects constantly ask whether they can just point frontier models at the problem and call it a day. TL;DR: you can't. Our CEO and CTO have both written extensively about the context engineering, harness, and enterprise deployment realities required to make these workflows cost-effective and efficient.

In other news, more compromised npm packages showed yet again that the supply chain is a prime attack vector, Google confirmed that AI was used to find and develop a working zero-day exploit, and MCP insecurity continues to be a problem.

More below.

TL;DR

The supply chain nugget: The Shai-Hulud worm hit 169 npm packages with valid SLSA Build Level 3 attestations. The framework you trusted to filter compromised packages cleared them. (Source: Aikido forensics)
The threshold crossed: Criminals, not state actors, used AI to build a working 2FA bypass against a semantic logic flaw scanners cannot detect. Google GTIG confirmed in-the-wild use. (Source: CSO Online)
The exposure data: 402,599 AI agent hosts plus 1,800+ MCP servers reachable from the internet without authentication. Default configurations, not advanced threats. (Source: Knostic research)

The AI-Generated Zero-Day Is No Longer Theoretical

Google's Threat Intelligence Group confirmed that a cybercrime organization used AI to develop a working zero-day against an open-source web administration tool. The exploit bypassed two-factor authentication by targeting a semantic logic flaw: a developer had hardcoded a trust assumption that contradicted the application's authentication enforcement. Pattern-matching scanners cannot find this vulnerability class. The AI did.

State-backed actors from China and North Korea have used AI for vulnerability discovery for some time. What changed this week is the criminal side. Google's report marks the first confirmed case where a non-state threat actor deployed an AI-assisted exploit in the wild. The flaw class matters: semantic logic errors, authentication bypasses, and hardcoded trust assumptions are what static analysis tools miss, because finding them requires understanding application intent, not just code syntax.
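To make the flaw class concrete, here is a minimal Python sketch of a hardcoded trust assumption that contradicts 2FA enforcement. It is not the code from the affected tool, which is not reproduced here. The Session fields, the function names, and the address-based shortcut are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Session:
    password_ok: bool
    totp_ok: bool
    client_ip: str


# Hypothetical assumption: requests from this address always come from the
# local admin console, so the second factor is skipped for them.
TRUSTED_IP = "127.0.0.1"


def is_authenticated_flawed(s: Session) -> bool:
    # Every line here is syntactically unremarkable. The bug is that the
    # shortcut contradicts the policy "every login requires a second factor".
    if s.password_ok and s.client_ip == TRUSTED_IP:
        return True
    return s.password_ok and s.totp_ok


def is_authenticated_fixed(s: Session) -> bool:
    # Enforce both factors unconditionally, with no origin-based exemption.
    return s.password_ok and s.totp_ok
```

Nothing in the flawed version matches a dangerous-function or tainted-input signature. Spotting it requires knowing the intended policy and noticing that one branch violates it, which is the kind of reasoning pattern-based scanners do not perform.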

Takeaways

The detection gap this creates is specific. AI identifies vulnerabilities through logic analysis that produces zero scanner alerts before exploitation. If your team's signal problem comes from scanners flagging too many low-confidence findings, the inverse problem is now confirmed: the findings that matter most may not be in the queue at all.

SLSA Level 3 Passed Malware Through

The Shai-Hulud campaign produced the first npm supply chain attack with valid SLSA Build Level 3 attestations, compromising 169 packages across TanStack, Mistral, Squawk, and UiPath. The attack chained a GitHub Actions "Pwn Request," cache poisoning, and OIDC token extraction from runner memory. The resulting packages were cryptographically signed by legitimate build infrastructure. Downstream consumers and dependency scanners had no basis to reject them.

Aikido's forensics team traced the credential harvest to CI/CD secrets, npm tokens, and GitHub API keys from any machine that ran a postinstall script. The payload executed automatically on npm install. The same week, TeamPCP published a backdoored Checkmarx Jenkins AST plugin to the Jenkins Marketplace, and a malicious Hugging Face repository impersonating OpenAI's Privacy Filter accumulated 244,000 downloads before removal.
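Because provenance attestations describe where a package was built rather than what it does at install time, one complementary check is to list which installed packages declare install-time lifecycle scripts at all. The sketch below is an illustration under assumptions, not a detection for Shai-Hulud specifically; the function name and return shape are hypothetical.

```python
import json
from pathlib import Path

# npm lifecycle hooks that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_scripts(node_modules: Path) -> dict[str, dict[str, str]]:
    """Map each package directory to the install-time scripts it declares."""
    findings: dict[str, dict[str, str]] = {}
    for manifest in node_modules.rglob("package.json"):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest; skip it
        scripts = data.get("scripts") if isinstance(data, dict) else None
        if not isinstance(scripts, dict):
            continue
        hooks = {k: v for k, v in scripts.items() if k in INSTALL_HOOKS}
        if hooks:
            findings[str(manifest.parent.relative_to(node_modules))] = hooks
    return findings
```

Calling find_install_scripts(Path("node_modules")) in a project directory returns the packages worth a manual look. Many legitimate packages use these hooks, so the output is a review list, not a verdict. Where builds tolerate it, running npm install with --ignore-scripts prevents the hooks from executing in the first place.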

Takeaways

Three attack vectors, three supply chain layers, one week. Package repositories, CI/CD tooling, and AI model repositories all fell in the same news cycle. The Shai-Hulud case invalidates a core SLSA assumption: that trustworthy build infrastructure means trustworthy attestations.

LiteLLM Owned in Three HTTP Requests at Pwn2Own Berlin

A researcher chained three bugs in LiteLLM v1.82.3-stable from internal_user API key to root RCE in three HTTP requests at Pwn2Own Berlin 2026, winning a $40K bounty. The chain matters more than the bounty. LiteLLM is the proxy layer in front of frontier models for many of the enterprises currently shipping AI features. If you wrapped GPT-5 or Claude behind LiteLLM to centralize keys, rate-limit, or route across providers, the path from "intern with an API key" to "root on the proxy host" is now three HTTP requests.

Two of the three bugs are server-side template injection (SSTI) in template-rendered fields that an internal-user-tier token can write to. The third is path traversal in a logging endpoint. CWE-94, CWE-22, CWE-78: pattern-matchable, indexable, the kind of bug a competent SAST run should have caught before merge. That they shipped in a release tagged "stable" suggests AI infrastructure projects are as velocity-driven as early web frameworks were in 2009: feature shipping has outrun the security review baseline that mature web frameworks now enforce by default.
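For readers less familiar with the path traversal class (CWE-22), the usual guard looks roughly like the Python sketch below. This is not LiteLLM's code or its fix; resolve_inside and its parameters are hypothetical names used only to show the pattern.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_supplied: str) -> Path:
    """Resolve a user-supplied path, refusing anything outside base_dir."""
    base = base_dir.resolve()
    # Joining first and resolving afterwards collapses ".." segments and
    # symlinks; an absolute user_supplied value replaces base entirely,
    # which the containment check below also rejects.
    candidate = (base / user_supplied).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes {base}: {user_supplied!r}")
    return candidate
```

The check runs on the resolved path rather than on the raw string, so inputs such as "a/../../b" are rejected even though they do not start with "..". The absence of a containment check like this before a file open is the sort of pattern a SAST rule can match.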

Takeaways

No confirmed in-wild exploitation yet. The Pwn2Own writeup is public and the exploit path is documented. Treat the gap between "demonstrated at a research venue" and "scripted into a botnet" as days, not weeks.

402,599 AI Agent Hosts Running Without Authentication

Capsule Security's internet scan found 402,599 AI agent hosts directly reachable from the public internet across 36 services, with no authentication requirement. A concurrent Knostic analysis found 1,800+ Model Context Protocol servers exposed without authentication. Both findings document the same pattern: AI agent deployment velocity is outrunning basic security configuration.

AI agent framework designers did not build in the security-by-default assumptions that shaped web application deployment over the last decade. MCP launched without authentication in the base spec, and organizations that deployed quickly inherited that default. The result is an attack surface that is public-facing, broadly scoped, and largely undocumented in most security inventories.
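One way to start closing the inventory gap is to audit whatever record you already keep of agent and MCP listeners for the combination that matters: reachable beyond loopback and no authentication. The Python sketch below assumes a hypothetical inventory format (the Listener fields are invented for illustration) and only inspects bind addresses; it does not probe the network or speak MCP.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Listener:
    name: str
    bind_address: str  # a literal IP such as "127.0.0.1" or "0.0.0.0"
    auth_required: bool


def is_loopback_only(bind_address: str) -> bool:
    # Raises ValueError for hostnames such as "localhost"; this sketch
    # expects literal IPv4 or IPv6 addresses.
    return ipaddress.ip_address(bind_address).is_loopback


def exposed_without_auth(listeners: list[Listener]) -> list[str]:
    """Names of listeners reachable beyond loopback with no authentication."""
    return [
        item.name
        for item in listeners
        if not item.auth_required and not is_loopback_only(item.bind_address)
    ]
```

A listener bound to "0.0.0.0" or "::" accepts connections on every interface, so whether it is actually internet-reachable still depends on firewalls and network placement. The list this produces is a starting point for review, not proof of exposure.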

Linux kernel maintainers proposed a parallel response to a different flavor of the same problem: an emergency runtime "killswitch" to disable vulnerable kernel functions while patches are built and distributed. The prompt was Dirty Frag (CVE-2026-43284, CVE-2026-43500), a privilege escalation pair under active exploitation across Ubuntu, RHEL, CentOS, and OpenShift. Both the MCP exposure data and the killswitch proposal reflect the same operational gap: systems deploy faster than security configuration follows, and exploitation concentrates in that window.

Takeaways

Vulnerabilities in the Wild

Actively Exploited

CVE-2026-43284 + CVE-2026-43500 — Dirty Frag — Linux Kernel — Privilege Escalation

Local → root via chained xfrm page-cache write and RxRPC memory fragment flaws. Microsoft confirmed in-the-wild exploitation. Exploit entry points include SSH, web shells, container escape, and low-privilege processes. Affected: Ubuntu, RHEL, CentOS, AlmaLinux, OpenShift. Patch and reboot. Linux kernel maintainers are evaluating a runtime killswitch as an interim option.

Supply Chain Incidents (Active — Rotate Credentials)

Shai-Hulud npm worm — 169 packages compromised — TanStack, Mistral, Squawk, UiPath

GitHub Actions OIDC token extraction produced valid SLSA Build Level 3 attestations for malicious packages. Payload auto-executed on npm install. If your environment ran affected packages May 11-12, rotate all CI/CD secrets, npm tokens, and GitHub API keys. Aikido forensics report.
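To triage which repositories need credential rotation, a lockfile check along the following lines can help. It is a sketch under assumptions: the AFFECTED table holds placeholder entries only and must be filled from the published forensics report, and the function name is hypothetical. It relies on the "packages" map that package-lock.json files with lockfileVersion 2 or 3 contain.

```python
import json
from pathlib import Path

# Placeholder entries only. Replace with the names and versions listed in
# the forensics report; these are NOT real indicators.
AFFECTED: dict[str, set[str]] = {
    "example-affected-package": {"1.2.3"},
}


def affected_in_lockfile(
    lockfile: Path, affected: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """Return (name, version) pairs in a package-lock.json matching `affected`."""
    data = json.loads(lockfile.read_text(encoding="utf-8"))
    hits = []
    # Keys are install paths such as "node_modules/@scope/name"; nested
    # dependencies appear as "node_modules/a/node_modules/b".
    for path, meta in data.get("packages", {}).items():
        if "node_modules/" not in path:
            continue  # the "" key describes the root project itself
        name = path.rsplit("node_modules/", 1)[1]
        version = meta.get("version", "")
        if version in affected.get(name, set()):
            hits.append((name, version))
    return sorted(hits)
```

A match shows that the lockfile pins an affected version, not that the payload ran. A clean result does not clear machines that installed an affected version during the window and have since updated, so CI logs and developer workstations still need their own review.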

Checkmarx Jenkins AST Plugin backdoor — supply chain compromise via credential theft

TeamPCP published a backdoored version to the Jenkins Marketplace after stealing credentials from Checkmarx's GitHub repository. Revert to the safe plugin version (see Checkmarx advisory). Rotate all secrets from Jenkins environments where the malicious version ran.

Disclosed This Week

LiteLLM 3-bug RCE chain (CVE-2026-42203): covered in the Deep Dive above.

SAP Commerce Cloud + S/4HANA — Critical RCE and Data Access — May 2026 Security Update

15 CVEs in the May patch bundle, including two critical flaws: one enabling unauthenticated code execution in Commerce Cloud, one enabling data theft in S/4HANA. No active exploits confirmed. Apply within your standard SAP patch cadence. Exploit risk does not currently warrant out-of-cycle deployment.

dnsmasq — 6 new CVEs — Heap overflow, heap corruption, code execution, DNS cache poisoning

Memory safety cluster across DNS response parsing and DHCP handling. Enables cache poisoning, security control bypass, and local privilege escalation. High exposure due to dnsmasq prevalence in routers, IoT, and container networking. CVE numbers pending full disclosure. Update when patches are available.

Curated Reading List

Five links that add to this week's story — not covered in the Deep Dives above.

Primary Source

Google GTIG: AI-Assisted Vulnerability Exploitation and Initial Access (Google Cloud Blog). The actual report behind Deep Dive 1. Hallucinated CVSS scores and educational docstrings were the AI fingerprints. Google's Big Sleep agent (DeepMind + Project Zero) independently discovered the same flaw before mass exploitation. Read the primary source, not the headlines.

Technical Deep Dives

Dirty Frag: Full Exploit Writeup (Hyunwoo Kim, @v4bel). The complete PoC walkthrough for CVE-2026-43284 + CVE-2026-43500. Chains two page-cache corruption bugs (xfrm/ESP + RxRPC) into a deterministic local-to-root with no race condition. Affects kernels back to 2017. If you run Linux in production, read this before your next patch window.

When the Worm Forged Its Own Security Certificate (Lucie Cardiet, Vectra AI). Deep forensics on how Shai-Hulud extracted OIDC tokens from GitHub Actions runner process memory, obtained signing certificates from Sigstore's Fulcio CA, and produced valid SLSA Build Level 3 attestations for malicious packages. The best technical breakdown of why SLSA failed.

Thought-Provoking

Defender's Guide to Frontier AI Impact on Cybersecurity: May 2026 Update (Palo Alto Networks, Unit 42). Unit 42 estimates organizations have 3-5 months before attackers broadly access frontier AI cyber capabilities. They scanned 130+ products and found 75 legitimate vulnerabilities. The clock on the AI offense-defense gap has a number on it now.

AppSec Didn't Need a Faster Way to Find Bugs (Securely Built, Substack). Contrarian take on Daybreak: the problem was never finding more bugs. Scanning tools already produce hundreds of thousands. What defenders actually need is exploitable proof and self-healing environments. The strongest pushback on the "AI finds vulns" narrative this week.


Pixee Research & Long-Form (For When You Have 30 Minutes Instead of 8)

What We Learned Reading 30 Security Reports So You Don't Have To — A 2026 industry meta-analysis. We extracted 155+ quantitative data points from 30 reports across DBIR-class threat intelligence, AppSec/DevSecOps surveys, AI code-security benchmarks, and supply-chain studies. Useful when the cross-vendor disagreement on a number (say, false positive rates or time-to-exploit) is itself the story.

The AI-Generated Zero-Day Is Here. Your Scanner Missed It. — Our deeper analysis of this week's Google GTIG disclosure: why semantic logic flaws fall outside SAST's detection grammar, the four code-pattern categories where this failure mode concentrates, and what an "AI-augmented review" pipeline looks like for the paths your scanner can't reach.

AI-Accelerated Offense: The Defense Gap CISOs Must Close — Why generic AI defense fails against the class of logic-based vulnerabilities this week's zero-day exploited. Background reading for the staff conversation about whether your existing tooling can close this gap or whether it's an architectural one.

Subscribe

Get the next one in your inbox.

AppSec Weekly lands every Tuesday — CVE breakdowns, remediation intel, and the tooling shifts that matter. No fluff. 5 minutes.

20+ editions published
5 min weekly read
Free always

Unsubscribe anytime. No spam.