TL;DR
I helped build the detection era at Splunk. The argument I am making here runs against my own history. But three forces are converging, and the math has decided.
Attackers are industrializing. AI is putting frontier offensive capability into millions of hands at consumer cost. And cloud is insecure by default, by design, because the providers optimized for developer velocity. The bullseye is AI workloads in the cloud, ephemeral, privileged, and rapidly shipped. They are becoming the largest source of breaches in 2026.
Across more than ten thousand organizations, a 6.5x increase in remediation effort produced worse outcomes, not better ones. The percentage of critical vulnerabilities still unresolved at seven days rose from 56% to 63% even as defenders closed 6.5 times more tickets. The median time-to-exploit for the vulnerabilities that decide breaches has moved to negative seven days, meaning attackers operationalize before the catalog records. And 82% of intrusions ride valid credentials, the vector vulnerability management cannot address.
The industry has entered the Containment Era. When prevention fails and detection is too slow, containment decides whether the incident becomes a breach. And right now, that decision is being made hardest, and most consequentially, on AI workloads.
What This Looks Like in Production
In March 2026, a Fortune Global 500 enterprise with 46 billion dollars in revenue, 580,000 employees, and operations in 30 countries was running LiteLLM in their AI development environment. LiteLLM is an AI gateway proxy that, at the time of compromise, was running in roughly 36% of enterprise cloud environments. TeamPCP poisoned the package on PyPI. Approximately 40,000 environments pulled the malicious version before PyPI quarantined it three hours later. Credentials were harvested. Exfiltration to attacker-controlled infrastructure was attempted from thousands of compromised pods worldwide.
At our customer, the four known “command and control” (C2) IP addresses hit a Global IP Blocklist running on the Aviatrix Distributed Cloud Firewall. Compromised Kubernetes pods attempted to reach the C2 endpoints. Every attempt was logged as a DENY and blocked at the network layer. Zero credentials reached the attacker.
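The enforcement described above reduces to a simple policy check executed at the network layer, with no agent on the workload. Here is a minimal, hypothetical Python sketch of that logic. The IP addresses are RFC 5737 documentation addresses, not the actual C2 indicators, and the function is illustrative, not the Aviatrix implementation.

```python
# Hypothetical sketch: evaluating egress attempts against a propagating
# IP blocklist, the way a distributed firewall policy behaves.
# Addresses below are RFC 5737 placeholders, not real IOCs.

BLOCKLIST = {"203.0.113.10", "203.0.113.11", "198.51.100.7", "198.51.100.8"}

def evaluate_egress(dest_ip, blocklist=BLOCKLIST):
    """Return the enforcement verdict for an outbound connection attempt."""
    return "DENY" if dest_ip in blocklist else "ALLOW"

# A compromised pod attempting to reach a listed C2 endpoint is denied
# in the network fabric itself; every attempt is logged as a DENY.
print(evaluate_egress("203.0.113.10"))   # DENY
print(evaluate_egress("93.184.216.34"))  # ALLOW
```

The point of the sketch is the shape of the control, not its sophistication: the decision requires no agent on the pod and no human in the loop, which is why it works on a workload that may only live for seconds.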
The detail that matters most is the honest one. This customer is not yet running default-deny Workload Containment with every workload mapped to an explicit allowlist. They are building toward that posture. What they had was a partial implementation of the architecture, a propagating blocklist policy with no agent on the workload and no human in the response loop. That partial implementation stopped a live supply chain exfiltration in production.
That block could not have happened at a centralized inspection point. LiteLLM runs as a Kubernetes workload, and traffic egresses to the internet through the node’s NAT gateway. The chokepoint never sees that path. And LiteLLM is an AI workload. That is not coincidence. That is the pattern.
The Toxic Combination
Three structural forces are converging, and each one accelerates the others. None of them is the whole story. All three together are the present landscape.
Attackers are industrializing
What we have been calling The Cascade is not a single breach to be branded. It is the visible signature of a criminal economy that has matured to industrial scale. Specialization of roles. Affiliate networks. Mutual-benefit contracts between initial access brokers, credential harvesters, ransomware operators, and money launderers.
In Q1 2026, TeamPCP industrialized credential harvesting across five major software ecosystems in twelve days, formed monetization partnerships with ransomware groups, and mobilized more than 300,000 BreachForums affiliates as a distribution layer. Five ecosystems in twelve days is not the work of a small cell. It is the output of a value chain operating with the throughput of a Fortune 500 supply chain. In the same window, North Korea’s UNC1069 backdoored the Axios npm package, which sees roughly 100 million weekly downloads. LAPSUS$ exfiltrated 2.66 GB from AstraZeneca on stolen credentials. The Vect ransomware group ran an affiliate program at scale. Four independent operations. Similar tradecraft. The same window.
This is the pattern that has been building since SolarWinds in 2020. Log4j expanded the trust surface to every dependency a workload inherits. 3CX showed cascading compromise where one supply chain attack originated from another. XZ Utils showed the adversary manufacturing trust itself, a multi-year social engineering campaign for maintainer access to a foundational compression library, caught by accident, not by detection. Q1 2026 was the moment that pattern became the default operating model for multiple threat actor groups simultaneously.
AI puts frontier offensive capability into millions of hands
In April 2026, Anthropic’s Claude Mythos model autonomously discovered 181 exploitable zero-day vulnerabilities in Firefox during a controlled safety evaluation. The previous generation found two. Days later, Mythos weaponized a 17-year-old FreeBSD kernel vulnerability in approximately four hours with no human guidance. The same week, China’s Z.ai released GLM-5.1, a 754-billion-parameter open-weight model with near-comparable capability under MIT license and no safety constraints.
Mythos is one data point on a curve, not the curve itself. The structural truth is that offensive capability is an emergent property of general AI capability improvement. Every frontier model going forward will have it. Every open-weight model will inherit it within months. Open weights cannot be recalled.
What this changes is not just capability. It is cost. Six months of skilled human labor becomes a few hours of compute. In any system where attacker cost collapses, attack volume scales nonlinearly. A single attacker now has the leverage to run hundreds of campaigns in parallel. The population of actors capable of exploiting zero-days is expanding by orders of magnitude, and the cost per exploit is collapsing toward consumer compute. AI does not create a new attack surface. It guarantees the existing one will be tested by far more hands, far more cheaply, far faster.
Cloud is insecure by default, and that is by design
The major cloud providers ship permissive defaults because friction kills developer velocity, and developers are their primary buyer. AWS security groups allow all outbound traffic. GCP has an implied allow-all egress rule, invisible in the firewall rules list. Azure NSGs default to AllowVNetInBound. AKS, EKS, and GKE all default to unrestricted pod-to-pod communication across the entire cluster. Gartner estimates that only 5 to 20% of enterprises have implemented microsegmentation in any form.
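These permissive defaults are auditable. As a hedged illustration, the following Python sketch flags AWS security groups whose egress rules allow all outbound traffic. The dict shape mirrors what boto3's describe_security_groups returns; in practice you would fetch live groups with boto3 rather than hard-code one, and the group ID here is a placeholder.

```python
# Hedged sketch: flag security groups that permit unrestricted egress,
# the AWS default. Input mirrors the boto3 describe_security_groups shape.

def has_open_egress(security_group):
    """True if any egress rule permits traffic to 0.0.0.0/0 or ::/0."""
    for rule in security_group.get("IpPermissionsEgress", []):
        for r in rule.get("IpRanges", []):
            if r.get("CidrIp") == "0.0.0.0/0":
                return True
        for r in rule.get("Ipv6Ranges", []):
            if r.get("CidrIpv6") == "::/0":
                return True
    return False

# A freshly created security group carries the default allow-all egress
# rule, so it is flagged immediately.
default_sg = {
    "GroupId": "sg-0123456789abcdef0",  # placeholder ID
    "IpPermissionsEgress": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    ],
}
print(has_open_egress(default_sg))  # True
```

Twenty lines of audit code will surface the problem. It will not fix it, because the default regenerates with every new group, which is the argument for enforcement rather than scanning.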
The shared responsibility model placed interior security on the customer. The customer did not accept that responsibility at scale, because the tooling to enforce it did not exist in a form that matched cloud’s shape. The result is a tangled web of interconnected workloads, most of them with direct internet egress and zero inspection. Most enterprises cannot map the communication graph of their own cloud, let alone govern it.
This is the leg of the stool the defender controls. You cannot stop attackers from industrializing. You cannot recall AI capabilities. You can change your cloud’s default posture. Containment is how.
Why each leg makes the others worse
The three forces are not additive. They are multiplicative. AI lowers attacker cost, which lets more actors enter, which is the precondition for industrialization. Industrialization specializes in cloud-native tradecraft, which targets the insecure-by-default surface. The insecure-by-default surface is so large that even untargeted attacks at AI economics find paths. Each leg accelerates the others. That is what makes it toxic.
Why AI Workloads Are the Bullseye
Read the 2026 breach data and one pattern dominates. The bullseye is AI workloads in the cloud.
LiteLLM, the AI gateway proxy, was the headline of The Cascade. Three weeks later, the Bitwarden CLI compromise carried a payload that specifically enumerated the configuration directories of Claude, Cursor, Codex CLI, and Aider, treating developer AI assistants as a concentrated source of cloud and repository credentials. GrafanaGhost weaponized Grafana’s AI assistant to silently exfiltrate data through an authorized rendering channel, with no anomalous signal for any detection tool to find. The Vercel breach moved through Context.AI, a third-party AI productivity tool. Cursor saw three malicious npm packages compromise more than 3,200 developer workstations. Anthropic’s MCP architecture had a “by design” remote code execution (RCE). A systemic MCP flaw exposed 200,000 servers and 150 million downloads of AI agent frameworks, IDE extensions, and automation tooling.
This is not coincidence. It is structure. AI workloads concentrate risk for three reasons.
They are ephemeral. A container may live for sixty seconds. A serverless inference function may live for three seconds. There is no time to install an agent, baseline behavior, or run an identity governance evaluation cycle. The defender’s traditional toolset cannot reach them.
They are highly privileged. AI agents need broad cross-service access to do their jobs. They call other services. They reach data stores. They communicate outbound at unprecedented scale. When a workload that holds an over-scoped non-human identity is compromised, the attacker inherits its identity by construction, and that identity can often be exfiltrated and replayed elsewhere. Machine identities now outnumber human identities by 144 to 1 in the average enterprise cloud, a ratio that has grown 56% in the last twelve months. In advanced cloud-native environments, Sysdig reports it reaches 40,000 to 1.
Finally, AI solutions are rapidly shipped. Every enterprise is racing an AI roadmap. Every business unit is standing up RAG pipelines, agentic frameworks, MCP-connected tools, and inference endpoints. The maturity gap between deployment velocity and security review is the widest it has been in twenty years.
The world is over-rotated on AI. The greatest concentration of risk is also AI. The architecture that defends AI workloads is the architecture that defends cloud workloads generally. Containment is not AI security. It is the architecture AI workloads happen to need most urgently, because they sit at the intersection of all three legs of the toxic combination.
The Math Has Already Decided
For fifteen years, the cybersecurity industry spent more than 200 billion dollars building tools to answer one question. Can we detect the attack before it causes damage? That question assumed the attack would look different from normal work. The toxic combination broke that assumption.
The Vulnerability Deficit Equation, developed in our foundational research and applied to the priority debate in Paper 4 of the public series, formalizes why detection and remediation alone cannot close the gap. The stock of exploitable, unpatched vulnerabilities in any environment grows with discovery and with new code, shrinks with effective remediation, is partly replenished by the small fraction of patches that introduce new defects, and sits on top of an unpatchable surface of misconfigurations and architectural design choices that no scanner can fix.
V(t) = V(t-1) + D(t, C(t)) - R_eff(t) + f(R(t)) + M(t)
Every force driving discovery is compounding, and AI is the accelerant. The global codebase is expanding at an accelerating rate. Modern applications are 80 to 90% third-party code. Dependency chains amplify every flaw across thousands of applications. Every force constraining remediation is linear and bounded by organizational execution. The CISA and Qualys dataset of 1.1 billion remediation records across more than 10,000 organizations is the empirical proof. A 6.5x increase in effort produced worse outcomes, not better. The ceiling is not theoretical.
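The dynamic of the Vulnerability Deficit Equation can be simulated in a few lines. This is an illustrative sketch only: every coefficient below is a made-up assumption chosen to show the shape of the curve, not a calibrated model of the CISA and Qualys data.

```python
# Illustrative simulation of the Vulnerability Deficit Equation.
# Discovery D(t) compounds while effective remediation R_eff(t) is capped
# by organizational capacity. All parameters are assumptions.

def vulnerability_deficit(periods=10,
                          d0=100.0,          # initial discovery rate
                          growth=1.15,       # compounding discovery growth
                          r_cap=120.0,       # linear remediation ceiling
                          defect_rate=0.03,  # fraction of patches adding flaws
                          m=5.0):            # unpatchable surface per period
    """Return the stock V(t) of exploitable, unpatched vulnerabilities."""
    v, d, history = 0.0, d0, []
    for _ in range(periods):
        r_eff = min(r_cap, v + d)                 # cannot fix more than exists
        v = v + d - r_eff + defect_rate * r_eff + m
        history.append(v)
        d *= growth                               # discovery compounds
    return history

curve = vulnerability_deficit()
# Once compounding discovery passes the linear remediation cap,
# the deficit grows every period despite maximum remediation effort.
print(curve[-1] > curve[0])  # True
```

The structural point survives any reasonable choice of parameters: an exponential inflow against a bounded linear outflow diverges, which is why more tickets closed can coexist with a worse deficit.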
For the vulnerabilities that actually decide breaches, the foundational sequence of vulnerability management, discover, disclose, patch, deploy, is broken at step one. CISA KEV data shows a median time-to-exploit of negative seven days. The catalog records exploitation a week after attackers have already operationalized it.
And vulnerabilities are not the dominant vector. 82% of intrusions in 2026 ride valid credentials through legitimate channels. Compromised credentials. Rogue employees. Negligent employees. SaaS supply chain trust handed to an attacker by a third party your vendor uses. Vulnerability management cannot reach any of this. There is no scanner for a phished employee, an insider recruited over Telegram, or a Context.AI to Vercel OAuth pivot. The vector that decides most breaches is structurally outside the scope of patching, scanning, and detection.
Translation. You cannot patch fast enough to make patching the primary defense. You cannot detect what looks legitimate. The only remaining lever is to govern every workload communication path, bounding the blast radius before the next compromise occurs.
I do not say this lightly. I spent most of my career building the detection era at Splunk. The argument I am making runs against my own history. The math is the math, and the people who wrote the SANS Mythos Report deserve a rigorous response. Paper 4 in our series has an Honest Boundary section that separates what the math proves from what is predicted. The structural claim, that containment must be elevated above continuous patching as a priority, rests entirely on what is proven. That is the standard the Mythos authors set, and it is the standard I want to meet.
The Fork
The dominant industry response to the toxic combination has been Chokepoint Security, routing traffic through a centralized inspection appliance and inspecting at the bottleneck. That model worked in the data center. In cloud, it fails by construction. Kubernetes pods egress through the node’s NAT gateway. Serverless functions egress through the provider’s native NAT. East-west traffic moves directly between VPCs through peering. The chokepoint never sees the traffic that determines the outcome.
This is The Fork. One path leads to more tools, more dashboards, more scanning, and the same structural deficit. The other leads to containment, architectural enforcement at the workload, on every path the workload can take.
From the Detection Era. Can we detect the attack before it causes damage? Posture scanning, vulnerability management, centralized inspection. Blast radius is unbounded, and you discover it after the fact.
To the Containment Era. How far can the attacker reach when detection misses? Communication Governance at every workload. Blast radius is architecturally bounded before the attacker ever lands.
The Containment Era
The Containment Era does not replace detection. It completes it.
The Perimeter Era from 1995 to 2010 placed enforcement at the network edge. Cloud dissolved that boundary. The Detection Era from 2010 to 2026 moved enforcement inside, on the thesis that you find threats fast once they are in. The toxic combination has broken that model. In the Containment Era, enforcement moves to every workload. The operating assumption is that something in your environment is compromised right now. The question that matters is what the blast radius is when it runs.
Containment is the architectural enforcement of explicit communication policy at every workload, governing what it can and can't reach, at the granularity of workload identity and protocol, on every path available to it, independent of whether a compromise has been detected.
Every clause is load-bearing. Enforcement at every workload, not at a centralized appliance. Identity at L7, not IP addresses. Every path governed, including the paths that bypass centralized inspection. Always on, independent of detection. This is Workload Containment, the architectural state that bounds what incident response needs to clean up. When every workload has explicit communication boundaries, every communication outside those boundaries is anomalous by definition. Contain first. Detect within the contained space. Eliminate with precision.
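The "anomalous by definition" property falls directly out of a default-deny allowlist. The sketch below is a minimal, hypothetical illustration of that evaluation, keyed by workload identity and protocol rather than IP address; the workload names and policies are invented for the example.

```python
# Minimal sketch of default-deny Workload Containment: each workload has
# an explicit allowlist of (destination identity, protocol) pairs, and any
# flow outside it is denied. Workloads and policies are hypothetical.

ALLOWLIST = {
    "payments-api":  {("ledger-db", "tcp/5432"), ("auth-svc", "https")},
    "rag-pipeline":  {("vector-db", "tcp/6333")},
}

def verdict(src, dst, proto):
    """DENY anything not explicitly allowed for this workload identity."""
    allowed = ALLOWLIST.get(src, set())   # unknown workloads get nothing
    return "ALLOW" if (dst, proto) in allowed else "DENY"

print(verdict("payments-api", "ledger-db", "tcp/5432"))      # ALLOW
print(verdict("rag-pipeline", "attacker-c2.example", "https"))  # DENY
```

Because the policy is total, detection inside the contained space becomes trivial: any DENY log line is, by construction, a workload doing something outside its declared boundary.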
Containment is the only control that holds equally against all three legs of the toxic combination. It does not care whether the workload was compromised through an industrialized supply chain attack, an AI-discovered zero-day, or a phished credential. It cares what the workload can reach.
What Comes Next
The Containment Era demands an architectural solution. Enforcement embedded in the cloud fabric. Operating inline at every workload. Propagating a single policy across every cloud provider in sub-second time. Governing blast radius whether or not detection has fired.
That is what the Cloud Native Security Fabric delivers, the Containment Platform the era requires.
We have published a four-paper series that defines the era, the architecture it requires, and the evidence behind both.
Paper 1: The Containment Era. Why the threat model outgrew the security architecture.
Paper 2: The Containment Platform. The formal definition of containment and the five testable properties any architecture must deliver.
Paper 3: 144 to 1. Why every workload in your cloud is already exposed, with the full LiteLLM case study.
Paper 4: The Priority Inversion. The Vulnerability Deficit Equation and the mathematical evidence that the SANS Mythos Report has the priority order wrong.
Read the papers. Stress-test the equation. Tell me where I am wrong.
Take a free Workload Attack Path Assessment to find the hidden attack paths between your workloads.
Explore the Cloud Threat Command Center, free for every CISO, board-ready report in five minutes.
Read how a Fortune Global 500 enterprise stopped the LiteLLM attack.
One question for every CISO reading this. If a valid credential is used against one of your AI workloads at 3:00 AM on a Sunday, what, architecturally, prevents it from reaching the ledger? If the answer depends on a detection tool firing first, you do not have containment. You have hope.
Frequently Asked Questions
What is the Containment Era?
The Containment Era is the next stage of cloud security following the Detection Era. The Detection Era was about finding threats fast once they were inside the network; the Containment Era is about communication governance, controlling what workloads can reach so security teams can bound blast radius.
Why are AI workloads the primary target?
AI workloads concentrate risk in three ways. First, they are ephemeral, often lasting only seconds, which means traditional security agents can't be installed or baselined in time. Second, they are highly privileged, with broad access to other services and data stores. Machine identities now outnumber human identities 144 to 1 in average enterprise clouds. Third, they are shipped fast. Every company is racing to deploy AI, and security reviews can't keep pace with deployment speed. This combination makes AI workloads the ideal entry point for attackers.
What is the Vulnerability Deficit Equation?
The Vulnerability Deficit Equation is a formula that shows why patching and remediation alone can't close the security gap. It calculates how the stock of exploitable, unpatched vulnerabilities in any environment changes over time. New vulnerabilities are constantly added through discovery, new code, and expanding dependency chains, and AI is accelerating that growth. Meanwhile, remediation capacity is linear and bounded by organizational execution. Some patches even introduce new defects, and misconfigurations can't be patched at all. Analysis of 1.1 billion remediation records confirmed the math: a 6.5x increase in effort produced worse outcomes, not better.
Why can't organizations simply patch faster?
The data shows that a 6.5x increase in remediation effort across 10,000+ organizations actually produced worse outcomes. Critical vulnerabilities unresolved after seven days rose from 56% to 63%. Attackers now exploit vulnerabilities a median of seven days before they even appear in official catalogs. On top of that, 82% of intrusions use valid credentials, a vector that no scanner or patch can address. The math shows that discovery of new vulnerabilities is compounding while remediation capacity remains linear and bounded.
How is Workload Containment different from traditional network security?
Traditional approaches route traffic through a centralized inspection point. In cloud environments, this fails because Kubernetes pods, serverless functions, and east-west traffic between VPCs bypass that chokepoint entirely. Workload Containment enforces explicit communication policies at every individual workload, governing what each one can reach and what can reach it. It operates based on workload identity and protocol rather than IP addresses. This means that even if an attacker compromises a workload, the blast radius is architecturally bounded before any detection tool needs to fire.