Executive Summary
In late 2025, cybersecurity researchers disclosed critical remote code execution vulnerabilities in AI inference frameworks from Meta, Nvidia, and Microsoft, as well as in open-source PyTorch projects including vLLM and SGLang. The flaws stem from unsafe use of the ZeroMQ (ZMQ) messaging library combined with Python pickle deserialization of untrusted data, which lets an attacker who can reach an exposed socket execute arbitrary commands on the system running the inference service. The exposure threatens AI infrastructure across major cloud and hybrid environments, raising concerns about data integrity and confidentiality for enterprises deploying advanced machine learning workloads.
This incident underscores a growing trend of supply-chain vulnerabilities in foundational AI technologies, with attackers increasingly targeting interdependent machine learning frameworks. Heightened regulatory pressure and an intensified focus on software supply-chain security add urgency to adopting stronger cryptographic practices and zero trust segmentation in AI environments.
Why This Matters Now
With the rapid adoption of generative AI and machine learning across industries, vulnerabilities in core inference frameworks represent a high-severity supply-chain risk. Addressing insecure serialization and communication channels in widely used AI infrastructure has become urgent as threat actors shift to these emerging attack surfaces.
Attack Path Analysis
Attackers exploited unsafe deserialization in ZeroMQ-based AI inference frameworks to achieve remote code execution and initial access. Leveraging the compromise, they sought to escalate privileges on the targeted system or container. With elevated access, adversaries attempted lateral movement across internal cloud resources and Kubernetes workloads. Attackers established command and control by tunneling outbound communication with covert channels, possibly over unmonitored egress paths. Sensitive data from AI models or underlying infrastructure was exfiltrated through allowed egress routes. Finally, the attack could result in data manipulation or service disruption, affecting AI model integrity and business operations.
Kill Chain Progression
Initial Compromise
Description
Attackers exploited unsafe ZeroMQ and Python pickle deserialization vulnerabilities in exposed AI inference services to gain remote code execution.
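The deserialization weakness described above can be illustrated with a minimal, standard-library-only sketch. It does not reproduce any affected framework's code: the `Marker` class, the message shape with a `prompt` field, and both decode functions are hypothetical, and the payload calls a harmless built-in. The point is that `pickle.loads` runs a callable chosen by whoever produced the bytes, which is what happens when bytes read from a network socket are unpickled (for example through pyzmq's `recv_pyobj()`), whereas a data-only format such as JSON can only yield plain values.

```python
import json
import pickle


class Marker:
    """Hypothetical stand-in for an attacker-built object (harmless payload)."""

    def __reduce__(self):
        # At load time pickle calls the returned callable with these arguments.
        # A real attacker would name something far more dangerous than len().
        return (len, ("attacker-chosen call",))


def decode_unsafe(raw: bytes):
    # Same effect as unpickling whatever arrives on a socket.
    return pickle.loads(raw)


def decode_safe(raw: bytes) -> dict:
    # JSON yields only plain data types; the shape is then validated explicitly.
    msg = json.loads(raw.decode("utf-8"))
    if not isinstance(msg, dict) or not isinstance(msg.get("prompt"), str):
        raise ValueError("unexpected message shape")
    return {"prompt": msg["prompt"]}


malicious = pickle.dumps(Marker())

# The result is whatever the attacker's callable returned, not a Marker.
unsafe_result = decode_unsafe(malicious)

# The same bytes are rejected by the data-only decoder.
try:
    decode_safe(malicious)
    rejected = False
except ValueError:  # also covers UnicodeDecodeError and JSONDecodeError
    rejected = True

benign = decode_safe(json.dumps({"prompt": "hello"}).encode("utf-8"))
```

Swapping the wire format removes the code-execution primitive, but it does not authenticate the sender, so network-level restrictions on who can reach the socket still matter.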
Related CVEs
CVE-2024-50050
CVSS 6.3
A deserialization vulnerability in Meta's Llama Stack allows remote code execution via untrusted data deserialization.
Affected Products:
Meta Llama Stack – < 0.0.41
Exploit Status:
no public exploit
CVE-2025-30165
CVSS 8.0
vLLM's use of ZeroMQ with pickle deserialization allows remote code execution in multi-node deployments.
Affected Products:
vLLM – < 0.8.0
Exploit Status:
no public exploit
CVE-2025-23254
CVSS 8.8
NVIDIA TensorRT-LLM's IPC implementation uses pickle over unsecured channels, allowing local attackers to execute arbitrary code.
Affected Products:
NVIDIA TensorRT-LLM – < 0.18.2
Exploit Status:
no public exploit
CVE-2025-32444
CVSS 10.0
vLLM's Mooncake integration uses pickle over unsecured ZeroMQ sockets, allowing remote code execution.
Affected Products:
vLLM – >= 0.6.5, < 0.8.5
Exploit Status:
no public exploit
CVE-2025-29783
CVSS 10.0
vLLM's Mooncake component deserializes untrusted data using pickle, leading to remote code execution.
Affected Products:
vLLM – < 0.8.0
Exploit Status:
no public exploit
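As a rough aid for triage, the affected-version ranges listed above can be checked against what is installed in a Python environment. The sketch below makes several assumptions: the distribution names (`llama-stack`, `vllm`, `tensorrt-llm`) are guesses at how the packages are published, the upper bound of each range is treated as the first fixed release, and the version parser only looks at leading numeric components, so pre-release and local builds are compared approximately. Vendor advisories remain the authority on exact ranges.

```python
from importlib import metadata

# (distribution name, lowest affected version or None, first fixed version, CVE)
# Ranges are copied from the list above; distribution names are assumptions.
AFFECTED = [
    ("llama-stack", None, "0.0.41", "CVE-2024-50050"),
    ("vllm", None, "0.8.0", "CVE-2025-30165"),
    ("tensorrt-llm", None, "0.18.2", "CVE-2025-23254"),
    ("vllm", "0.6.5", "0.8.5", "CVE-2025-32444"),
    ("vllm", None, "0.8.0", "CVE-2025-29783"),
]


def parse(version: str, width: int = 4) -> tuple:
    """Leading numeric components, zero-padded: '0.8.5.post1' -> (0, 8, 5, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts = parts[:width]
    return tuple(parts + [0] * (width - len(parts)))


def is_affected(installed: str, low, fixed: str) -> bool:
    """True when low <= installed < fixed (low may be None for 'any earlier')."""
    v = parse(installed)
    if low is not None and v < parse(low):
        return False
    return v < parse(fixed)


def matching_cves(installed: dict) -> list:
    """installed maps distribution name -> version string."""
    return [
        cve
        for name, low, fixed, cve in AFFECTED
        if name in installed and is_affected(installed[name], low, fixed)
    ]


def installed_versions() -> dict:
    """Look up each listed distribution in the current environment."""
    found = {}
    for name in {row[0] for row in AFFECTED}:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed here
    return found


if __name__ == "__main__":
    for cve in matching_cves(installed_versions()):
        print("possibly affected:", cve)
```

A match only indicates that the installed version falls in a listed range; whether the vulnerable code path is reachable depends on how the service is deployed (for example, multi-node mode or the Mooncake integration).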
MITRE ATT&CK® Techniques
Exploit Public-Facing Application
Command and Scripting Interpreter
User Execution
Exfiltration Over Alternative Protocol
Credentials from Password Stores
Exploitation of Remote Services
Container Administration Command
Data Manipulation
Potential Compliance Exposure
Mapping incident impact across multiple compliance frameworks.
PCI DSS 4.0 – Address Software Vulnerabilities
Control ID: 6.2.3
NYDFS 23 NYCRR 500 – Cybersecurity Policy
Control ID: 500.03
DORA – ICT Risk Management Framework
Control ID: Article 7(2)
CISA Zero Trust Maturity Model 2.0 – Active Asset Inventory and Software Supply Chain Security
Control ID: Applications Pillar - Asset Management
NIS2 Directive – Cybersecurity Risk Management Measures
Control ID: Article 21
Sector Implications
Industry-specific impact of the vulnerabilities, including operational, regulatory, and cloud security risks.
Computer Software/Engineering
Critical supply-chain vulnerabilities in AI inference frameworks expose software development pipelines to remote code execution through unsafe ZeroMQ and pickle deserialization.
Information Technology/IT
AI infrastructure compromises threaten IT service delivery and cloud deployments, requiring enhanced egress security and threat detection for PyTorch-based systems.
Health Care / Life Sciences
AI inference vulnerabilities risk HIPAA compliance violations in healthcare AI applications, demanding zero trust segmentation and encrypted traffic protection measures.
Financial Services
Banking AI systems face supply-chain attacks targeting inference engines, necessitating multicloud visibility and anomaly detection to prevent regulatory compliance breaches.
Sources
- Researchers Find Serious AI Bugs Exposing Meta, Nvidia, and Microsoft Inference Frameworks – https://thehackernews.com/2025/11/researchers-find-serious-ai-bugs.html
- Critical Security Flaw Identified in Meta’s Llama Framework, Exposing AI Systems to Potential Remote Code Execution – https://vulnera.com/newswire/critical-security-flaw-identified-in-metas-llama-framework-exposing-ai-systems-to-potential-remote-code-execution/
- Meta's Llama Framework Flaw Exposes AI Systems to Remote Code Execution Risks – https://www.gov.mn/en/news/all/ac72f994-283f-409d-b5d3-0881aa59dfa9
- NVD - CVE-2025-30165 – https://nvd.nist.gov/vuln/detail/CVE-2025-30165
- ShadowMQ: Critical Bugs Expose AI Frameworks from Meta, Nvidia, & Microsoft to Remote Code Execution – https://jnrmr.com/shadowmq-critical-bugs-expose-ai-frameworks-from-meta-nvidia-%26-microsoft-to-remote-code-execution.html
- NVIDIA TensorRT-LLM High-Severity Vulnerability Let Attackers Remote Code – https://cybersecuritynews.com/nvidia-tensorrt-llm-high-severity-vulnerability/
Cloud Native Security Fabric (CNSF) Mitigations and Controls
Applying Zero Trust segmentation, workload-to-workload isolation, and egress enforcement would curtail an attacker's ability to move laterally, exfiltrate data, and disrupt operations. CNSF controls—especially microsegmentation, runtime visibility, and cloud-native outbound filtering—provide proactive enforcement and early detection at multiple phases of the attack.
Control: Cloud Firewall (ACF)
Mitigation: Reduces attack surface by blocking unauthorized inbound access to AI services.
Control: Kubernetes Security (AKF)
Mitigation: Constrains privilege escalation potential within pods or namespaces.
Control: Zero Trust Segmentation
Mitigation: Prevents unauthorized lateral movement between cloud workloads.
Control: Egress Security & Policy Enforcement
Mitigation: Detects and blocks unsanctioned outbound C2 traffic from workloads.
Control: Inline IPS (Suricata)
Mitigation: Detects and blocks data exfiltration via known malicious signatures or anomalies.
Enables rapid detection and response to infrastructure tampering or destructive behaviors.
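The segmentation and egress controls listed above are product-level capabilities; as a generic stand-in, the same intent can be sketched with a standard Kubernetes NetworkPolicy, built here as a Python dict and serialized to JSON (which `kubectl apply -f` accepts alongside YAML). Everything specific in the example is hypothetical: the namespace, the `app` labels, and port 5555 as the ZeroMQ port. It also assumes the cluster's network plugin enforces NetworkPolicy and that namespaces carry the automatic `kubernetes.io/metadata.name` label.

```python
import json


def inference_network_policy(namespace, app_label, client_label, zmq_port):
    """Build a networking.k8s.io/v1 NetworkPolicy as a plain dict."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{app_label}-isolate", "namespace": namespace},
        "spec": {
            # Applies to the inference pods only.
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress", "Egress"],
            # Ingress: only the named client pods may reach the ZeroMQ port.
            "ingress": [
                {
                    "from": [
                        {"podSelector": {"matchLabels": {"app": client_label}}}
                    ],
                    "ports": [{"protocol": "TCP", "port": zmq_port}],
                }
            ],
            # Egress: DNS lookups via kube-system only; all other outbound
            # traffic from these pods is dropped by omission.
            "egress": [
                {
                    "to": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": "kube-system"
                                }
                            }
                        }
                    ],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }


# Hypothetical names and port, for illustration only.
policy = inference_network_policy("ml-serving", "vllm-worker", "api-gateway", 5555)
manifest = json.dumps(policy, indent=2)
```

Writing `manifest` to a file gives a manifest that can be reviewed and applied; real deployments would typically need further egress rules for model storage or telemetry endpoints.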
Impact at a Glance
Affected Business Functions
- AI Model Inference
- Data Processing
- Cloud Services
Estimated downtime: 5 days
Estimated loss: $500,000
Potential exposure of sensitive AI models and proprietary data due to remote code execution vulnerabilities.
Recommended Actions
Key Takeaways & Next Steps
- Enforce zero trust segmentation between AI workloads and all adjacent cloud resources to contain initial compromise and lateral movement.
- Apply egress policy enforcement at the cloud perimeter and workload level to block unsanctioned outbound communication and data exfiltration.
- Deploy cloud-native intrusion prevention (such as Suricata IPS) for inline detection of exploitation and exfiltration attempts in real time.
- Integrate continuous Kubernetes and pod security, including namespace enforcement and pod identity policies, to minimize privilege escalation risks.
- Establish comprehensive visibility and threat baselining across multi-cloud and hybrid environments to speed detection and response to novel attack patterns.



