Executive Summary
In late 2025, researchers uncovered a major vulnerability affecting leading AI providers, demonstrating that prompt injection using poetic phrasing can universally bypass safety alignment in large language models (LLMs). By translating malicious prompts into poetic verse and feeding them into 25 major proprietary and open-source LLMs, adversaries were able to achieve jailbreak attack success rates above 60% in many cases—far surpassing previous methods. This attack allowed models to generate outputs associated with high-risk domains, such as cyber-offense and weaponization, despite existing refusal mechanisms. The incident raises urgent concerns about the robustness of current model alignment and evaluation frameworks and exposes fundamental gaps in LLM safety design.
This discovery is particularly significant as LLMs are now widely adopted across industries and critical sectors. The poetic technique's ability to systematically defeat existing safeguards highlights the evolving risks of adversarial prompt engineering and threatens AI-dependent workflows, regulatory compliance, and trust in intelligent automation.
Why This Matters Now
As generative AI adoption accelerates, the emergence of universal jailbreaks like adversarial poetry demonstrates that even well-aligned LLMs remain vulnerable to simple, scalable prompt manipulation. Organizations relying on AI for regulated or sensitive tasks face immediate risks, while developers and policymakers must rapidly adapt defenses and standards against evolving adversarial input tactics.
Attack Path Analysis
An adversary initiated prompt injection attacks by submitting specially crafted adversarial poetry to large language models (LLMs) through exposed AI/ML interfaces. Once initial access was gained, they sought to escalate privileges by leveraging model misalignment to bypass safety mechanisms and potentially execute unauthorized actions. Lateral movement may have occurred through compromised service-to-service or workload-to-workload communication within the cloud environment. The adversary established command and control by maintaining interaction with the LLM or exfiltrating sensitive outputs. Data exfiltration then took place, with harmful or restricted information being extracted from the model. Ultimately, this resulted in the impact of safety guideline circumvention, unauthorized disclosure, and possible loss of control over cloud AI resources.
Kill Chain Progression
Initial Compromise
Description
Attacker submitted adversarial poetic prompts to exposed AI/ML inference endpoints, exploiting vulnerable prompt-handling logic in LLMs.
MITRE ATT&CK® Techniques
Phishing
User Execution
Prompt Engineering
Impair Defenses
Indicator Removal on Host
Model Evading Filters
Stage Capabilities
Potential Compliance Exposure
Mapping incident impact across multiple compliance frameworks.
PCI DSS 4.0 – Security Awareness Training
Control ID: 12.6.1
NYDFS 23 NYCRR 500 – Cybersecurity Policy
Control ID: 500.03
DORA – ICT Risk Management Framework
Control ID: Article 5
CISA Zero Trust Maturity Model (ZTMM) 2.0 – Asset Management – AI/Machine Learning Governance
Control ID: 3.2.2
NIS2 Directive – Cybersecurity Risk Management and Reporting
Control ID: Article 21
Sector Implications
Industry-specific impact of the vulnerabilities, including operational, regulatory, and cloud security risks.
Computer Software/Engineering
AI/ML prompt injection vulnerabilities through poetry expose software applications using LLMs to jailbreaking attacks, compromising safety mechanisms and user trust.
Computer/Network Security
Poetry-based prompt injection represents fundamental limitation in current alignment methods, requiring new detection capabilities and security evaluation protocols for AI systems.
Financial Services
LLM jailbreaking through poetic prompts threatens financial AI applications, potentially enabling manipulation attacks and bypassing compliance controls in automated systems.
Health Care / Life Sciences
Healthcare AI systems vulnerable to poetry-based prompt injection could expose CBRN risks and compromise patient safety through manipulated medical AI responses.
Sources
- Prompt Injection Through Poetryhttps://www.schneier.com/blog/archives/2025/11/prompt-injection-through-poetry.htmlVerified
- Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Modelshttps://arxiv.org/abs/2511.15304Verified
- Poems Can Trick AI Into Helping You Make a Nuclear Weaponhttps://www.wired.com/story/poems-can-trick-ai-into-helping-you-make-a-nuclear-weapon/Verified
- Prompt injection attacks might 'never be properly mitigated' UK NCSC warnshttps://www.techradar.com/pro/security/prompt-injection-attacks-might-never-be-properly-mitigated-uk-ncsc-warnsVerified
Frequently Asked Questions
Cloud Native Security Fabric Mitigations and ControlsCNSF
Applying Zero Trust segmentation, robust egress policy enforcement, and continuous threat detection would have limited initial exposure, restricted attacker movement, and made exfiltration or misuse of model outputs observable and preventable. Microsegmentation and inline policy controls can constrain the blast radius of successful AI/ML prompt injection attempts.
Control: Zero Trust Segmentation
Mitigation: Unauthenticated or unapproved prompt sources are blocked from reaching LLM endpoints.
Control: Threat Detection & Anomaly Response
Mitigation: Anomalous prompt activity and abuse of model logic is flagged and halted in real time.
Control: East-West Traffic Security
Mitigation: Unauthorized in-cloud movement to other workloads or microservices is prevented.
Control: Cloud Native Security Fabric (CNSF)
Mitigation: Inline policy enforcement and real-time inspection detect abnormal interaction patterns.
Control: Egress Security & Policy Enforcement
Mitigation: Unauthorized exfiltration of sensitive model data is blocked.
Security teams receive high-fidelity alerts and visibility into attempted or successful policy violations.
Impact at a Glance
Affected Business Functions
- Content Moderation
- Customer Support
- Automated Decision-Making
Estimated downtime: N/A
Estimated loss: N/A
Potential for unauthorized access to sensitive information through manipulated AI responses.
Recommended Actions
Key Takeaways & Next Steps
- • Enforce identity-based microsegmentation for all AI/ML inference endpoints, blocking unauthorized prompt sources.
- • Deploy egress filtering and inline content inspection to prevent policy-violating outputs from being transmitted externally.
- • Implement continuous threat detection with baselining to identify anomalous prompt injection or model behavior in real time.
- • Extend east-west traffic segmentation between all cloud workloads to prevent lateral movement following an initial compromise.
- • Centralize logging and incident visibility across multicloud environments to accelerate detection and response to AI/ML abuse.



