All Major LLMs Exposed to Multi-Turn Manipulation, Warn Researchers

https://www.infosecurity-magazine.com/news/all-major-llms-exposed-to-multi/

Publish Date: 2026-05-27 09:00:00

Source Domain: www.infosecurity-magazine.com

The safety guardrails of several prominent large language models (LLMs) can be bypassed if a user tricks the LLM into having a multi-pronged, ongoing conversation, researchers at Cisco have warned.

The researchers examined commonly used LLMs and frontier AI models including OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, Amazon Nova, xAI’s Grok and others to test how their built-in safety guardrails held up against potential threats from real-world attackers.

They found that many of the models could be tricked into performing actions they should not be able to perform.

This was achieved by deploying multi-turn conversations: dialogue between the user and the LLM that spans multiple back-and-forth exchanges.

While guardrails in LLMs are designed to prevent users from entering malicious commands, the researchers found that the protections faltered when they engaged the LLMs in conversations and queried the responses.

“Multi-turn evaluation matters for one reason: it is where attackers actually live. Real adversaries iterate. They reframe refusals, decompose tasks across turns, adopt personas, and escalate gradually,” said Cisco.

No Guardrails Completely Safe From Bypass

The research found that no model was completely safe from being exploited by multi-turn-based manipulation of guardrails. Cisco warned that this challenges how enterprises are currently evaluating AI safety and security.

The warning comes at a time when many organizations are rolling out AI and LLMs for use by employees, clients and customers, but are relying on safety benchmarks that misrepresent real-world risk.

The report warned that most LLM safety testing is based on single-prompt evaluation, but attackers don’t stop after one try, and all models were affected when measured by multi-turn attack success rate (ASR).
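The gap between single-prompt and multi-turn testing can be illustrated with a small calculation. The sketch below assumes the usual definition of attack success rate (successful attempts divided by total attempts). The article does not describe Cisco's exact scoring method, so the `Conversation` type, the function names and the sample outcomes are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conversation:
    # Hypothetical record: one flag per turn, True if the guardrail
    # was bypassed on that turn.
    bypassed_per_turn: List[bool]


def single_turn_asr(conversations: List[Conversation]) -> float:
    """Share of conversations in which the first prompt alone succeeded."""
    if not conversations:
        return 0.0
    hits = sum(
        1 for c in conversations
        if c.bypassed_per_turn and c.bypassed_per_turn[0]
    )
    return hits / len(conversations)


def multi_turn_asr(conversations: List[Conversation]) -> float:
    """Share of conversations in which any turn succeeded."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if any(c.bypassed_per_turn))
    return hits / len(conversations)


# Made-up outcomes, not data from the Cisco report.
SAMPLE = [
    Conversation([False, False, True]),   # refused twice, then bypassed
    Conversation([False, False, False]),  # held throughout
    Conversation([True]),                 # bypassed on the first prompt
    Conversation([False, True]),          # bypassed on the second turn
]

if __name__ == "__main__":
    print("single-turn ASR:", single_turn_asr(SAMPLE))
    print("multi-turn ASR:", multi_turn_asr(SAMPLE))
```

On this invented sample, a benchmark that scores only the first prompt would count one success in four conversations, while a multi-turn benchmark would count three in four. That difference is the kind of underestimate the report attributes to single-prompt testing.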

Techniques which enabled researchers to bypass guardrails through multi-turn conversations…
