Updated 19 February 2026 at 12:10 IST

New Study Finds Claude Pushes Back, Gemini and DeepSeek Cave In: How AI Handles Its Own Lies

Researchers from the Rochester Institute of Technology and the Georgia Institute of Technology developed a framework called HAUNT (Hallucination Audit Under Nudge Trial) to test how large language models (LLMs) behave when confronted with misinformation in “closed domains” such as movies and books.

Follow :  
×

Share


New Study Finds Claude Pushes Back, Gemini and DeepSeek Cave In: How AI Handles Its Own Lies | Image: Republic

New Delhi: We often assume that AI gives us the right answers and many of us rely on it without question. But what happens when the system is nudged with false information? A new study shows that some models stand firm, while others cave in, even agreeing with lies they themselves had previously rejected.

Researchers from the Rochester Institute of Technology and the Georgia Institute of Technology developed a framework called HAUNT (Hallucination Audit Under Nudge Trial) to test how large language models (LLMs) behave when confronted with misinformation in “closed domains” such as movies and books.

How HAUNT Works

The framework runs in three stages:

Generation - The model produces both “truths” and “lies” about a chosen movie or book.

Verification -The same model is asked to fact-check those statements, without knowing which ones it wrote.

Adversarial Nudge -A user presents the lies as if they were true, to see if the model resists or agrees.

Findings: Who Stood Firm, Who Faltered

Claude emerged as the most resilient, consistently pushing back against false claims.

GPT and Grok showed moderate resistance.

Gemini and DeepSeek performed the worst, often agreeing with false statements and even elaborating on scenes that never existed.

The study also uncovered troubling behaviours:

Sycophancy -weaker models praised users for their “favourite” non-existent scenes.

Echo-Chamber Effect -persistent nudging increased false agreement by nearly 29%.

Contradiction -models sometimes failed to reject lies they had earlier identified as false.

While the experiments focused on movie trivia, the researchers warn that similar failures in healthcare, law, or geopolitics could have serious consequences. If AI can be coaxed into echoing fabricated facts, the risk of misinformation spreading through trusted systems grows. As AI becomes more embedded in everyday life, the ability to stand firm against falsehoods may prove just as important as generating correct answers in the first place.

Read More: OpenAI's New Offices in India Announced, But Blog Post Goes Missing

Published By : Priya Pathak

Published On: 19 February 2026 at 12:10 IST