Updated 8 September 2025 at 15:12 IST
OpenAI Confirms GPT-5 Hallucinations: Here’s Why the AI Gives Confidently Wrong Answers
According to OpenAI, the problem isn’t random. It’s rooted in how AI is trained and evaluated. Models are rewarded for accuracy during testing, but the way tests are designed encourages guessing over admitting uncertainty.
- Tech News
- 3 min read

When the AI boom took off, a new word quickly entered the public conversation: hallucinations. People were puzzled: how can software “hallucinate”? Now OpenAI itself is clarifying the picture. In a new research paper and accompanying blog post, the company admits that even its most advanced model, GPT-5, still produces confidently wrong answers, though less often than before.
The company defines hallucinations as plausible but false statements generated by AI models. These errors can look convincing but turn out to be factually wrong, as when a chatbot asked for the title of an author’s PhD dissertation confidently invents three different titles, none of which exist.
Why AI hallucinates
According to OpenAI, the problem isn’t random; it is rooted in how models are trained and evaluated. During testing, models are rewarded for accuracy, but the way benchmarks are scored encourages guessing over admitting uncertainty.
In OpenAI’s words, “errors are worse than abstentions.” Yet most leaderboards value accuracy above all, punishing models that simply say “I don’t know.” This creates the wrong incentive: guessing looks like winning.
GPT-5 vs earlier models
The company’s latest evaluation highlights the tradeoff. On a benchmark called SimpleQA, OpenAI compared two systems: the older o4-mini model achieves slightly higher accuracy but a staggering 75% error rate, while GPT-5-thinking-mini answers only 22% of questions correctly yet makes far fewer errors, because it is better at knowing when not to guess.
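The arithmetic behind that tradeoff is easy to see in miniature. The sketch below is a hypothetical tally, not OpenAI’s evaluation code: only the 75% error rate and the 22% accuracy figure come from the article, and the remaining splits (the guesser’s 25% accuracy, the cautious model’s 26% error and 52% abstention) are illustrative assumptions.

```python
# Hypothetical grading of question-answer attempts (illustrative only).
# Each answer is graded "correct", "wrong", or "abstain" ("I don't know").

def score(grades):
    """Return accuracy, error, and abstention rates over all questions."""
    n = len(grades)
    return {
        "accuracy": grades.count("correct") / n,
        "error": grades.count("wrong") / n,
        "abstention": grades.count("abstain") / n,
    }

# A model that always guesses: some answers right, many confidently wrong.
guesser = ["correct"] * 25 + ["wrong"] * 75

# A model that abstains when unsure: slightly lower accuracy, far fewer errors.
cautious = ["correct"] * 22 + ["wrong"] * 26 + ["abstain"] * 52

print(score(guesser))   # high error rate despite comparable accuracy
print(score(cautious))  # most of the lost accuracy becomes abstention
```

Measured by accuracy alone the two models look almost identical; measured by error rate, the cautious one is three times safer.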
Why facts are tricky
According to OpenAI, hallucinations also trace back to the way models are trained. Language models learn by predicting the next word in a sentence. This works well for structured patterns like grammar or spelling, but falls apart when dealing with rare or arbitrary facts, such as birthdays or obscure details. Unlike typos, which vanish as models scale up, factual hallucinations are baked into the process.
OpenAI’s proposed fix
OpenAI suggests a simple but crucial shift: update evaluation scoreboards so that confident wrong answers are penalized more than honest uncertainty. In practice, this means rewarding models for saying “I don’t know” rather than making things up.
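One way to picture that shift is a scoring rule in which a confident wrong answer costs more than an abstention. The sketch below is a minimal illustration of the idea, not OpenAI’s actual metric; the penalty value and the models’ answer distributions are assumptions.

```python
def leaderboard_score(grades, wrong_penalty=0):
    """Score a run: +1 per correct answer, 0 per abstention,
    -wrong_penalty per wrong answer. wrong_penalty=0 is the
    accuracy-only scoring most leaderboards use today."""
    return grades.count("correct") - wrong_penalty * grades.count("wrong")

guesser = ["correct"] * 25 + ["wrong"] * 75     # always answers
cautious = ["correct"] * 22 + ["abstain"] * 78  # says "I don't know" when unsure

# Under accuracy-only scoring, guessing looks like winning...
print(leaderboard_score(guesser), leaderboard_score(cautious))  # 25 vs 22

# ...but once wrong answers are penalized, the incentive flips.
print(leaderboard_score(guesser, wrong_penalty=1),
      leaderboard_score(cautious, wrong_penalty=1))             # -50 vs 22
```

With any nonzero penalty on confident errors, the abstaining model comes out ahead, which is exactly the behavior the proposed scoreboards would reward.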
The company pushes back on common myths too. Hallucinations aren’t an unsolvable glitch, and they don’t always require bigger models to fix. Sometimes, smaller models that “know their limits” actually handle uncertainty better.
The bigger picture
For everyday users, the message is clear: GPT-5 hallucinates less than its predecessors, but it isn’t immune. OpenAI acknowledges the issue is fundamental to AI design and insists the path forward is about recalibrating both training methods and how the industry measures success.
Published By : Priya Pathak
Published On: 8 September 2025 at 15:12 IST