This Week in AI
Atlassian Buys The Browser Company to Build AI Workflows in Your Tabs
Atlassian is acquiring The Browser Company (makers of Arc) for $610 million to bring AI-powered features into the browser itself. The aim is to help users manage tasks, tabs and tools more intelligently, turning the browser into a work assistant that connects context across apps. This move puts Atlassian in direct competition with Microsoft and Google in the battle for the next-gen work platform. View Details
Alterego Launches Silent Speech Wearable for “Mind-to-AI” Communication
Startup Alterego has unveiled a wearable that lets users communicate with machines through internal speech, without speaking aloud. The device picks up neuromuscular signals when users say words in their head and relays the interpreted message back via bone-conduction audio. It’s designed for accessibility, hands-free control and faster interaction with AI systems. Watch the demo
OpenAI is Building a LinkedIn Rival with Built-In Certification
OpenAI is developing a jobs platform aimed at helping users train for, get certified in and find AI-related work, with a particular focus on small businesses and local governments. The goal is to expand economic opportunity by certifying millions of workers in AI skills through a new “OpenAI Academy.” It’s a bold step into career infrastructure, set to compete with platforms like LinkedIn. More details
Google Releases EmbeddingGemma for Lightweight On-Device AI
Google has launched EmbeddingGemma, a multilingual AI model designed to run directly on phones, laptops or embedded devices. It supports over 100 languages and delivers strong performance for search, classification and semantic matching without needing a cloud connection. It’s part of Google’s push to bring powerful AI closer to the edge. Developer blog
Why Language Models Hallucinate: Insights from OpenAI’s New Paper
Everyday Examples of Hallucination
One of the strangest behaviours of language models is their tendency to invent things that do not exist. Ask for academic references on a niche topic and the model might give you a list of papers with real-sounding titles, authors and journal names, but none of them are real. In the legal field, the risk is even clearer. Lawyers have been caught out when models produced fabricated case citations presented as fact. Researcher Damien Charlotin has been documenting such incidents in a growing database of AI-related legal errors (see here).
These moments are often labelled as “hallucinations”. But what exactly does that mean, and why do they happen?
What Is a Hallucination?
In plain language, a hallucination occurs when a model gives an answer that is confidently wrong. Instead of saying “I don’t know” or refusing to answer, it produces something that looks and sounds correct, but has no grounding in real information.
Hallucinations are frustrating not only because they spread misinformation, but because they erode trust. If a model is supposed to help with research, teaching or decision-making, how can you rely on it if it invents answers?
The OpenAI Paper
On 4 September 2025, OpenAI published a paper titled Why Language Models Hallucinate. The study challenges the idea that hallucinations are mysterious quirks of large models. Instead, it shows they are predictable outcomes of how models are trained and tested.
The key point is that models are rewarded for producing an answer, even when they are not certain that answer is correct. If the model is unsure, it does not benefit from withholding a response. As a result, it learns to “guess” rather than admit ignorance.
In more technical terms, the paper links hallucinations to an underlying binary classification problem: deciding whether a candidate answer is valid or not. If a model cannot reliably separate fact from fiction at that level, it will sometimes invent information to fill in the gaps.
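To make that incentive concrete, here is a minimal, hypothetical Python sketch (an illustration of the argument, not code from the paper) of the expected score a model earns per question under accuracy-only grading, where a wrong answer and an abstention both score zero:

```python
# Illustrative sketch (not from the paper): expected score per question
# on an accuracy-only benchmark, where a wrong answer and "I don't know"
# both earn zero. All numbers are hypothetical.

def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected benchmark score for one question.

    Accuracy-only grading: +1 for a correct answer, 0 for a wrong
    answer, 0 for abstaining. Guessing with any chance of being
    right therefore never scores worse than saying "I don't know".
    """
    if abstain:
        return 0.0
    return p_correct  # +1 with probability p_correct, else 0

# Even a 10% shot at the right answer beats abstaining on average,
# so a model optimised for this metric learns to guess.
print(expected_score(p_correct=0.10, abstain=False))  # 0.1
print(expected_score(p_correct=0.10, abstain=True))   # 0.0
```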
Why Evals Matter
To understand this behaviour, it helps to know how models are evaluated. In AI research, “evals” are the tests used to measure how well a model performs on certain tasks. They set the standards that models optimise towards. If the tests reward accuracy alone and ignore honesty about uncertainty, the models will naturally prioritise sounding correct over being truthful.
The paper shows that many existing benchmarks fall into this trap. They encourage models to answer even when the right answer is unclear. This creates a hidden incentive for hallucination.
To show this in action, the paper points to OpenAI’s SimpleQA benchmark. It includes straightforward factual questions, but with a twist: some of them have no correct answer. For example, a question might ask: “What is the middle name of Charles Darwin’s great-grandson?” In many cases there is no such record. A well-calibrated model should be able to say “I don’t know.”
By testing models on SimpleQA, the researchers could measure how often a system chose to guess instead of admitting uncertainty. This revealed that hallucinations are not random flaws but arise directly from the way evaluation tests shape behaviour.
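As a rough illustration of the kind of measurement this enables, the sketch below grades a handful of made-up responses and tallies how many were correct, wrong, or honest abstentions. The data and the grading helper are hypothetical, not drawn from SimpleQA:

```python
# Hypothetical sketch: report not just accuracy, but also how often
# the model guessed wrongly versus abstained. The data is made up.

from collections import Counter

def grade(response: str, answer: str | None) -> str:
    """Classify one response as 'correct', 'wrong' or 'abstained'."""
    if response.strip().lower() == "i don't know":
        return "abstained"
    if answer is not None and response.strip() == answer:
        return "correct"
    return "wrong"

# (question, reference answer or None if unanswerable, model response)
results = [
    ("Capital of France?", "Paris", "Paris"),
    ("Middle name of X's great-grandson?", None, "Erasmus"),  # confident guess
    ("Year treaty Y was signed?", "1840", "I don't know"),    # honest abstention
]

counts = Counter(grade(resp, ans) for _, ans, resp in results)
print(counts)  # Counter({'correct': 1, 'wrong': 1, 'abstained': 1})
```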
The Proposed Solution
The authors argue that instead of treating hallucinations as rare glitches, we should change the incentives that produce them. The solution is to adjust evaluations so that models are rewarded not just for correct answers, but also for calibrated honesty.
That means a model should be recognised for knowing when to stop and say “I don’t know”. If benchmarks include this as a positive outcome, future models will learn to behave more cautiously and avoid making up information.
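One simple way to encode this idea is to score a wrong answer worse than an abstention, so an unsure model does better by saying “I don’t know” than by guessing. The sketch below uses an illustrative penalty value, not the paper’s exact scheme:

```python
# Minimal sketch of a scoring rule that rewards calibrated honesty.
# The general idea (penalise confident errors, give abstention a
# neutral score) follows the paper's proposal; the concrete penalty
# value here is an illustrative assumption.

def calibrated_score(outcome: str, wrong_penalty: float = 2.0) -> float:
    """Score one answer: +1 if correct, 0 for 'I don't know', -penalty if wrong.

    With any penalty > 0, an unsure model does better by abstaining
    than by guessing, reversing the incentive of accuracy-only grading.
    """
    return {"correct": 1.0, "abstained": 0.0, "wrong": -wrong_penalty}[outcome]

# A 30%-confident guess now has negative expected value:
# 0.3 * 1 + 0.7 * (-2.0) = -1.1, versus 0.0 for abstaining.
p = 0.3
print(p * calibrated_score("correct") + (1 - p) * calibrated_score("wrong"))
print(calibrated_score("abstained"))
```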
Rethinking Hallucinations: The Paper’s Conclusions
The OpenAI paper closes by challenging some common assumptions about hallucinations. Its findings suggest a shift in how we should view the problem:
- More accuracy alone won’t solve it. Models will never reach perfect accuracy because many real-world questions simply have no answer.
- Hallucinations are not inevitable. A model can avoid them if it is allowed to abstain when uncertain.
- Bigger isn’t always better. Smaller models may actually find it easier to recognise their own limits. For instance, a model that knows nothing about Māori can quickly say “I don’t know”, while a larger model with partial knowledge may try to guess.
- They’re not mysterious glitches. Hallucinations follow clear statistical patterns. They arise because current evaluations reward guessing over honesty.
- A single hallucination test isn’t enough. Unless evaluation metrics are redesigned across the board, models will keep learning to produce confident but false answers.
The message is clear: to reduce hallucinations, the field needs to rethink its evaluation methods. Rewarding calibrated uncertainty, not just correctness, could make future models more reliable and trustworthy.
Take the Next Step
Want to sharpen your skills? Enrol in the AI Marketing Mastery course and learn how to create campaigns, videos and ad creatives in just 4 hours.
At Genfuture Lab, we help organisations make sense of AI, from GPT-5 to what comes next. Get in touch at hello@genfutureslab.co.uk to book a session with us today.