New study describes ways to detect AI tool's hallucination


While artificial intelligence (AI) continues to grow by leaps and bounds, scientists are scrambling to fix the hallucinations plaguing leading large language models (LLMs), but the task remains an uphill climb.

However, there appears to be light at the end of the tunnel for AI developers, with researchers from Oxford University leading the charge. According to a new research paper, computer scientists have recorded early successes with a tool designed to spot hallucinations in AI models with impressive accuracy.

While not a silver bullet for solving the challenge, lead researcher Sebastian Farquhar disclosed that the tool could be the foundation for building AI systems impervious to hallucinations.

Rather than tackle hallucinations in a broad sense, the research focuses on one facet of the problem: confabulations. According to the study, confabulations occur when an AI model gives inconsistent wrong answers to a factual question, producing a different incorrect response each time it is asked, rather than consistently giving the same wrong answer.

Although there is no data on the exact share of hallucinations that are confabulations, the researchers theorize that they could make up a large portion of AI errors. According to the study, confabulations do not stem from issues with the model’s training data. Rather, they arise from other, benign sources.

“The fact that our method, which only detects confabulations, makes a big dent on overall correctness suggests that a large number of incorrect answers are coming from these confabulations,” Farquhar says.

To detect confabulations, the researchers sample several answers to the same question from one chatbot and use another model to group them by meaning. Answers that mean the same thing are grouped together even when their wording differs, while answers that fall into several different meaning groups suggest confabulation.
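The grouping step can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the function names are hypothetical, the keyword-based `toy_same_meaning` check is only a stand-in for the second model that judges meaning, and scoring the spread of answers with an entropy over group sizes is an assumption made here for illustration.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedily group answers: each answer joins the first group whose
    first member it matches, otherwise it starts a new group."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def meaning_entropy(clusters: List[List[str]]) -> float:
    """Entropy over group sizes (an illustrative score, assumed here).
    0.0 means every answer shares one meaning; higher means more spread."""
    total = sum(len(c) for c in clusters)
    probs = [len(c) / total for c in clusters]
    return sum(-p * math.log(p) for p in probs)


# Hypothetical stand-in for the second model: two answers "mean the same"
# if they mention the same city from a small made-up list.
CITIES = ("paris", "lyon", "rome")


def toy_same_meaning(a: str, b: str) -> bool:
    def city(text: str):
        return next((c for c in CITIES if c in text.lower()), None)

    return city(a) is not None and city(a) == city(b)


# Made-up sampled answers to the same question.
consistent = ["Paris.", "It is Paris.", "The capital is Paris."]
scattered = ["Paris.", "I think it's Lyon.", "Rome, surely."]

for name, answers in (("consistent", consistent), ("scattered", scattered)):
    groups = cluster_by_meaning(answers, toy_same_meaning)
    print(name, len(groups), "group(s), score", round(meaning_entropy(groups), 3))
```

In this toy run, the differently worded "Paris" answers fall into a single group, while the scattered answers land in three groups and get a higher score, the pattern the article describes as a sign of confabulation. A real system would replace the keyword check with a model that compares meanings.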

Despite the promise, critics have faulted the research for failing to demonstrate a clear use case for curbing confabulation in deployed AI systems.

“I think it’s nice research … [but] it’s important not to get too excited about the potential of research like this,” says Arvind Narayanan, a computer science professor at Princeton University. “The extent to which this can be integrated into a deployed chatbot is very unclear.”

Achilles heel for AI

Hallucination issues in AI models have come to the fore following several embarrassing incidents involving high-profile enterprises in recent months. Air Canada reportedly had to honor a non-existent discount for one customer after its customer-care chatbot hallucinated and offered reduced airfares.

Google (NASDAQ: GOOGL) has had its own share of hallucinations, forcing the tech giant to update its policy on AI-generated content, while the EU has been hounding Microsoft’s (NASDAQ: MSFT) Bing over its hallucination defects. To solve the challenge, researchers are experimenting with integrating AI with other emerging technologies, but studies are still in their infancy.

“Making up false information is quite problematic in itself,” said Maartje de Graaf, data protection lawyer at Noyb. “But when it comes to false information about individuals, there can be serious consequences. It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law when processing data about individuals.”

In order for artificial intelligence (AI) to work right within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.

