Simanaitis Says

On cars, old, new and future; science & technology; vintage airplanes, computer flight simulation of them; Sherlockiana; our English language; travel; and other stuff

A.I. HALLUCINATIONS ON THE RISE

I KINDA SAW THIS COMING with Google searches occasionally offering corrupted versions of SimanaitisSays comments: “Dennis Simanaitis says Benito Mussolini and Ercole Boratto won the 1936 Mille Miglia,” or some such garbling produced by Large Language Model data scraping.

Benito Mussolini, 1883–1945, Italian Fascist dictator, Il Duce, standing; Ercole Boratto, 1886–1979, Italian race driver, Mussolini’s chauffeur and “confidente.” The car is Mussolini’s 1935 Alfa Romeo 6C 2300 Sport Spyder Pescara, which Boratto drove in the 1936 Mille Miglia. Image from Kidston: Keep It Alive via SimanaitisSays.

More Power, More Hallucinations. Indeed, just recently Cade Metz and Karen Weise report “A.I. Is Getting More Powerful, But Its Hallucinations Are Getting Worse,” The New York Times, May 5, 2025.

Metz and Weise recount, “More than two years after the arrival of ChatGPT, tech companies, office workers and everyday consumers are using A.I. bots for an increasingly wide array of tasks. But there is still no way of ensuring that these systems produce accurate information. The newest and most powerful technologies — so-called reasoning systems from companies like OpenAI, Google and the Chinese start-up DeepSeek — are generating more errors, not fewer. As their math skills have notably improved, their handle on facts has gotten shakier. It is not entirely clear why.” 

Large Language Models. “Today’s A.I. bots,” Metz and Weise describe, “are based on complex mathematical systems that learn their skills by analyzing enormous amounts of digital data. They do not—and cannot—decide what is true and what is false. Sometimes, they just make stuff up, a phenomenon some A.I. researchers call hallucinations. On one test, the hallucination rates of newer A.I. systems were as high as 79 percent.”

Image by Eric Carter for The New York Times.

The Times researchers cite, “The A.I. bots tied to search engines like Google and Bing sometimes generate search results that are laughably wrong. If you ask them for a good marathon on the West Coast, they might suggest a race in Philadelphia. If they tell you the number of households in Illinois, they might cite a source that does not include that information.” 

Image by Pablo DelCan for The New York Times via “A.I. GIGO.”

I recall an attorney’s A.I.-generated legalese citing court cases that didn’t exist. And there are the occasional misquotes of SimanaitisSays.

Metz and Weise observe, “Those hallucinations may not be a big problem for many people, but they are a serious issue for anyone using the technology with court documents, medical information or sensitive business data.”

Can A.I. Ever Develop Honesty? Metz and Weise quote a specialist: “ ‘Despite our best efforts, they will always hallucinate,’ said Amr Awadallah, the chief executive of Vectara, a start-up that builds A.I. tools for businesses, and a former Google executive. ‘That will never go away.’ ”

“For more than two years,” The Times researchers observe, “companies like OpenAI and Google steadily improved their A.I. systems and reduced the frequency of these errors. But with the use of new reasoning systems, errors are rising. The latest OpenAI systems hallucinate at a higher rate than the company’s previous system, according to the company’s own tests.”

The Times researchers recount, “The company found that o3—its most powerful system—hallucinated 33 percent of the time when running its PersonQA benchmark test, which involves answering questions about public figures. That is more than twice the hallucination rate of OpenAI’s previous reasoning system, called o1. The new o4-mini hallucinated at an even higher rate: 48 percent.” 

They continue, “When running another test called SimpleQA, which asks more general questions, the hallucination rates for o3 and o4-mini were 51 percent and 79 percent. The previous system, o1, hallucinated 44 percent of the time.”
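For the curious: what does a “hallucination rate” actually tally? Here’s a minimal sketch in Python. The questions, reference answers, and crude grader below are my own inventions for illustration; OpenAI’s PersonQA and SimpleQA harnesses use far larger question sets and subtler grading. The idea is simply: ask the model each question, grade its answer against a reference, and report the fraction graded wrong.

    # A toy tally of a hallucination rate over a QA benchmark.
    # Questions, references, and the substring grader are invented
    # for illustration; not OpenAI's actual benchmark harness.

    def grade(model_answer: str, reference: str) -> bool:
        """Crude grader: count the answer correct if the reference appears in it."""
        return reference.lower() in model_answer.lower()

    def hallucination_rate(model, benchmark) -> float:
        """Fraction of benchmark questions the model answers incorrectly."""
        wrong = sum(
            0 if grade(model(question), reference) else 1
            for question, reference in benchmark
        )
        return wrong / len(benchmark)

    # Hypothetical usage, with a stand-in "model" that answers confidently:
    benchmark = [
        ("Who won the 1936 Mille Miglia?", "Antonio Brivio"),
        ("Who was Mussolini's chauffeur?", "Ercole Boratto"),
    ]
    confident_guesser = lambda q: "Benito Mussolini, of course."
    print(f"{hallucination_rate(confident_guesser, benchmark):.0%}")  # prints: 100%

Note that the grader never asks whether an answer sounds plausible, only whether it matches; a model can be fluently, confidently wrong on every single question.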

Who’s At Fault? “In a paper detailing the tests,” Metz and Weise relate, “OpenAI said more research was needed to understand the cause of these results. Because A.I. systems learn from more data than people can wrap their heads around, technologists struggle to determine why they behave in the ways they do.”

Metz and Weise continue, “Hannaneh Hajishirzi, a professor at the University of Washington and a researcher with the Allen Institute for Artificial Intelligence, is part of a team that recently devised a way of tracing a system’s behavior back to the individual pieces of data it was trained on. But because systems learn from so much data—and because they can generate almost anything—this new tool can’t explain everything. ‘We still don’t know how these models work exactly,’ she said.” 

It’s kinda like giving A.I. an “open-book test” without saying which books are allowed, or anything about the books’ veracity.

Reinforcement Learning. “So,” The Times researchers write, “these companies are leaning more heavily on a technique that scientists call reinforcement learning. With this process, a system can learn behavior through trial and error. It is working well in certain areas, like math and computer programming. But it is falling short in other areas.”
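To make “trial and error” concrete, here is a toy reinforcement-learning loop in Python, an epsilon-greedy bandit choosing among three canned answering strategies. The actions and reward numbers are invented, and this sketches the general technique only, not how OpenAI or Google actually train their reasoning systems. Watch what it learns when the reward signal scores confident-sounding answers rather than true ones.

    import random

    # Toy trial-and-error learning: an epsilon-greedy bandit that learns
    # which answering strategy earns the highest reward. The actions and
    # reward values are invented for illustration.

    actions = ["hedge", "cite a source", "make something up"]
    value = {a: 0.0 for a in actions}   # running estimate of each action's payoff
    count = {a: 0 for a in actions}
    epsilon = 0.1                       # fraction of the time we explore at random

    def reward(action: str) -> float:
        # A reward signal that scores confidence, not truth: a confident
        # fabrication outscores a hedge and even a genuine citation.
        return {"hedge": 0.2, "cite a source": 0.7, "make something up": 0.9}[action]

    random.seed(42)
    for _ in range(1000):
        if random.random() < epsilon:
            a = random.choice(actions)        # explore a random action
        else:
            a = max(value, key=value.get)     # exploit the best estimate so far
        count[a] += 1
        value[a] += (reward(a) - value[a]) / count[a]   # incremental mean update

    print(max(value, key=value.get))  # prints: make something up

That is the worry in miniature: reinforcement learning faithfully optimizes whatever the reward measures, and if the reward cannot tell truth from confident invention, invention is what gets reinforced.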

“ ‘What the system says it is thinking is not necessarily what it is thinking,’ said Aryo Pradipta Gema, an A.I. researcher at the University of Edinburgh and a fellow at Anthropic.”

Geez. Like a lazy C- student. ds

© Dennis Simanaitis, SimanaitisSays.com, 2025

4 comments on “A.I. HALLUCINATIONS ON THE RISE”

  1. sabresoftware
    May 9, 2025

    And then these hallucinations become part of the mass of data that LLMs use and reinforce the mess.

  2. Bill Estill
    May 9, 2025

    Clearly, the answer to this problem is forty two.

    • simanaitissays
      May 9, 2025

      Or maybe Thursday.

      • Mike B
        May 9, 2025

        Let me ask my man Friday…
