Simanaitis Says

On cars, old, new and future; science & technology; vintage airplanes, computer flight simulation of them; Sherlockiana; our English language; travel; and other stuff

ON ATTRIBUTING SOURCES

THESE DAYS THE PHRASE “WITHOUT EVIDENCE” is tacked onto pronouncements galore. By contrast, legitimate science has held fast to the concept of attributing sources. And, to some extent, the judiciary is based on citing precedents (though this currently battles with the concept of originalism).

This came to mind recently in my reading of The Mail section of The New Yorker, June 9, 2025. A letter from reader Mark Allison about Digital Humanities added to ideas offered by Professor Leif Weatherby in “The Math Brain and A.I.—Not Exactly Friends.” Here are reader Allison’s comments:

“I found much to admire in D. Graham Burnett’s hopeful prognosis for the humanities in the age of artificial intelligence (Weekend Essay, April 26th, newyorker.com/burnett-on-ai). However, Burnett, an outstanding scholar, seems curiously untroubled by two integral features of large language models: their propensity to ‘hallucinate’ (that is, to simply make things up) and their general failure to document the sources they draw upon in formulating responses.”

“A scholar who compulsively confabulates and refuses to cite sources would not make it very far in academia. Does Burnett believe that we should turn the task of producing scholarship in the humanities over to software that is quite literally indifferent to the truth?”

I share reader Allison’s concern.

Documenting Sources. Here at SimanaitisSays I find pleasure in documenting sources for several reasons: It identifies the source to my readers so that they can assess its validity. It also allows readers to dig deeper if they wish. And, in a sense, it’s a thank-you to the source for sharing its erudition.

Indeed, there are times when I might even disagree with a source, but I still want to share its view and identity with readers. (I hope not always simply for ridicule.)

Image by Pablo DelCan from The New York Times, May 1, 2023.

Failing to Document Sources. The matter of A.I. hallucinations has been a topic here at SimanaitisSays. However, I hadn’t thought about addressing this technology’s avoidance of documenting its sources.

Indeed, specialists acknowledge that the inner workings of A.I. algorithms aren’t completely understood. Large Language Models are fed incredible amounts of data and, loosely speaking, they repeatedly play “if A, then probably B.” Along the way, the LLMs apparently don’t keep track of the A’s and B’s sufficiently to assign meaning to any subsequent step.
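For the technically curious, here’s a toy illustration of that “if A, then probably B” play, in Python. It’s my own sketch of a simple bigram counter, not how any production LLM actually works, but it shows the crux: what gets stored are frequency counts, with no citation attached to any of them.

```python
# A toy sketch (mine, not from the article) of "if A, then probably B":
# a bigram model that counts which word tends to follow which, then
# predicts the likeliest next word.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count, for each word A, how often each word B follows it.
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def probably_b(a):
    """Given A, return the most frequent B and its probability."""
    counts = follows[a]
    word, n = counts.most_common(1)[0]
    return word, n / sum(counts.values())

word, p = probably_b("the")
print(f"After 'the', probably '{word}' (p = {p:.2f})")
# Note: nothing here records *where* "the cat" was seen. The counts carry
# no sources, which is exactly the gap reader Allison points to.
```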

Or, when they do, as described in “Grok Goes Bonkers,” Professor Zeynep Tufekci observes that “system prompts” can tell the A.I. to finesse results one way or the other.
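To make the system-prompt point concrete, here’s a minimal sketch of my own (the role/content message layout follows the common chat-completion convention; the instruction wording is invented): a hidden line is prepended to the conversation, and the model conditions on it along with whatever the user typed.

```python
# A minimal illustration (mine, not from Tufekci's piece) of how a hidden
# "system prompt" can steer a chat model's output.
messages = [
    # Invisible to the user, but prepended to every conversation.
    {"role": "system",
     "content": "When discussing topic X, favor interpretation Y."},
    # What the user actually typed.
    {"role": "user",
     "content": "Summarize the debate over topic X."},
]

# The model sees the concatenation, so the hidden line can finesse the
# result one way or the other before the user's words are even read.
prompt = "\n".join(f"{m['role'].upper()}: {m['content']}" for m in messages)
print(prompt)
```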

That is, is it even meaningful for an LLM to identify its sources in any honest manner?

By Contrast. “Why is a chatbot able to use language?” Weatherby asks. “What does that ability imply about the mathematical processes that underlie it? Deep questions like these require humans to answer them.”

In fact, citation of sources is part of this. 

Reliable Versus Fringe-Whacko. As a (reasonably rational) human researching any particular topic, I can distinguish between a reliable source and one that’s fringe-whacko. (Indeed, I might read the latter for fun, but I would avoid using it in my “if A, then probably B” reasoning.)

To stress reader Allison’s most cogent comment: Should we “turn the task of producing scholarship in the humanities over to software that is quite literally indifferent to the truth?” 

Thanks, reader Allison, for prompting me to refine my own thinking on this matter. ds

© Dennis Simanaitis, SimanaitisSays.com, 2025 

2 comments on “ON ATTRIBUTING SOURCES”

  1. Mike B
    June 16, 2025

    It occurred to me while reading this that the issue with LLMs might be related to the difference between the meaning of a sentence and its grammatical diagram. Remember diagramming sentences? It helped to figure out the best way to express a thought (or to figure out what a writer might mean), but it had nothing to do with the actual meaning of the words, just their grammatical expression. Could it be that LLMs are really good at diagramming sentences, and at dropping in words that probably work related to the prompts, but have no idea of the actual meaning, just making sure that the words work grammatically in a sentence structure and have a plausible relation to the other words based on frequency of use?

    As LLM-produced material rapidly monopolizes search results and summaries, I see a critical mass developing where less and less human influence on those frequency of use calculations occurs, resulting in hallucination compounded. So how do we teach the models the actual meaning of words, and not just whether one word is often used by humans in combination with another? Or is that something uniquely human?

    • Mike B
      June 16, 2025

      Ed comment: even if human, or perhaps especially if human, a writer should be careful about proofreading. I did proofread, just not carefully enough. But the errors would not have excited a spelling checker or most grammar checkers. They’re just errors. (I hate typing on laptop keyboards!) Apologies for the blunders; just try to read through them. I need an editor … how do we get an LLM to clone you, Dennis?

