Simanaitis Says

On cars, old, new and future; science & technology; vintage airplanes, computer flight simulation of them; Sherlockiana; our English language; travel; and other stuff

GOOGLING—FOR FUN AND PROFIT PART 2

YESTERDAY WE BEGAN GLEANING TIDBITS from Donald MacKenzie’s “The Future of Search,” London Review of Books, November 20, 2025. Today in Part 2, he (and we) continue analyzing the implications of crawling through an enormous number of the world’s websites.

What About All These Data? Donald MacKenzie observes, “All of this involved Google in creating, collecting and assembling a historically unprecedented quantity of data. (The US National Security Agency might previously have come close, though we don’t really know, and Facebook was soon to accomplish something similar…. You could therefore be forgiven for assuming that the legal troubles currently besetting Google are to do with its misuse of this data. On balance, however, Google seems to have behaved fairly responsibly in this respect. Its legal difficulties concern instead whether it plays too dominant a role in advertising markets.”

Controlling the Ad Market? “In two lawsuits,” MacKenzie reports, “the US Department of Justice and several US states have accused Google of monopolising two different areas of advertising.”

MacKenzie continues, “The first case concerns the markets for ‘general search’…. The DoJ had little difficulty in establishing that, in the words of the court’s judgment in August 2024, Google possesses ‘a dominant market share’ in general search: In the US, 89.2 per cent overall, ‘which increases to 94.9 per cent on mobile devices.’ ”

I’m part of this: I’ve tried Safari and Bing, yet return to Google.

“The second competition law case… is more esoteric,” says MacKenzie, “in that it concerns the inner workings of digital advertising. Two systems are at issue. The first is Google’s ‘ad server.’ This is a cloud service that Google sells to publishers (not just news publishers but providers of online content of all kinds); it takes the final decision about which ads to show users when they visit the publisher’s website. The second is Google’s ad exchange, AdX, which conducts ad trading in real time.”

Image from Props.

MacKenzie offers a point to consider: “The court conceded that the way Google has acted is in some respects quite different from a traditional monopolist: for example, it has not raised the fees it charges publishers for use of its ad server.” 

That is, it’s in marked contrast to the business abuses that brought about the 1890 Sherman Antitrust Act.

Image by Edmon de Hero from The New York Times.

It Ain’t Over Yet…. “The appeals process began in the summer,” MacKenzie observes, “and Google’s lawyers will, no doubt, continue to argue, among other things, that Google has default position because it is the best search engine.”

Hear! Hear! (Though I realize such exhortation is extrajudicial.) 

Transformer, LLMs, and Chatbots. MacKenzie writes, “In August 2017, a machine-learning researcher called Jakob Uszkoreit uploaded to Google’s research blog a post about a new architecture for neural networks that he and his colleagues called the Transformer.”

My A.I. Overview: MacKenzie goes into more detail, but my Google A.I. Overview describes it succinctly: “Transformer is a deep-learning neural network architecture that excels at sequence-to-sequence tasks by using a self-attention mechanism to weigh the importance of different parts of the input sequence…. Unlike older recurrent neural networks, transformers can process data in parallel, which significantly speeds up training.”

This last point about training is a big deal.
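For the curious, here is a minimal Python sketch of the self-attention idea, offered as a toy illustration and not as anything Google or OpenAI actually runs. I assume the simplest possible setup: each word is a short list of numbers, and those same numbers serve as query, key, and value (real Transformers learn separate weight matrices for each). The names `softmax`, `self_attention`, and `tokens` are my own inventions for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Toy scaled dot-product self-attention.

    Assumption: each token vector acts as its own query, key, and value;
    there are no learned weight matrices here.
    """
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Score this token against every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Blend all token vectors according to those weights.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        all_weights.append(weights)
        outputs.append(mixed)
    return outputs, all_weights

# Three made-up two-number "words."
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Note that each pass through the loop uses only the original `tokens`, never the result of an earlier pass. That independence is what lets real hardware compute every row at once, whereas a recurrent network has to wait for the previous step to finish.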

GPT Etymology. MacKenzie recounts, “It was OpenAI, not Google, that made the most decisive use of the Transformer. Its debt is right there in the name: OpenAI’s evolving LLMs are all called GPT, or Generative Pre-trained Transformer. GPT-1 and GPT-2 weren’t hugely impressive; the breakthrough came in 2020 with the much larger GPT-3.”

Google’s Worry. “For Google,” MacKenzie says, “the big worry is what will happen to search in the long term. Most search queries can easily be rephrased as a prompt for a chatbot, and that is a clear threat to what has been, for a quarter of a century, Google’s most important, and still very healthy, source of revenue.”

Marketing to Whom? Or to What? MacKenzie observes, “Several AI companies are developing automated purchasing assistants on top of LLMs, and I am already starting to read articles in the trade press about how to market products to such assistants rather than directly to human beings.”

Just another lamentable step in removing humans from the process. 

Conclusion: MacKenzie foresees a trajectory that’s “unsustainable, and not just environmentally: It’s getting harder to find adequate volumes of fresh data on which to train new models, since much of what exists is already potentially compromised by having been generated by previous models. It’s going to be a hard road to cross. Can we navigate it successfully? Can Google? If it turns out to be too tired to make it, I’ll be a little sad.”

Me too. ds

© Dennis Simanaitis, SimanaitisSays.com, 2025
