Some people eat, sleep and chew gum, I do genealogy and write...

Monday, June 23, 2025

Apple study reveals the limitations of current AI Large Reasoning Models

 

Apple Machine Learning Research. “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity.” Accessed June 23, 2025. https://machinelearning.apple.com/research/illusion-of-thinking.

It is about time that someone or some company did studies to determine the limitations of the currently popular Large Language Models (LLMs) and AI Large Reasoning Models (LRMs). First, a short explanation of LLMs and LRMs.
A Large Language Model (LLM) is a type of artificial intelligence (AI) program designed to understand, generate, and manipulate human language. LLMs are characterized by their massive scale—they are trained on enormous datasets of text and code, often comprising billions or even trillions of words. (Google Gemini search)

Also:

A Large Reasoning Model (LRM) is an advanced type of artificial intelligence model that extends the capabilities of Large Language Models (LLMs) by specifically focusing on and enhancing their ability to perform multi-step logical reasoning and problem-solving. (Google Gemini search)

Quoting from an article posted by The Guardian (Marcus, Gary. “When Billion-Dollar AIs Break down over Puzzles a Child Can Do, It’s Time to Rethink the Hype.” The Guardian, June 10, 2025, sec. Opinion. https://www.theguardian.com/commentisfree/2025/jun/10/billion-dollar-ai-puzzle-break-down):

Apple did this by showing that leading models such as ChatGPT, Claude and Deepseek may “look smart – but when complexity rises, they collapse”. In short, these models are very good at a kind of pattern recognition, but often fail when they encounter novelty that forces them beyond the limits of their training, despite being, as the paper notes, “explicitly designed for reasoning tasks”.

Apple used a child's game called "The Tower of Hanoi," which is relatively easy to solve, to show that the reasoning ability of the major LLMs and LRMs fails when faced with challenges that young children can solve.
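
For readers who have not seen the puzzle, the usual recursive solution can be sketched in a few lines of Python. This is only an illustration and is not code from the Apple paper; the function name `hanoi` and the peg labels "A", "B" and "C" are my own assumptions.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that shifts n disks
    from source to target, never placing a larger disk on a smaller one."""
    if n == 0:
        return []
    # 1. Move the top n-1 disks out of the way, onto the spare peg.
    # 2. Move the largest disk directly to the target peg.
    # 3. Move the n-1 disks from the spare peg onto the target peg.
    return (
        hanoi(n - 1, source, spare, target)
        + [(source, target)]
        + hanoi(n - 1, spare, target, source)
    )


if __name__ == "__main__":
    moves = hanoi(3)
    for from_peg, to_peg in moves:
        print(f"Move the top disk from {from_peg} to {to_peg}")
    print(f"Total moves: {len(moves)}")  # 2**3 - 1 = 7
```

The rule never changes as disks are added, but the number of moves needed for n disks is 2^n - 1, so the solution gets long very quickly: 10 disks already take 1,023 moves.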

My own experience using AI for a variety of purposes (such as the ones above) indicates that the advancements for historical and genealogical research are as limited as I expected. However, handwriting recognition and the related progress in handling large numbers of documents will fundamentally change the way genealogists do their research.
