Apple heads into its annual Worldwide Developers Conference (WWDC), which begins Monday, having shown little progress in artificial intelligence and struggling to meet expectations set by its tech rivals. Yet the iPhone maker claims that large language models are “failing” because they are focused more on benchmarks than on solving problems.

Over the weekend, a research paper from Apple’s AI research division circulated on social media. It “downplayed” the capabilities of reasoning models developed by OpenAI, Google DeepMind, Anthropic, and DeepSeek.

According to the paper, these models lose accuracy as task complexity increases, ultimately reaching a “point of complete failure.”

“Existing evaluations predominantly focus on established mathematical and coding benchmarks, which, while valuable, often suffer from data contamination issues and do not allow for controlled experimental conditions across different settings and complexities. These evaluations do not provide insights into the structure and quality of reasoning traces,” it read.

AI is failing when problems are harder

Using custom-designed puzzles with controlled levels of complexity, Apple researchers observed that large AI models failed to maintain their performance and exerted less effort as problems grew harder.

The researchers, who measured that reduced effort by the drop in inference-time tokens used during response generation, called the outcome a “collapse.”

The models tested included OpenAI’s o3-mini variant and Anthropic’s Claude 3.7 Sonnet. The o3-mini models performed “poorly,” while Claude models were slightly more resilient.

Even when provided with the correct algorithm for solving the Tower of Hanoi puzzle, the models did not improve their performance. Apple’s researchers concluded that these AI systems may not be as advanced in reasoning as commonly assumed.
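
For readers unfamiliar with the puzzle, the textbook recursive solution to Tower of Hanoi is short enough to sketch here. This is only an illustration and not the prompt or code used in Apple’s paper; the function name `hanoi` and the peg labels are assumptions made for the example. It also shows why the number of disks is a natural way to dial difficulty up or down: the shortest solution for n disks takes 2^n - 1 moves.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that solves an n-disk puzzle.

    Hypothetical helper for illustration; peg names are arbitrary labels.
    """
    if n == 0:
        return []
    return (
        hanoi(n - 1, source, spare, target)    # move the n-1 smaller disks out of the way
        + [(source, target)]                   # move the largest disk to the target peg
        + hanoi(n - 1, spare, target, source)  # move the smaller disks on top of it
    )


# The move count roughly doubles with each added disk.
for disks in (3, 5, 10):
    print(disks, len(hanoi(disks)))  # prints 3 7, then 5 31, then 10 1023
```

A ten-disk puzzle already requires more than a thousand correct moves in sequence, which gives a sense of how quickly such tasks scale.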

WWDC kicks off without much product announcement buzz

At previous WWDC events, Apple used the conference to unveil new products, like the Vision Pro headset in 2023 and its Apple Intelligence initiative in 2024. For this year’s conference, market watchers are convinced there is nothing to look forward to.

Bloomberg has previewed the WWDC schedule, describing the updates as “underwhelming.” Moreover, much of what Apple revealed last year is still unavailable to users. The publication explained that Apple’s AI announcements this week would likely be minor and insufficient to impress an industry now led by Google, Meta, OpenAI, and other AI-first firms.

Apple’s stock has dropped over 18% in 2025, and CEO Tim Cook has been criticized over the company’s new products and AI integration pipeline.

Gene Munster, managing partner at Deepwater Asset Management, told CNN earlier this year that “it’s becoming clearer how far behind Apple are in AI.”

Dan Ives of Wedbush Securities estimates that 25% of the global population could eventually access AI through Apple devices, but as of now, that “potential is unrealized.”

On the company’s earnings call last month, Cook admitted that there were delays in rolling out improved AI features such as a more personal version of Siri.

“We need more time to complete our work on these features so they meet our high quality bar,” Cook said. “We are making progress, and we look forward to getting these features into customers’ hands.”

Meanwhile, Apple’s competitors are dipping into the tech giant’s clientele, looking to steal loyalists who have been expecting Cook’s team to deliver something “better.” 

Samsung is reportedly partnering with AI startup Perplexity to integrate AI-enhanced digital assistants into upcoming Galaxy phones. Motorola’s new Razr also includes Perplexity-powered features, among other AI integrations.

Apple’s delay in launching AI-powered experiences in its ecosystem puts it at risk of falling behind in a market it once dominated.
