Grok 4 is out
Take a look at the tweet Old Ma (Elon Musk) posted yesterday; it's quite interesting.
The general idea is: they took a bunch of AI models and subjected them to a test called ARC-AGI.
You can think of it as a reading comprehension exam, except the questions are not aimed at humans; the test takers are AI models.
The chart plots two dimensions: the model's score (the higher, the smarter) and the cost of completing a task (the cheaper, the better).
The result: Grok 4 achieved the highest score at a low cost. It surpassed OpenAI's GPT-4.5, Anthropic's Claude Opus, Google's Gemini, and its own predecessor, Grok 3.
My point is not where Grok ranked, but that AI development really is accelerating: the competition is already about who is both smart and cost-efficient. #AI