Sakana AI Introduces Self-Improving Agent That Boosts Performance By Up To 50% On SWE-bench

Japanese AI company Sakana AI introduced the Darwin Gödel Machine (DGM), a self-modifying agent capable of altering its own code. Drawing inspiration from evolutionary principles, the system maintains a growing lineage of agent variants, enabling ongoing exploration within the broad range of self-improving agent designs.

While current agent systems are typically static after deployment, the DGM treats continuous self-improvement as a crucial factor for advancing AI capabilities. The machine is designed to support AI systems that can learn and evolve their abilities over time, much as humans do.

Our experiments demonstrate that the Darwin Gödel Machine can continuously self-improve by modifying its own codebase. On SWE-bench, DGM automatically improved its performance from 20% to 50%.

The figure here shows the performance progress over iterations, and also a summary of…

— Sakana AI (@SakanaAILabs) May 30, 2025

The DGM represents a notable advancement toward AI systems capable of autonomously identifying and building upon their own learning milestones to continually innovate. The system expands its archive by selecting an agent from its existing collection and employing a foundation model to generate a new, improved variant of that agent. This process of open-ended exploration creates a growing tree of diverse, high-quality agents, enabling simultaneous exploration of multiple pathways within the search space. 
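To make that loop concrete, below is a minimal sketch in Python of the archive-and-mutate cycle described above: an agent is sampled from the archive, a foundation model proposes a modification to its code, the resulting child is scored on a benchmark, and every variant is kept so multiple branches of the tree remain available. All names here (propose_modification, evaluate, select_parent) and the scoring logic are hypothetical stand-ins, not Sakana AI's implementation.

```python
import random
from dataclasses import dataclass

# Hypothetical sketch of the DGM outer loop; function names and selection
# heuristics are illustrative assumptions, not Sakana AI's actual code.

@dataclass
class Agent:
    code: str                   # the agent's own source code
    score: float                # benchmark score (e.g., SWE-bench resolve rate)
    parent: int | None = None   # index of the parent agent in the archive

def propose_modification(parent_code: str) -> str:
    """Stub for the foundation-model call that rewrites the agent's code.
    In the real system this would prompt an LLM to edit the codebase."""
    return parent_code + f"\n# tweak {random.randint(0, 9999)}"

def evaluate(agent_code: str) -> float:
    """Stub for running the agent on a coding benchmark and returning a score."""
    return random.random()

def select_parent(archive: list[Agent]) -> int:
    """Sample a parent weighted by score, so strong agents are favored
    but weaker intermediate agents still get explored."""
    weights = [a.score + 0.05 for a in archive]
    return random.choices(range(len(archive)), weights=weights, k=1)[0]

def run_dgm(initial_code: str, iterations: int = 20) -> list[Agent]:
    archive = [Agent(code=initial_code, score=evaluate(initial_code))]
    for _ in range(iterations):
        parent_idx = select_parent(archive)
        child_code = propose_modification(archive[parent_idx].code)
        child = Agent(code=child_code, score=evaluate(child_code), parent=parent_idx)
        archive.append(child)   # keep every variant, not just improvements
    return archive

if __name__ == "__main__":
    tree = run_dgm("# initial coding agent")
    best = max(tree, key=lambda a: a.score)
    print(f"archive size: {len(tree)}, best score: {best.score:.2f}")
```

The key design choice is that the archive only grows: rather than replacing the current agent with its best child, every variant remains a potential branching point for later iterations.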

Empirical results demonstrate that the DGM enhances its coding abilities over time, improving its code-editing tools, long-context management, and peer-review mechanisms. These changes lifted its performance on benchmarks such as SWE-bench (from 20.0% to 50.0%) and Polyglot (from 14.2% to 30.7%). The system consistently outperforms baseline models that lack self-improvement or open-ended exploratory capabilities.

Notably, the evolution toward the most effective agent sometimes involved intermediate agents that performed worse than their predecessors but were retained in the lineage, illustrating the advantages of an open-ended search strategy. This approach preserves a diverse archive of useful intermediate agents rather than exclusively focusing on branching from the highest-performing agent, demonstrating that progress does not always follow a linear path.
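A toy example helps illustrate why retaining weaker intermediates matters. In the hypothetical archive below, the best agent's lineage passes through a node whose score dropped relative to its parent; a purely greedy strategy that only ever branched from the current best would never have reached it. The scores and tree structure are invented for illustration.

```python
# Toy illustration of open-ended archive search: the best agent's lineage
# passes through an intermediate whose score dipped relative to its parent.
# All numbers and structure are made up for illustration.

archive = [
    {"id": 0, "score": 0.20, "parent": None},  # initial agent
    {"id": 1, "score": 0.32, "parent": 0},
    {"id": 2, "score": 0.28, "parent": 1},     # worse than its parent, but kept
    {"id": 3, "score": 0.50, "parent": 2},     # best agent descends from the dip
]

def lineage(archive, idx):
    """Walk parent pointers from one agent back to the root."""
    path = []
    while idx is not None:
        path.append(archive[idx])
        idx = archive[idx]["parent"]
    return list(reversed(path))

best = max(archive, key=lambda a: a["score"])
print(" -> ".join(f'{a["id"]}:{a["score"]:.2f}' for a in lineage(archive, best["id"])))
# A greedy strategy that only branched from the current best (agent 1)
# would never have produced agent 3.
```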

The research further indicates that the performance gains of agents discovered by the DGM generalize across foundation models, for example transferring from Claude to o3-mini, and across programming languages and task domains, including Python, Rust, C++, Go, and others.

Sakana AI: Developing AI Systems Inspired By Nature And Collective Intelligence

Sakana AI is an AI research company based in Tokyo that focuses on developing AI systems inspired by natural processes. The company’s approach involves integrating multiple smaller, autonomous models to form a collective intelligence, similar to how a school of fish operates. This method differs from traditional large-scale AI models by prioritizing adaptability, resource efficiency, and long-term sustainability.

Among Sakana AI’s research projects is the “Evolutionary Model Merge” technique, which applies evolutionary algorithms to combine existing AI models. This process generates new models with targeted capabilities while minimizing the need for extensive computational power. Additionally, Sakana AI has developed the “AI Scientist,” a system designed to automate scientific research by allowing foundation models to independently carry out investigations and discovery processes.
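As a rough illustration of the evolutionary-merging idea (not Sakana AI's actual recipe), the sketch below runs a simple (1+1) evolutionary search over per-layer mixing coefficients between two toy weight dictionaries, keeping a mutation only when a stand-in fitness function improves.

```python
import random

# Minimal sketch of evolutionary model merging: search over how to combine the
# weights of two existing models so the merged model scores well on a target
# task. The fitness function and per-layer mixing scheme are illustrative
# stand-ins, not Sakana AI's Evolutionary Model Merge recipe.

model_a = {"layer1": [0.2, 0.8], "layer2": [1.0, -0.5]}
model_b = {"layer1": [0.6, 0.1], "layer2": [0.3,  0.9]}

def merge(alpha):
    """Interpolate each layer: alpha[layer] * A + (1 - alpha[layer]) * B."""
    return {
        name: [alpha[name] * a + (1 - alpha[name]) * b
               for a, b in zip(model_a[name], model_b[name])]
        for name in model_a
    }

def fitness(merged):
    """Stub for evaluating the merged model on a target benchmark."""
    return -sum(abs(w - 0.5) for layer in merged.values() for w in layer)

# Simple (1+1) evolutionary loop over per-layer mixing coefficients.
alpha = {name: 0.5 for name in model_a}
best_fit = fitness(merge(alpha))
for _ in range(200):
    candidate = {name: min(1.0, max(0.0, a + random.gauss(0, 0.1)))
                 for name, a in alpha.items()}
    fit = fitness(merge(candidate))
    if fit > best_fit:   # keep the mutation only if it improves fitness
        alpha, best_fit = candidate, fit

print("best mixing coefficients:", alpha)
```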
