Google DeepMind has announced AlphaGenome, an artificial intelligence model that can finally decode the mysterious 'junk DNA': the part of the genome that makes up over 90% of our hereditary information but was long considered useless.
It turns out this 'junk' controls the whole show.
When trash becomes treasure
For a long time, scientists focused only on the parts of DNA that directly code for proteins, the building blocks of our body. This is understandable: reading assembly instructions is easier than figuring out the control panel. But the rest of the DNA, that same 90%, functions precisely as a control panel. It decides when genes are turned on or off, where, and in what quantity.
The problem is that this control panel is written in a language we didn't fully understand. Until now.
AlphaGenome is the first AI model capable of processing DNA segments up to a million base pairs long at once. By comparison, previous models worked with short fragments, as if trying to understand a symphony by listening to individual notes.
Architecture that changes the game
From a technical standpoint, AlphaGenome is built on a U-Net architecture with a transformer and contains 'only' 450 million parameters. Yes, that is laughably small compared to language models that operate with billions of parameters. But keep in mind: DNA uses just four bases, A, T, C, and G. The entire human genome is about 3 billion pairs of these letters. The model is tailored to one specific task and performs it brilliantly.
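To make the four-letter alphabet concrete, here is a minimal sketch of one-hot encoding, a common way to turn a DNA string into numbers for a neural network. The article does not say how AlphaGenome encodes its input, so the function name `one_hot_encode`, the column order in `BASES`, and the handling of unknown bases are all assumptions made for illustration.

```python
import numpy as np

BASES = "ACGT"  # column order is an arbitrary choice for this sketch


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (len(sequence), 4) array; unknown bases such as 'N' become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(sequence), len(BASES)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        column = index.get(base)
        if column is not None:  # skip anything that is not A, C, G, or T
            encoded[position, column] = 1.0
    return encoded


example = one_hot_encode("ACGTN")
print(example.shape)
print(example)
```

Each base becomes a row with a single 1 in it, so a million-base segment is simply a matrix with a million rows and four columns.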
The system operates as a multi-level translator: it first encodes the DNA sequence, then the transformer analyzes distant connections between regions, and the decoder reconstructs the result back to the level of individual bases. This allows for predictions at varying resolutions, from detailed analysis of individual mutations to an overall picture of gene regulation.
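The encode, attend, decode flow described above can be sketched as a toy in NumPy. This is not AlphaGenome's actual network: there are no learned weights, the pooling factor is arbitrary, and the function names (`downsample`, `self_attention`, `upsample`, `toy_forward`) are hypothetical. It only shows how the shapes move from base-level resolution to a coarse one and back.

```python
import numpy as np


def downsample(x: np.ndarray, factor: int) -> np.ndarray:
    """Encoder stand-in: average-pool non-overlapping windows of `factor` positions."""
    length, channels = x.shape
    return x.reshape(length // factor, factor, channels).mean(axis=1)


def self_attention(x: np.ndarray) -> np.ndarray:
    """Transformer stand-in: scaled dot-product attention with identity projections."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability for the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x  # every coarse position mixes in information from all others


def upsample(x: np.ndarray, factor: int) -> np.ndarray:
    """Decoder stand-in: repeat each coarse position `factor` times."""
    return np.repeat(x, factor, axis=0)


def toy_forward(x: np.ndarray, factor: int = 4) -> np.ndarray:
    coarse = downsample(x, factor)      # (L, C) -> (L / factor, C)
    mixed = self_attention(coarse)      # long-range mixing at coarse resolution
    return upsample(mixed, factor) + x  # back to (L, C), plus a U-Net-style skip connection


rng = np.random.default_rng(0)
x = rng.random((16, 4))  # 16 positions, 4 channels; length must be divisible by factor
out = toy_forward(x)
print(x.shape, downsample(x, 4).shape, out.shape)
```

The point of the design is that attention runs over the shortened sequence, where comparing every position with every other is affordable, while the skip connection keeps the base-level detail that pooling throws away.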
Results that impress
AlphaGenome outperformed existing models in 46 out of 50 tests for predicting regulatory functions and the impact of genetic variants. Such 'clean victories' in bioinformatics are rare—usually improvements are measured in percentages.
The model can predict how a mutation will affect gene function in just a few seconds. Previously, such analysis required weeks of laboratory experiments. Moreover, it can model gene expression, splicing events, chromatin states, and even the three-dimensional structure of the genome.
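In rough terms, predicting the effect of a mutation means scoring the sequence twice, once with the reference base and once with the altered base, and comparing the two results. The sketch below illustrates only that comparison. The predictor `toy_predict` is a deliberately trivial stand-in (the GC fraction of the sequence), and every name here is hypothetical; none of it is the AlphaGenome API.

```python
from typing import Callable


def apply_snv(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return the sequence with a single-base substitution at a 0-based position."""
    if sequence[position] != ref:
        raise ValueError(f"expected {ref!r} at position {position}, found {sequence[position]!r}")
    return sequence[:position] + alt + sequence[position + 1:]


def toy_predict(sequence: str) -> float:
    """Stand-in for a real model: returns the GC fraction as a fake 'predicted signal'."""
    return sum(base in "GC" for base in sequence) / len(sequence)


def variant_effect(
    sequence: str,
    position: int,
    ref: str,
    alt: str,
    predict: Callable[[str], float] = toy_predict,
) -> float:
    """Score = prediction for the mutated sequence minus prediction for the reference."""
    mutated = apply_snv(sequence, position, ref, alt)
    return predict(mutated) - predict(sequence)


reference = "ATATATATAT"
effect = variant_effect(reference, 2, "A", "G")
print(f"{effect:.2f}")  # 0.10: one of ten bases changed from A to G
```

With a real model in place of `toy_predict`, the same difference could be taken per output track, such as expression or splicing, rather than as a single number.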
Training the model took only four hours on Google's TPU accelerators and used half the computing resources of its predecessor, Enformer. AlphaGenome was trained on a vast array of public data (ENCODE, GTEx, 4D Nucleome, and FANTOM5) that includes thousands of experimental profiles from various types of human and mouse cells.
Accessibility changes everything
Most importantly, Google DeepMind has made AlphaGenome available to researchers through an API for non-commercial research. The company has also provided extensive documentation and community support on GitHub. This radically changes the situation in genomics, which has long been closed off in specialized laboratories with costly databases.
Yes, the model is not yet fully open—researchers cannot download and run it locally. But the API and accompanying resources on GitHub allow scientists worldwide to generate predictions and adapt analyses for different species or cell types. DeepMind promises a broader open release in the future.
What this means for medicine
The ability to analyze non-coding DNA variants—where most mutations associated with diseases lie—opens new horizons in understanding genetic disorders and rare diseases. High-speed variant analysis also supports personalized medicine, where treatment is tailored to the unique genetic profile of the patient.
AlphaGenome is part of a growing ecosystem of AI tools for biology. The Ankh model from the universities of Munich and Columbia processes protein sequences as a language, creating new proteins. GenSLMs from Nvidia predicts viral mutations for pandemic research. AI is already helping in the development of chemical and genetic methods to combat aging.
The non-coding genome is no longer a black box, and AI's role in genomics will only expand. AlphaGenome may not lead us to Huxley's 'brave new world,' but it clearly points the way: more data, better predictions, and a deeper understanding of the mechanisms of life.