    Biggest-Ever AI Biology Model Writes DNA on Demand

    An artificial-intelligence network trained on a vast trove of sequence data is a step toward designing completely new genomes

    • Unprecedented scale: Evo-2, the largest AI model for biology to date, was trained on 128,000 genomes from humans, plants, microbes, and other organisms, totaling 9.3 trillion DNA letters.
    • Genome design and decoding: The model can generate entire chromosomes, interpret non-coding DNA linked to diseases, and predict harmful mutations with high accuracy.
    • Open innovation platform: Freely available to researchers, Evo-2 aims to spark an “app store for biology,” enabling customizable tools for synthetic biology and medicine.

    Scientists have unveiled Evo-2, an artificial intelligence model that can design DNA sequences and interpret genomes, including their poorly understood non-coding regions. Developed by researchers at the Arc Institute, Stanford University, and NVIDIA, the tool marks a major step in merging AI with biology, one that could change how researchers engineer living organisms.

    A Leap in Scale and Complexity

    Evo-2’s power lies in its training data: 128,000 genomes from organisms as diverse as single-celled bacteria, plants, and humans. These 9.3 trillion DNA letters include both coding regions (which encode proteins) and non-coding regions (which regulate gene activity), a critical advance over earlier protein-focused models such as ESM-3, which was developed by former Meta researchers. Unlike the compact genomes of prokaryotes (bacteria and archaea), eukaryotic genomes (found in humans, animals, and plants) are riddled with intricate regulatory DNA that can influence genes hundreds of thousands of base pairs away, or even farther. Evo-2’s architecture handles this complexity by analyzing sequences up to 1 million base pairs in length, which lets it capture long-range genetic interactions.

    “This isn’t just about scale—it’s about understanding biology’s ‘dark matter,’” says Patrick Hsu, a bioengineer at the Arc Institute and co-developer of Evo-2. “Non-coding DNA has been a black box, but Evo-2 helps us decode its role in health and disease.”

    From DNA Writing to Disease Decoding

    Evo-2 isn’t just a passive observer—it’s a creative force. The model can generate synthetic chromosomes and small genomes from scratch, a feat that could accelerate efforts to engineer microbes for biofuels or design drought-resistant crops. It also excels at interpreting existing DNA, including mutations in genes like BRCA1, which is linked to breast cancer. In tests, Evo-2 predicted whether coding-region mutations would cause disease with accuracy rivaling specialized AI tools.

    But its standout feature is deciphering non-coding variants. These regions, once dismissed as “junk DNA,” are now known to regulate gene expression and contribute to conditions like autism and heart disease. By mapping their hidden logic, Evo-2 could unlock new therapies.

    Open Platform for Biological Innovation

    The team envisions Evo-2 as a foundation for global collaboration. Its code, data, and parameters are freely accessible, allowing researchers to build custom tools—a concept Hsu likens to an “app store for biology.” Early adopters might engineer viruses to target tumors, optimize gene therapies, or even resurrect extinct species. NVIDIA’s involvement also hints at future hardware integrations, potentially accelerating AI-driven drug discovery.

    Yet experts urge cautious optimism. “The engineering behind Evo-2 is impressive, but independent validation is key,” says Stanford computational genomicist Anshul Kundaje. “We need benchmarks to see how it performs in real-world scenarios.”

    Challenges and the Road Ahead

    Despite its promise, Evo-2 faces hurdles. Generating entire human genomes remains a distant goal, and ethical questions loom: Who governs AI-designed life? Could synthetic organisms escape labs? The developers emphasize transparency, noting that Evo-2’s open-source nature allows scrutiny.

    As the preprint awaits peer review, the scientific community is eager to test its limits. For now, Evo-2 stands as a milestone—a tool that bridges synthetic biology’s ambitions with AI’s raw computational might. As Hsu puts it, “We’re not just reading nature’s code anymore. We’re learning to rewrite it.”
