Evo 2: DNA language model for genome modeling and design
Foundation model for DNA sequence generation and scoring.
Evo 2 is a foundation model for DNA sequence analysis that generates and scores genomic sequences at scale. It is released in multiple parameter sizes, including a 7 billion parameter variant, and works by learning a probabilistic distribution over nucleotide sequences, supporting both generative and discriminative tasks on DNA. Sampling strategies such as temperature scaling, top-k filtering, and nucleus (top-p) sampling control the stochasticity and diversity of generated sequences. The model supports mixed-precision computation with bfloat16 tensors for memory efficiency and can be deployed on GPU or CPU hardware depending on computational requirements. Generated outputs preserve biologically relevant properties such as GC content, while configurable sampling parameters allow variant generation.
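The sampling strategies named above (temperature scaling, top-k filtering, nucleus sampling) can be illustrated independently of the model itself. The sketch below is not the Evo 2 API; it is a minimal stdlib-Python example of how each filter reshapes a distribution over the four nucleotides before one base is drawn:

```python
import math
import random

def sample_nucleotide(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Draw one base from raw logits over (A, C, G, T) using common filters."""
    # Temperature scaling: divide logits before softmax; lower values
    # sharpen the distribution, higher values flatten it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    if top_k is not None:
        # Top-k filtering: zero out everything below the k-th largest probability.
        cutoff = sorted(probs, reverse=True)[top_k - 1]
        probs = [p if p >= cutoff else 0.0 for p in probs]
    if top_p is not None:
        # Nucleus sampling: keep the smallest set of bases whose mass >= top_p.
        order = sorted(range(len(probs)), key=lambda i: -probs[i])
        kept, mass = set(), 0.0
        for i in order:
            kept.add(i)
            mass += probs[i]
            if mass >= top_p:
                break
        probs = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    # Renormalize the surviving mass and sample by inverse CDF.
    total = sum(probs)
    r = rng.random() * total
    for base, p in zip("ACGT", probs):
        r -= p
        if r <= 0:
            return base
    return "T"
```

With `top_k=1` this reduces to greedy decoding (always the most likely base); combining a moderate temperature with `top_p` is the usual way to trade off diversity against fidelity.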
Extended context length
Supports up to 1 million base pair context windows, enabling analysis of large genomic regions that exceed typical sequence model capabilities.
StripedHyena 2 architecture
Built on the StripedHyena 2 architecture rather than standard transformer designs, providing alternative computational characteristics for long-context DNA modeling.
Multi-domain training data
Trained on 8.8 trillion tokens from OpenGenome2, covering sequences from all domains of life rather than limited organism sets, providing broader genomic representation.
from evo2 import Evo2

model = Evo2(model_name="evo2_7b")

prompt = "ATCGATCGATCG"

# Generate a continuation of the DNA prompt. generate() takes a list of
# prompt sequences and returns an object whose .sequences attribute
# holds the sampled completions.
output = model.generate(prompt_seqs=[prompt], n_tokens=100, temperature=0.8)
print(f"Generated sequence: {output.sequences[0]}")
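The overview states that generated outputs preserve properties such as GC content. A quick way to spot-check that claim is to compare the GC fraction of a completion against its prompt; the helper below is plain stdlib Python and independent of Evo 2 itself:

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Example: the prompt above is exactly half G/C.
assert abs(gc_content("ATCGATCGATCG") - 0.5) < 1e-9
```

In practice one would compute `gc_content` over both the prompt region and the generated continuation and check that the two fractions are close.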
Related Repositories
Discover similar tools and frameworks used by developers
sglang
High-performance inference engine for LLMs and VLMs.
llama
PyTorch inference for Meta's Llama language models.
pytorch
Python framework for differentiable tensor computation and deep learning.
open_clip
PyTorch library for contrastive language-image pretraining.
stable-diffusion
CLIP-conditioned latent diffusion model for text-to-image synthesis.