Evo 2 (1B Base) - Hugging Face Transformers Format
This repository contains the Evo 2 (1B Base) model, converted to the Hugging Face Transformers format.
Original Repository: arcinstitute/evo2_1b_base
Paper: Genome modeling and design across all domains of life with Evo 2
Authors: Garyk Brixi, Matthew G. Durrant, Jerome Ku, Michael Poli, et al.
Model Description
Evo 2 is a biological foundation model trained on 9.3 trillion DNA base pairs from a curated genomic atlas spanning all domains of life. It uses the StripedHyena 2 architecture to process long sequences (up to 1 million base pairs) at single-nucleotide resolution. The model is designed for tasks such as predicting the functional effects of mutations and generating novel genomic sequences.
This version has been converted to be compatible with the transformers library, allowing for easy loading and inference.
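Depending on how this conversion registers its model and tokenizer classes (an assumption; check the repository's config and any bundled modeling code), the generic `transformers` Auto classes may also work. A minimal loading sketch under that assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/this/repo" is a placeholder for your local path or the Hub repo ID.
# trust_remote_code=True is only needed if the repository ships custom modeling code.
model = AutoModelForCausalLM.from_pretrained("path/to/this/repo", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("path/to/this/repo", trust_remote_code=True)
```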
Usage
You can load and run this model using the transformers library as follows:
```python
import torch
from transformers import Evo2ForCausalLM, Evo2Tokenizer

# Replace with your local path or the Hub repo ID after uploading
model_path = "path/to/this/repo"

print(f"Loading model from {model_path}...")
model = Evo2ForCausalLM.from_pretrained(model_path)
tokenizer = Evo2Tokenizer.from_pretrained(model_path)

# Move to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Input sequence (DNA)
sequence = "ACGTACGT"
print(f"Input: {sequence}")

# Tokenize the nucleotide sequence
input_ids = tokenizer.encode(sequence, return_tensors="pt").to(device)

# Generate a continuation of the sequence
print("Generating...")
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=20)

# Decode the generated tokens back into a nucleotide string
generated_sequence = tokenizer.decode(output[0])
print(f"Output: {generated_sequence}")
```
Citation
If you use this model, please cite the original paper:
```bibtex
@article{brixi2024genome,
  title={Genome modeling and design across all domains of life with Evo 2},
  author={Brixi, Garyk and Durrant, Matthew G and Ku, Jerome and Poli, Michael and others},
  journal={bioRxiv},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}
```