TR2-D2 For Enhancer DNA Design
This part of the code is for finetuning DNA sequence models for optimizing DNA enhancer activity with TR2-D2.
The codebase is partially built upon PepTune (Tang et.al, 2024), MDLM (Sahoo et.al, 2023), Drakes (Wang et.al, 2024), and MDNS (Zhu et.al, 2025).
Environment Installation
conda create -n tr2d2-dna python=3.9.18
conda activate tr2d2-dna
bash env.sh
Model Pretrained Weights Download
All data and model weights can be downloaded from the link below, which is provided by the DRAKES author. Save the downloaded file in $BASE_PATH.
For downloading using terminal, use
curl -L -o dna.zip "https://www.dropbox.com/scl/fi/zi6egfppp0o78gr0tmbb1/DRAKES_data.zip?rlkey=yf7w0pm64tlypwsewqc01wmfq&st=xe8dzn8k&dl=0"
unzip dna.zip
Finetune with TR2-D2
After downloading the pretrained checkpoints, fill in the base_path in dataloader_gosai.py, oracle.py, and finetune.sh. Fill in HOME_LOC and SAVE_PATH in finetune.sh as well.
Reproduce the DNA experiments with $\alpha = 0.1$ using
sbatch train.sh
Evaluate saved checkpoints
The checkpoints will be saved to SAVE_PATH.
Fill in RUNS_DIR in run_batch_eval.sh to be the same as SAVE_PATH. The checkpoints can be evaluated with
sbatch run_batch_eval.sh