Update README.md
To load a specific model revision with HuggingFace, simply add the argument `revision`:

```python
from transformers import AutoModelForCausalLM

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B", revision="stage1-step140000-tokens294B")
```

Or, you can access all the revisions for the models via the following code snippet:

```python
from huggingface_hub import list_repo_refs

out = list_repo_refs("allenai/OLMo-2-0425-1B")
branches = [b.name for b in out.branches]
```
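Pretraining checkpoint branches follow the `stage{S}-step{N}-tokens{M}B` naming convention, so a branch list can be parsed and sorted by training step. A minimal sketch (the `parse_revision` helper and regex below are illustrative, not part of any OLMo tooling):

```python
import re

def parse_revision(name):
    """Parse a checkpoint branch name like 'stage1-step140000-tokens294B'.

    Returns (stage, step, tokens_in_billions), or None if the name does
    not follow the pretraining naming convention (e.g. 'main').
    """
    m = re.fullmatch(r"stage(\d+)-step(\d+)-tokens(\d+)B", name)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), int(m.group(3))

# Example branch names; in practice these would come from list_repo_refs.
branches = ["main", "stage1-step140000-tokens294B", "stage1-step10000-tokens21B"]
checkpoints = sorted(
    (p for p in map(parse_revision, branches) if p is not None),
    key=lambda t: t[1],  # sort by training step
)
```

Sorting by step makes it easy to pick, say, an early- and a late-pretraining checkpoint for comparison.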
### Fine-tuning
Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.

1. Fine-tune with the OLMo repository:

   ```bash
   torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config} \
     --data.paths=[{path_to_data}/input_ids.npy] \
     --data.label_mask_paths=[{path_to_data}/label_mask.npy] \
     --load_path={path_to_checkpoint} \
     --reset_trainer_state
   ```

   For more documentation, see the [GitHub README](https://github.com/allenai/OLMo/).

2. Further fine-tuning support is being developed in AI2's Open Instruct repository. Details are [here](https://github.com/allenai/open-instruct).
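The `--data.paths` and `--data.label_mask_paths` flags in recipe 1 point at tokenized data stored as NumPy `.npy` arrays. A hedged sketch of producing such files — the shapes, dtypes, and token values here are assumptions for illustration, not the OLMo data specification:

```python
import numpy as np

# Hypothetical example: token IDs for one training sequence, plus a mask
# marking which positions contribute to the loss (True) vs. are ignored (False),
# e.g. masking out prompt tokens during instruction tuning.
input_ids = np.array([[1, 15, 42, 7, 2]], dtype=np.uint16)
label_mask = np.array([[False, True, True, True, True]])

np.save("input_ids.npy", input_ids)
np.save("label_mask.npy", label_mask)
```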
### Model Description
| Stage | OLMo 2 1B | OLMo 2 7B | OLMo 2 13B | OLMo 2 32B |
|-------------------|------------|------------|------------|------------|
| Pretraining Stage 1 | 4 trillion tokens<br>(1 epoch) | 4 trillion tokens<br>(1 epoch) | 5 trillion tokens<br>(1.2 epochs) | 6 trillion tokens<br>(1.5 epochs) |
| Pretraining Stage 2 | 50B tokens (3 runs)<br>*merged* | 50B tokens (3 runs)<br>*merged* | 100B tokens (3 runs)<br>300B tokens (1 run)<br>*merged* | 100B tokens (3 runs)<br>300B tokens (1 run)<br>*merged* |
| Post-training | SFT + DPO + GRPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-0425-1b-preference-mix)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-7b-preference-mix)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-13b-preference-mix)) | SFT + DPO + GRPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-32b-pref-mix-v1)) |
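The *merged* entries in Stage 2 indicate that several independent runs were combined into a single checkpoint, which is commonly done by element-wise weight averaging ("model souping"). A minimal NumPy sketch of that idea — illustrative only, not the actual OLMo merging code:

```python
import numpy as np

def average_state_dicts(state_dicts):
    """Element-wise mean of model state dicts sharing identical keys and shapes."""
    return {
        key: np.mean([sd[key] for sd in state_dicts], axis=0)
        for key in state_dicts[0]
    }

# Toy "runs": two checkpoints, each with a single 1x2 weight matrix.
run_a = {"w": np.array([[2.0, 4.0]])}
run_b = {"w": np.array([[4.0, 8.0]])}
merged = average_state_dicts([run_a, run_b])
```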

#### Stage 1: Initial Pretraining
- Dataset: [OLMo-mix-1124](https://huggingface.co/datasets/allenai/olmo-mix-1124) (3.9T tokens)