Update README.md
    	
        README.md
    CHANGED
    
    | 
         @@ -119,23 +119,28 @@ DeepMath-1.5B is created by finetuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 
     | 
|
| 119 | 
         | 
| 120 | 
         
             
            <sub>Difficulty distribution comparison.</sub> </div>
         
     | 
| 121 | 
         | 
| 122 | 
         
            -
            **2.  
     | 
| 123 | 
         | 
| 124 | 
         
             
            <div align="center"> <img src="./assets/github-domain.png" width="50%"/>
         
     | 
| 125 | 
         | 
| 126 | 
         
             
            <sub>Hierarchical breakdown of mathematical topics covered in DeepMath-103K.</sub></div>
         
     | 
| 127 | 
         | 
| 128 | 
         
            -
             
     | 
| 129 | 
         | 
| 130 | 
         
             
            <div align="center"> <img src="./assets/github-contamination-case.png" width="80%"/>
         
     | 
| 131 | 
         | 
| 132 | 
         
             
            <sub>Detected contamination examples. Subtle conceptual overlaps can also be identified.</sub> </div>
         
     | 
| 133 | 
         | 
| 134 | 
         
            -
            ** 
     | 
| 135 | 
         | 
| 136 | 
         
             
            <div align="center"> <img src="./assets/github-data-sample.png" width="90%"/>
         
     | 
| 137 | 
         | 
| 138 | 
         
            -
            <sub> 
     | 
| 139 | 
         | 
| 140 | 
         
             
            - **Question**: The mathematical problem statement.
         
     | 
| 141 | 
         
             
            - **Final Answer**: A reliably verifiable final answer, enabling robust rule-based reward functions for RL.
         
     | 
| 
         @@ -145,22 +150,73 @@ DeepMath-1.5B is created by finetuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 
     | 
|
| 145 | 
         | 
| 146 | 
         
             
            ## 📊Main Results
         
     | 
| 147 | 
         | 
| 148 | 
         
            -
             
     | 
| 149 | 
         | 
| 150 | 
         | 
| 151 | 
         
            -
            |          Model           | MATH 500 |  AMC23   | Olympiad Bench | Minerva Math |  AIME24  |  AIME25  |
         
     | 
| 152 | 
         
            -
            | :----------------------: | :------: | :------: | :------------: | :----------: | :------: | :------: |
         
     | 
| 153 | 
         
            -
            |     Qwen2.5-7B-Base      |   54.8   |   35.3   |      27.8      |     16.2     |   7.7    |   5.4    |
         
     | 
| 154 | 
         
            -
            |  Open-Reasoner-Zero-7B   |   81.8   |   58.9   |      47.9      |     38.4     |   15.6   |   14.4   |
         
     | 
| 155 | 
         
            -
            | Qwen-2.5-7B-SimpleRL-Zoo |   77.0   |   55.8   |      41.0      |     41.2     |   15.6   |   8.7    |
         
     | 
| 156 | 
         
            -
            |     [DeepMath-Zero-7B](https://huggingface.co/zwhe99/DeepMath-Zero-7B)     | **85.5** | **64.7** |    **51.0**    |   **45.3**   | **20.4** | **17.5** |
         
     | 
| 157 | 
         | 
| 158 | 
         
            -
            |          Model          | MATH 500 |  AMC23   | Olympiad Bench | Minerva Math |  AIME24  |  AIME25  |
         
     | 
| 159 | 
         
            -
            | :---------------------: | :------: | :------: | :------------: | :----------: | :------: | :------: |
         
     | 
| 160 | 
         
            -
            |  R1-Distill-Qwen-1.5B   |   84.7   |   72.0   |      53.1      |     36.6     |   29.4   |   24.8   |
         
     | 
| 161 | 
         
            -
            | DeepScaleR-1.5B-Preview |   89.4   |   80.3   |      60.9      |     42.2     | **42.3** |   29.6   |
         
     | 
| 162 | 
         
            -
            |  Still-3-1.5B-Preview   |   86.6   |   75.8   |      55.7      |     38.7     |   30.8   |   24.6   |
         
     | 
| 163 | 
         
            -
            |   [DeepMath-1.5B](https://huggingface.co/zwhe99/DeepMath-1.5B)         | **89.9** | **82.3** |    **61.8**    |   **42.5**   |   37.3   | **30.8** |
         
     | 
| 164 | 
         | 
| 165 | 
         
             
            ## 🙏 Acknowledgements
         
     | 
| 166 | 
         | 
| 
         @@ -171,6 +227,8 @@ This work can not be done without the help of the following works: 
     | 
|
| 171 | 
         
             
            - **[TIGER-Lab/WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)**: Instruction data from MathStackExchange and ScienceStackExchange.
         
     | 
| 172 | 
         
             
            - **[AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)**: Approximately 860k math problems.
         
     | 
| 173 | 
         | 
| 174 | 
         
             
            ## 📚 Citation
         
     | 
| 175 | 
         
             
            ```bibtex
         
     | 
| 176 | 
         
             
            @article{deepmath,
         
     | 
| 
         @@ -182,4 +240,4 @@ This work can not be done without the help of the following works: 
     | 
|
| 182 | 
         
             
              primaryClass={cs.CL},
         
     | 
| 183 | 
         
             
              url={https://arxiv.org/abs/2504.11456}, 
         
     | 
| 184 | 
         
             
            }
         
     | 
| 185 | 
         
            -
            ```
         
     | 
| 
         | 
|
| 119 | 
         | 
| 120 | 
         
             
            <sub>Difficulty distribution comparison.</sub> </div>
         
     | 
| 121 | 
         | 
| 122 | 
         
            +
            **2. Data Diversity and Novelty**: DeepMath-103K spans a wide spectrum of mathematical subjects, including Algebra, Calculus, Number Theory, Geometry, Probability, and Discrete Mathematics.
         
     | 
| 123 | 
         | 
| 124 | 
         
             
            <div align="center"> <img src="./assets/github-domain.png" width="50%"/>
         
     | 
| 125 | 
         | 
| 126 | 
         
             
            <sub>Hierarchical breakdown of mathematical topics covered in DeepMath-103K.</sub></div>
         
     | 
| 127 | 
         | 
| 128 | 
         
            +
            The problems in DeepMath-103K are novel and unique, whereas many existing datasets closely resemble and overlap with one another.
         
     | 
| 129 | 
         
            +
            <div align="center"> <img src="./assets/github-tsne.png" width="70%"/>
         
     | 
| 130 | 
         
            +
             
     | 
| 131 | 
         
            +
            <sub>Embedding distributions of different datasets.</sub></div>
         
     | 
| 132 | 
         
            +
             
     | 
| 133 | 
         
            +
            **3. Rigorous Decontamination**: Built from diverse sources, DeepMath-103K underwent meticulous decontamination against common benchmarks using semantic matching. This minimizes test set leakage and promotes fair model evaluation.
         
     | 
| 134 | 
         | 
| 135 | 
         
             
            <div align="center"> <img src="./assets/github-contamination-case.png" width="80%"/>
         
     | 
| 136 | 
         | 
| 137 | 
         
             
            <sub>Detected contamination examples. Subtle conceptual overlaps can also be identified.</sub> </div>
         
     | 
| 138 | 
         | 
| 139 | 
         
            +
            **4. Rich Data Format**: Each sample in DeepMath-103K is structured with rich information to support various research applications:
         
     | 
| 140 | 
         | 
| 141 | 
         
             
            <div align="center"> <img src="./assets/github-data-sample.png" width="90%"/>
         
     | 
| 142 | 
         | 
| 143 | 
         
            +
            <sub>An example data sample from DeepMath-103K.</sub> </div>
         
     | 
| 144 | 
         | 
| 145 | 
         
             
            - **Question**: The mathematical problem statement.
         
     | 
| 146 | 
         
             
            - **Final Answer**: A reliably verifiable final answer, enabling robust rule-based reward functions for RL.
         
     | 
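A verifiable final answer is what makes a rule-based reward possible: the reward compares the model's final answer against the reference instead of relying on a learned judge. The snippet below is a minimal sketch of that idea, assuming the model writes its answer inside `\boxed{...}` and that exact string matching after light normalization is acceptable. The function names (`extract_boxed`, `normalize`, `rule_based_reward`) are hypothetical and are not taken from the DeepMath codebase; a real verifier would also need to handle mathematically equivalent forms.

```python
import re
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, honoring nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never closed


def normalize(answer: str) -> str:
    """Light normalization (assumption): drop whitespace, '$' delimiters, trailing period."""
    answer = answer.strip().strip("$")
    answer = re.sub(r"\s+", "", answer)
    return answer.rstrip(".")


def rule_based_reward(response: str, final_answer: str) -> float:
    """Binary reward: 1.0 if the boxed answer matches the reference, else 0.0."""
    predicted = extract_boxed(response)
    if predicted is None:
        return 0.0
    return 1.0 if normalize(predicted) == normalize(final_answer) else 0.0
```

Exact matching is deliberately strict: it never rewards a wrong answer, but it will miss equivalent forms such as `0.5` versus `\frac{1}{2}`.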
| 
         | 
|
| 150 | 
         | 
| 151 | 
         
             
            ## 📊Main Results
         
     | 
| 152 | 
         | 
| 153 | 
         
            +
            DeepMath series models achieve many **SOTA** results on challenging math benchmarks:
         
     | 
| 154 | 
         
            +
             
     | 
| 155 | 
         
            +
            <div align="center"> <img src="./assets/github-main.png" width="90%"/>
         
     | 
| 156 | 
         
            +
             
     | 
| 157 | 
         
            +
            <sub>Math reasoning performance.</sub> </div>
         
     | 
| 158 | 
         
            +
             
     | 
| 159 | 
         
            +
             
     | 
| 160 | 
         
            +
            ## 🎯Quick Start
         
     | 
| 161 | 
         
            +
             
     | 
| 162 | 
         
            +
            #### Environment Preparation
         
     | 
| 163 | 
         
            +
             
     | 
| 164 | 
         
            +
            ```shell
         
     | 
| 165 | 
         
            +
            git clone --recurse-submodules https://github.com/zwhe99/DeepMath.git && cd DeepMath
         
     | 
| 166 | 
         
            +
             
     | 
| 167 | 
         
            +
            conda create -y -n deepmath python=3.12.2 && conda activate deepmath
         
     | 
| 168 | 
         
            +
            pip3 install "ray[default]"
         
     | 
| 169 | 
         
            +
            pip3 install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
         
     | 
| 170 | 
         
            +
            pip3 install flash-attn==2.7.4.post1 --no-build-isolation
         
     | 
| 171 | 
         
            +
            pip3 install omegaconf==2.4.0.dev3 hydra-core==1.4.0.dev1 antlr4-python3-runtime==4.11.0 vllm==0.7.3
         
     | 
| 172 | 
         
            +
            pip3 install "math-verify[antlr4_11_0]==0.7.0" fire deepspeed tensorboardX prettytable datasets transformers==4.49.0
         
     | 
| 173 | 
         
            +
            pip3 install -e verl
         
     | 
| 174 | 
         
            +
            ```
         
     | 
| 175 | 
         
            +
             
     | 
| 176 | 
         
            +
             
     | 
| 177 | 
         | 
| 178 | 
         
            +
            #### Evaluation
         
     | 
| 179 | 
         
            +
             
     | 
| 180 | 
         
            +
            ```shell
         
     | 
| 181 | 
         
            +
            VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_ATTENTION_BACKEND=XFORMERS VLLM_USE_V1=1 VLLM_WORKER_MULTIPROC_METHOD=spawn python3 uni_eval.py \
         
     | 
| 182 | 
         
            +
                --base_model zwhe99/DeepMath-Zero-7B \
         
     | 
| 183 | 
         
            +
                --chat_template_name orz \
         
     | 
| 184 | 
         
            +
                --system_prompt_name simplerl \
         
     | 
| 185 | 
         
            +
                --output_dir  \
         
     | 
| 186 | 
         
            +
                --bf16 True \
         
     | 
| 187 | 
         
            +
                --tensor_parallel_size 8 \
         
     | 
| 188 | 
         
            +
                --data_id zwhe99/MATH \
         
     | 
| 189 | 
         
            +
                --split math500 \
         
     | 
| 190 | 
         
            +
                --max_model_len 32768 \
         
     | 
| 191 | 
         
            +
                --temperature 0.6 \
         
     | 
| 192 | 
         
            +
                --top_p 0.95 \
         
     | 
| 193 | 
         
            +
                --n 16
         
     | 
| 194 | 
         
            +
            ```
         
     | 
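The command above samples 16 responses per problem (`--n 16`) at temperature 0.6. The README does not state how `uni_eval.py` aggregates those samples; one common convention is to average correctness over the samples of each problem and then over problems. The sketch below illustrates that convention only. The function name `mean_accuracy` and the input layout are assumptions for illustration, not the script's actual interface.

```python
from statistics import mean


def mean_accuracy(correct: list[list[bool]]) -> float:
    """Average accuracy in percent.

    correct[i][j] is True when sample j for problem i was judged correct.
    Each problem contributes the fraction of its samples that were correct.
    """
    per_problem = [mean(1.0 if c else 0.0 for c in samples) for samples in correct]
    return 100.0 * mean(per_problem)


# Two problems, two samples each: one problem half right, one fully right.
score = mean_accuracy([[True, False], [True, True]])
```

Averaging over several samples reduces the variance that a single sampled response would introduce at non-zero temperature.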
| 195 | 
         
            +
             
     | 
| 196 | 
         
            +
             
     | 
| 197 | 
         
            +
             
     | 
| 198 | 
         
            +
            #### Training
         
     | 
| 199 | 
         
            +
             
     | 
| 200 | 
         
            +
            * Data Preparation
         
     | 
| 201 | 
         
            +
             
     | 
| 202 | 
         
            +
              ```shell
         
     | 
| 203 | 
         
            +
              DATA_DIR=/path/to/your/data
         
     | 
| 204 | 
         
            +
              python3 verl/examples/data_preprocess/deepmath_103k.py --local_dir $DATA_DIR
         
     | 
| 205 | 
         
            +
              ```
         
     | 
| 206 | 
         
            +
             
     | 
| 207 | 
         
            +
            * Start Ray
         
     | 
| 208 | 
         
            +
             
     | 
| 209 | 
         
            +
              ```shell
         
     | 
| 210 | 
         
            +
              # Head node (×1)
         
     | 
| 211 | 
         
            +
              ray start  --head --port=6379  --node-ip-address=$HEAD_ADDR --num-gpus=8
         
     | 
| 212 | 
         
            +
              
         
     | 
| 213 | 
         
            +
              # Worker nodes (×7 or ×11)
         
     | 
| 214 | 
         
            +
              ray start  --address=$HEAD_ADDR:6379 --node-ip-address=$WORKER_ADDR --num-gpus=8
         
     | 
| 215 | 
         
            +
              ```
         
     | 
| 216 | 
         
            +
             
     | 
| 217 | 
         
            +
            * Launch training on the head node. See `scripts/train` for training scripts.
         
     | 
| 218 | 
         | 
| 219 | 
         | 
| 220 | 
         | 
| 221 | 
         
             
            ## 🙏 Acknowledgements
         
     | 
| 222 | 
         | 
| 
         | 
|
| 227 | 
         
             
            - **[TIGER-Lab/WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)**: Instruction data from MathStackExchange and ScienceStackExchange.
         
     | 
| 228 | 
         
             
            - **[AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)**: Approximately 860k math problems.
         
     | 
| 229 | 
         | 
| 230 | 
         
            +
             
     | 
| 231 | 
         
            +
             
     | 
| 232 | 
         
             
            ## 📚 Citation
         
     | 
| 233 | 
         
             
            ```bibtex
         
     | 
| 234 | 
         
             
            @article{deepmath,
         
     | 
| 
         | 
|
| 240 | 
         
             
              primaryClass={cs.CL},
         
     | 
| 241 | 
         
             
              url={https://arxiv.org/abs/2504.11456}, 
         
     | 
| 242 | 
         
             
            }
         
     | 
| 243 | 
         
            +
            ```
         
     |