Update README.md
Browse files
README.md
CHANGED
|
@@ -43,14 +43,11 @@ license_link: LICENSE
|
|
| 43 |
</a>
|
| 44 |
</div>
|
| 45 |
<p align="center">
|
| 46 |
-
<a href="#
|
| 47 |
<a href="#3-model-downloads">Model Download</a> |
|
| 48 |
-
<a href="#4-
|
| 49 |
-
<a href="#5-
|
| 50 |
-
<a href="#6-
|
| 51 |
-
<a href="#7-license">License</a> |
|
| 52 |
-
<a href="#8-citation">Citation</a> |
|
| 53 |
-
<a href="#9-contact">Contact</a>
|
| 54 |
</p>
|
| 55 |
|
| 56 |
|
|
@@ -66,7 +63,7 @@ license_link: LICENSE
|
|
| 66 |
We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
|
| 67 |
|
| 68 |
<p align="center">
|
| 69 |
-
<img width="100%" src="figures/performance.png">
|
| 70 |
</p>
|
| 71 |
|
| 72 |
|
|
@@ -102,64 +99,12 @@ We release the DeepSeek-Prover-V1.5 with 7B parameters, including base, SFT and
|
|
| 102 |
|
| 103 |
</div>
|
| 104 |
|
| 105 |
-
## 4.
|
| 106 |
-
|
| 107 |
-
### Requirements
|
| 108 |
-
|
| 109 |
-
* Supported platform: Linux
|
| 110 |
-
* Python 3.10
|
| 111 |
-
|
| 112 |
-
### Installation
|
| 113 |
-
|
| 114 |
-
1. **Install Lean 4**
|
| 115 |
-
|
| 116 |
-
Follow the instructions on the [Lean 4 installation page](https://leanprover.github.io/lean4/doc/quickstart.html) to set up Lean 4.
|
| 117 |
-
|
| 118 |
-
2. **Clone the repository**
|
| 119 |
-
|
| 120 |
-
```sh
|
| 121 |
-
git clone --recurse-submodules [email protected]:deepseek-ai/DeepSeek-Prover-V1.5.git
|
| 122 |
-
cd DeepSeek-Prover-V1.5
|
| 123 |
-
```
|
| 124 |
-
|
| 125 |
-
3. **Install Dependencies**
|
| 126 |
-
|
| 127 |
-
```sh
|
| 128 |
-
pip install -r requirements.txt
|
| 129 |
-
```
|
| 130 |
-
|
| 131 |
-
4. **Build Mathlib4**
|
| 132 |
-
|
| 133 |
-
```sh
|
| 134 |
-
cd mathlib4
|
| 135 |
-
lake build
|
| 136 |
-
```
|
| 137 |
-
|
| 138 |
-
## 5. Quick Start
|
| 139 |
-
|
| 140 |
-
You can directly use [Huggingface's Transformers](https://github.com/huggingface/transformers) for model inference. A simple example of generating a proof for a problem from miniF2F and verifying it can be found in [quick_start.py](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/master/quick_start.py).
|
| 141 |
-
|
| 142 |
-
To run paper experiments, you can use the following script to launch a RMaxTS proof search agent:
|
| 143 |
-
```sh
|
| 144 |
-
python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
|
| 145 |
-
```
|
| 146 |
-
|
| 147 |
-
You can use `CUDA_VISIBLE_DEVICES=0,1,路路路` to specify the GPU devices. The experiment results can be gathered using the following script:
|
| 148 |
-
```sh
|
| 149 |
-
python -m prover.summarize --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
|
| 150 |
-
```
|
| 151 |
-
|
| 152 |
-
## 6. Questions and Bugs
|
| 153 |
-
|
| 154 |
-
* For general questions and discussions, please use [GitHub Discussions](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/discussions).
|
| 155 |
-
* To report a potential bug, please open an issue.
|
| 156 |
-
|
| 157 |
-
## 7. License
|
| 158 |
This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
|
| 159 |
|
| 160 |
See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
|
| 161 |
|
| 162 |
-
##
|
| 163 |
```latex
|
| 164 |
@article{xin2024deepseekproverv15harnessingproofassistant,
|
| 165 |
title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
|
|
@@ -172,6 +117,6 @@ See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more
|
|
| 172 |
}
|
| 173 |
```
|
| 174 |
|
| 175 |
-
##
|
| 176 |
|
| 177 |
If you have any questions, please raise an issue or contact us at [[email protected]](mailto:[email protected]).
|
|
|
|
| 43 |
</a>
|
| 44 |
</div>
|
| 45 |
<p align="center">
|
| 46 |
+
<a href="#2-evaluation-results">Evaluation Results</a> |
|
| 47 |
<a href="#3-model-downloads">Model Download</a> |
|
| 48 |
+
<a href="#4-license">License</a> |
|
| 49 |
+
<a href="#5-citation">Citation</a> |
|
| 50 |
+
<a href="#6-contact">Contact</a>
|
|
|
|
|
|
|
|
|
|
| 51 |
</p>
|
| 52 |
|
| 53 |
|
|
|
|
| 63 |
We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
|
| 64 |
|
| 65 |
<p align="center">
|
| 66 |
+
<img width="100%" src="https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/main/figures/performance.png?raw=true">
|
| 67 |
</p>
|
| 68 |
|
| 69 |
|
|
|
|
| 99 |
|
| 100 |
</div>
|
| 101 |
|
| 102 |
+
## 4. License
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 103 |
This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
|
| 104 |
|
| 105 |
See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
|
| 106 |
|
| 107 |
+
## 5. Citation
|
| 108 |
```latex
|
| 109 |
@article{xin2024deepseekproverv15harnessingproofassistant,
|
| 110 |
title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
|
|
|
|
| 117 |
}
|
| 118 |
```
|
| 119 |
|
| 120 |
+
## 6. Contact
|
| 121 |
|
| 122 |
If you have any questions, please raise an issue or contact us at [[email protected]](mailto:[email protected]).
|