stephenchungmh/thinker_q1_5b

---
license: mit
base_model:
- Qwen/Qwen2.5-1.5B
tags:
- RL
- Math
---
This is the trained Thinker-Q1.5B model from the paper [Thinker: Learning to Think Fast and Slow](https://arxiv.org/abs/2505.21097). Please refer to the [GitHub repo](https://github.com/stephen-chung-mh/thinker-task) for details.
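
As a rough starting point, the checkpoint can probably be loaded like any other Qwen2.5-based causal language model through the Hugging Face `transformers` library. The sketch below is not taken from the paper or the GitHub repo. The repo id `stephenchungmh/thinker_q1_5b` is assumed from this page's path, and `build_prompt` is a hypothetical plain-text template used only for illustration. The prompt format the model was actually trained with may differ, so check the GitHub repo before relying on the output.

```python
MODEL_ID = "stephenchungmh/thinker_q1_5b"  # assumed from this page's path


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the training-time template may differ."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a completion for one question."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, skipping the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the weights on first run; requires transformers and torch.
    print(generate_answer("What is 12 * 7?"))
```

Loading is deferred into `generate_answer` so that importing the module does not trigger a download. Generation settings such as `max_new_tokens` are placeholders, not values recommended by the authors.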