SpaceThinker Collection: Test-time compute for quantitative spatial reasoning using synthetic reasoning traces from 3D scene graphs (7 items)
This model is Qwen3-VL-2B-Thinking fine-tuned with low-rank adapters (LoRA) on the SpaceOm dataset. SpaceOm was created with VQASynth, an open-source multimodal data synthesis pipeline inspired by SpatialVLM.
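To make the low-rank adapter idea concrete, here is a minimal NumPy sketch of a single LoRA-augmented linear layer. It is not the training code used for this model. The class name `LoRALinear`, the rank, the `alpha` scaling value, and the layer shapes are all hypothetical choices for illustration, since the card does not state the adapter hyperparameters.

```python
import numpy as np


class LoRALinear:
    """Frozen dense weight W plus a low-rank update (alpha / rank) * B @ A.

    Illustrative only: rank, alpha, and the init scale are assumed values,
    not the settings used to train this model.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight  # frozen base weight, shape (out, in)
        # A gets a small random init; B starts at zero, so the adapted
        # layer initially behaves exactly like the base layer.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_features))
        self.B = np.zeros((out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # x has shape (batch, in): base path plus scaled low-rank path.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self):
        # Fold the adapter into one dense matrix for inference.
        return self.weight + self.scale * self.B @ self.A

    def trainable_params(self):
        # Only A and B would be updated during fine-tuning.
        return self.A.size + self.B.size
```

With a 64 x 32 base weight and rank 4, the adapter holds 4 * 32 + 64 * 4 = 384 trainable values against 2048 in the frozen matrix. That reduction in trainable parameters is the main reason low-rank adapters are attractive for fine-tuning.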
@misc{qwen3technicalreport,
title = {Qwen3 Technical Report},
author = {Qwen Team},
year = {2025},
eprint = {2505.09388},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2505.09388},
}
@article{chen2024spatialvlm,
title = {SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities},
author = {Chen, Boyuan and Xu, Zhuo and Kirmani, Sean and Ichter, Brian and Driess, Danny and Florence, Pete and Sadigh, Dorsa and Guibas, Leonidas and Xia, Fei},
journal = {arXiv preprint arXiv:2401.12168},
year = {2024},
url = {https://arxiv.org/abs/2401.12168},
}
Base model: Qwen/Qwen3-VL-2B-Thinking