---
license: apache-2.0
pipeline_tag: robotics
library_name: transformers
---
# Mantis

> This is the official checkpoint of **Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight**.

- **Paper:** https://arxiv.org/pdf/2511.16175
- **Code:** https://github.com/zhijie-group/Mantis

### 🔥 Highlights
- **Disentangled Visual Foresight** augments action learning without overburdening the backbone.
- **Progressive Training** preserves the understanding capabilities of the backbone.
- **Adaptive Temporal Ensemble** reduces inference cost while maintaining stable control.

### How to use
This is the Mantis model pretrained on the [LIBERO](https://huggingface.co/datasets/Yysrc/mantis_libero_lerobot/tree/main) spatial dataset. For detailed usage, please refer to [our repository](https://github.com/zhijie-group/Mantis).
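A minimal loading sketch using the `transformers` Auto classes (the card's declared library). The repo id `Yysrc/Mantis` is a hypothetical placeholder, and `trust_remote_code=True` assumes the checkpoint ships custom model code; the repository linked above documents the actual inference pipeline.

```python
# Hedged sketch: load a Mantis checkpoint via transformers Auto classes.
# "Yysrc/Mantis" is a placeholder repo id (assumption), and
# trust_remote_code=True assumes the checkpoint ships custom model code.
from transformers import AutoModel, AutoProcessor


def load_mantis(repo_id: str = "Yysrc/Mantis"):
    """Instantiate the model and its input processor from the Hub."""
    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
    return model, processor


if __name__ == "__main__":
    model, processor = load_mantis()
```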
### 📝 Citation
If you find our code or models useful in your work, please cite [our paper](https://arxiv.org/pdf/2511.16175):
```
@article{yang2025mantis,
  title={Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight},
  author={Yang, Yi and Li, Xueqi and Chen, Yiyang and Song, Jin and Wang, Yihan and Xiao, Zipeng and Su, Jiadi and Qiaoben, You and Liu, Pengfei and Deng, Zhijie},
  journal={arXiv preprint arXiv:2511.16175},
  year={2025}
}
```