Improve model card: Add pipeline tag, library_name, paper, code, usage, and additional tags
#1 · opened by nielsr (HF Staff)
This PR significantly enhances the model card for Senqiao/VisionThink-General by:
- Adding `pipeline_tag: image-text-to-text` to enable better discoverability for multimodal tasks on the Hugging Face Hub.
- Specifying `library_name: transformers`, as the model is compatible with the Hugging Face Transformers library.
- Including additional relevant `tags` such as `vision-language-model`, `multimodal`, and `qwen` (see the metadata sketch after this list).
- Providing a detailed description of the model, summarizing its core contributions from the paper.
- Including a direct link to the official paper: VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning.
- Adding a direct link to the official GitHub repository for the code: https://github.com/dvlab-research/VisionThink.
- Incorporating key highlights of the model's capabilities.
- Adding installation instructions and a practical Python code snippet for quick inference using `transformers` (a minimal sketch follows at the end of this description).
- Including the citation and acknowledgement sections.
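
For reference, the block below is a small illustrative sketch (not part of the PR diff itself) of the metadata fields listed above, expressed with `huggingface_hub`'s `ModelCardData` helper; the printed YAML corresponds to the front matter added at the top of the card's README.md.

```python
# Illustrative only: the metadata fields this PR adds to the card's YAML
# front matter, built with huggingface_hub's ModelCardData helper.
from huggingface_hub import ModelCardData

card_data = ModelCardData(
    pipeline_tag="image-text-to-text",  # surfaces the model under the image-text-to-text task filter
    library_name="transformers",        # tells the Hub the model loads with the Transformers library
    tags=["vision-language-model", "multimodal", "qwen"],
)

# Prints the YAML block that sits between the `---` markers at the top of README.md.
print(card_data.to_yaml())
```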
These additions will make the model more discoverable, informative, and user-friendly for researchers and practitioners.
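
As a companion to the usage section added in this PR, the following is a minimal inference sketch, assuming the checkpoint is Qwen2.5-VL-compatible and loadable through the Transformers auto classes; the image URL is a placeholder, and the exact snippet in the model card may differ.

```python
# Minimal inference sketch; assumes a Qwen2.5-VL-compatible checkpoint loadable
# via Transformers auto classes. The image URL below is a placeholder.
import requests
import torch
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Senqiao/VisionThink-General"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Load an example image and build a chat-style prompt with one image and one question.
image = Image.open(requests.get("https://example.com/demo.jpg", stream=True).raw)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# The processor's chat template inserts the image placeholder tokens for us.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate and decode only the newly produced tokens.
output_ids = model.generate(**inputs, max_new_tokens=128)
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```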