| datasets: | |
| - reasonseg | |
| language: en | |
| license: other | |
| pipeline_tag: image-segmentation | |
| library_name: transformers | |
| tags: | |
| - vision | |
| - segmentation | |
| # Seg-Zero-7B | |
| This model is from the paper [Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement](https://huggingface.co/papers/2503.06520). Seg-Zero uses a decoupled architecture that pairs a reasoning model with a segmentation model. It is trained with reinforcement learning (GRPO) and no explicit reasoning data; according to the paper, this yields robust zero-shot generalization and emergent test-time reasoning. | |
| Code: https://github.com/dvlab-research/Seg-Zero | |
| ## Description | |
| This repository hosts the Seg-Zero-7B model. In its decoupled architecture, the reasoning model interprets the user's intent, generates an explicit reasoning chain, and produces positional prompts. The segmentation model then uses these prompts to generate a pixel-level mask. | |
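The hand-off between the two models can be sketched as follows. This is a minimal illustration, not the repository's code. It assumes the reasoning model wraps its final answer in `<answer>` tags containing JSON with a `bbox` and a `points` list; this card does not specify the output format, and `parse_positional_prompts` and the sample string are hypothetical.

```python
import json
import re


def parse_positional_prompts(model_output: str) -> dict:
    """Extract a bounding box and points from the reasoning model's text.

    Assumed (hypothetical) format: <think>...</think><answer>{JSON}</answer>
    """
    match = re.search(r"<answer>(.*?)</answer>", model_output, re.DOTALL)
    if match is None:
        raise ValueError("no <answer>...</answer> block found")

    answer = json.loads(match.group(1))

    # A box is four numbers: x_min, y_min, x_max, y_max (assumed order).
    bbox = [float(v) for v in answer["bbox"]]
    if len(bbox) != 4:
        raise ValueError("bbox must contain exactly four values")

    # Points are optional; each one is an (x, y) pair.
    points = [[float(x), float(y)] for x, y in answer.get("points", [])]
    return {"bbox": bbox, "points": points}


# Hypothetical reasoning-model output, for illustration only.
sample_output = (
    "<think>The kite is the only object in the sky.</think>"
    '<answer>{"bbox": [10, 20, 110, 220], "points": [[60, 120]]}</answer>'
)
prompts = parse_positional_prompts(sample_output)
print(prompts)
```

A dictionary like `prompts` is what a promptable segmentation model would consume as its box and point inputs. Keeping the parsing step separate shows that the two models communicate only through coordinates.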
| ## Usage | |
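| ```python | |
| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model_id = "Ricky06662/Seg-Zero-7B" | |
| # Load the model and tokenizer from the Hugging Face Hub. | |
| # bfloat16 halves memory use relative to float32; change it if your hardware lacks support. | |
| model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16) | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| # If the Auto class does not recognize the checkpoint, or for end-to-end segmentation, | |
| # use inference_scripts/infer.py from the repository (see Installation and Inference). | |
| ``` | |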
| ## Installation | |
| ```bash | |
| git clone https://github.com/dvlab-research/Seg-Zero.git | |
| cd Seg-Zero | |
| conda create -n seg_zero python=3.11 | |
| conda activate seg_zero | |
| pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 | |
| pip install -e . | |
| pip install sam2 | |
| pip install matplotlib | |
| ``` | |
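After installing, a quick check that the packages are visible in the active environment can save a failed inference run. The sketch below is not part of the repository. It assumes the import names match the pip package names listed above (`torch`, `torchvision`, `torchaudio`, `sam2`, `matplotlib`), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names, taken from the pip commands above.
REQUIRED_MODULES = ["torch", "torchvision", "torchaudio", "sam2", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

`find_spec` only locates each module without importing it, so the check is fast and does not initialize CUDA.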
| ## Inference | |
| ```bash | |
| python inference_scripts/infer.py | |
| ``` | |
| The default question is: | |
| > "the unusual object in the image." | |
| The script prints the model's thinking process to the command line and saves the mask in the **inference_scripts** folder. To use your own image and question, pass `--image_path` and `--text`: | |
| ```bash | |
| python inference_scripts/infer.py --image_path "your_image_path" --text "your question text" | |
| ``` |