	---
	datasets:
	- reasonseg
	language: en
	license: other
	pipeline_tag: image-segmentation
	library_name: transformers
	tags:
	- vision
	- segmentation
	---

	# Seg-Zero-7B

	Seg-Zero-7B is the model introduced in the paper [Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement](https://huggingface.co/papers/2503.06520). It uses a decoupled architecture that pairs a reasoning model with a segmentation model. It is trained with reinforcement learning (GRPO) and no explicit reasoning data, which the paper reports leads to robust zero-shot generalization and emergent test-time reasoning.
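
For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step that gives the method its name: several completions are sampled for one prompt, each is scored, and every reward is normalized against the group's mean and spread. This is a generic illustration, not the authors' training code; the function name, the `eps` constant, the use of the population standard deviation, and the sample rewards are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one group's rewards to zero mean and roughly unit scale.

    `rewards` holds the scores of several completions sampled for the SAME
    prompt. `eps` (an illustrative choice) avoids division by zero when all
    rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical completions for one prompt, scored by some reward function.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Completions that score above their group's average receive a positive advantage and are reinforced; those below receive a negative one. No separate value network is needed to provide the baseline.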

	Code: https://github.com/dvlab-research/Seg-Zero

	## Description

	Seg-Zero-7B splits the task between two models. The reasoning model interprets the user's intent, generates an explicit reasoning chain, and produces positional prompts. The segmentation model then uses those prompts to generate pixel-level masks.
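
The hand-off between the two models can be sketched as a small parsing step: the reasoning model's text response is split into its reasoning chain and a positional prompt, and the prompt is what the segmentation model would consume. The response format below (`<think>`/`<answer>` tags wrapping a JSON object with `bbox` and `points` keys) is a hypothetical stand-in chosen for illustration; check the repository's inference script for the format the model actually emits. `PositionalPrompt` and `parse_reasoning_output` are likewise names invented for this sketch.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class PositionalPrompt:
    bbox: list    # [x1, y1, x2, y2] in pixels (assumed layout)
    points: list  # list of [x, y] pairs (assumed layout)


def parse_reasoning_output(text):
    """Split a response into (reasoning chain, positional prompt)."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    if answer is None:
        raise ValueError("no <answer> block found")
    payload = json.loads(answer.group(1))
    reasoning = think.group(1).strip() if think else ""
    return reasoning, PositionalPrompt(bbox=payload["bbox"], points=payload["points"])


# A made-up response in the assumed format.
sample = (
    "<think>The red mug is the only object not on the shelf.</think>"
    '<answer>{"bbox": [40, 60, 200, 220], "points": [[120, 140]]}</answer>'
)
reasoning, prompt = parse_reasoning_output(sample)
print(reasoning)
print(prompt)
```

Keeping the interface this narrow (a box and a few points) is what lets the two models stay decoupled: the segmentation model never sees the question text, only where to look.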

	## Usage

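	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	MODEL_ID = "Ricky06662/Seg-Zero-7B"

	# Load the reasoning model and its tokenizer from the Hugging Face Hub.
	# bfloat16 is an optional choice that reduces memory use.
	# This loads only the reasoning model; generating masks also requires the
	# segmentation model (see Installation and Inference below).
	model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
	tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
	```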

	## Installation

	```bash
	git clone https://github.com/dvlab-research/Seg-Zero.git
	cd Seg-Zero
	conda create -n seg_zero python=3.11
	conda activate seg_zero
	pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
	pip install -e .
	pip install sam2
	pip install matplotlib
	```
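
After installing, it can be useful to confirm that the environment actually contains the pinned PyTorch packages. The following is a minimal sketch using only the standard library; the helper names are illustrative, and the only versions it checks are the three pins from the commands above. Local build suffixes such as `+cu124` are ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the installation commands.
PINS = {"torch": "2.5.1", "torchvision": "0.20.1", "torchaudio": "2.5.1"}


def base_version(v):
    """Drop a local build suffix such as '+cu124'."""
    return v.split("+", 1)[0]


def find_mismatches(installed, pins=PINS):
    """Return one message per package that is missing or off its pin."""
    problems = []
    for name, wanted in pins.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif base_version(found) != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


def installed_versions(names):
    """Look up installed versions; missing packages map to None."""
    out = {}
    for name in names:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = None
    return out


if __name__ == "__main__":
    report = find_mismatches(installed_versions(PINS))
    for line in report or ["all pinned versions match"]:
        print(line)
```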

	## Inference

	```bash
	python inference_scripts/infer.py
	```

	The default question is:

	> "the unusual object in the image."

	The script prints the model's reasoning to the terminal and saves the mask in the `inference_scripts` folder. To use your own image and question, pass `--image_path` and `--text`:

	```bash
	python inference_scripts/infer.py --image_path "your_image_path" --text "your question text"
	```