Skip to content

NTU-AI4X/ConceptSeg-R1

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

55 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ConceptSeg-R1

ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning

arXiv Project Page HuggingFace Dataset License Stars

IntroductionGet StartedDataCheckpoints

🎬 Short Video

📰 News

  • May 2026 — arXiv paper released 🎉

🗺️ Roadmap

Status Item
arXiv paper
Training code
Testing code
CI-CD-CR datasets
ConceptSeg-R1 (7B weights)
Release larger MLLM weights, e.g., ConceptSeg-R1-32B,ConceptSeg-R1-72B

Introduction

🌍 As segmentation in computer vision shifts from objects to concepts,

🚀 ConceptSeg-R1 takes the first step toward segmenting any concept.


Key Contributions

  • 🌳 From Objects to Concepts
    We introduce a three-level concept hierarchy covering CI, CD, and CR concepts, pushing segmentation beyond category recognition.

  • 🔁 From Instance Solving to Rule Induction
    Meta-GRPO enables the model to infer transferable task rules from visual demonstrations and apply them deductively to unseen queries.

  • 🔗 Latent Concept Tokens for Frozen SAM 3
    We map MLLM reasoning states into implicit concept tokens in the SAM 3 prompt space, enabling reasoning-aware segmentation without fine-tuning SAM 3.

  • ⚡ From Heavy Reasoning to Adaptive Inference
    The Shortcut Router dynamically balances SAM 3 efficiency and reasoning depth, enabling fast perception for simple cases and deeper reasoning for complex concepts.

Results

Concept Segmentation Benchmarks (CI / CD / CR)


Cityscapes Performance (Zero-Shot)


ReasonSeg Performance (Zero-Shot)

Qualitative Comparison


Concept Coexistence


Get Started

1. Environment Setup

1. Environment Setup

Before running setup.sh, download the release assets below from GitHub Releases and place them in the repository root:

  • sam3-main.zip: the modified SAM 3 package used by ConceptSeg-R1.
  • all_meta.json.zip: the training metadata file.
conda create -n conceptseg-r1 python=3.10
conda activate conceptseg-r1
bash setup.sh
2. Training

2. Training

Prepare data — Download the dataset, extract all_meta.json through setup.sh, and set your image_folders path in the shell scripts.

# Stage 1: SFT Training
bash run_grpo_multiimage_stage1.sh

# Stage 2: GRPO Training
# Note: Set `model_path` to the Stage 1 output checkpoint before running. (If you training encounter unexpected GPU OOM   despite sufficient VRAM,  try changing transformers_version to "4.49.0" in model_path/generation_config.json.)
bash run_grpo_multiimage_stage2.sh
3. Evaluation

3. Evaluation

Concept Segmentation — Download weights, set the model path in eval_conceptseg.sh, then run:

bash eval_conceptseg.sh

Tip: Configure specific tasks for testing inside eval_conceptseg.sh.

Reasoning Segmentation — Download weights, set the model path in eval_reasonseg.sh, then run:

bash eval_reasonseg.sh
4. Inference

4. Inference

Quick Start: The inference.sh script includes 4 test cases covering different usage scenarios.

# Test 4  cases
bash run_scripts/inference.sh

Single Example Inference — For quick testing and demonstration, use the inference script:

# Or test a specific case
python src/eval/inference_single_example.py \
    --model_path "path/to/model" \
    --infer_path "path/to/image" \
    --question "concept description" \
    --output_path "output/path"

Supported Input Modes:

  • Single Image: Basic concept segmentation with text prompt (set --ref_path and --bbox to empty)
  • Multiple Images: Reference-guided segmentation with visual reasoning (set `--ref_path)
  • Bounding Boxes: Precise reference region specification for complex concepts (set `--bbox)

Data

all_meta.json is no longer tracked in this repository. Download all_meta.json.zip from GitHub Releases and run bash setup.sh to extract it before training.

Place datasets under a shared root directory (image_folders):

root/
├── isic2018/
├── rare/
├── Breast_Tumor/
├── transparent1024/
├── MGrounding-630k/
├── Polyp/
├── Shadow_detection/
├── MIG-Bench/
├── coco2014_Living/
├── CoSOD3k1024/
├── ultra_rare/
├── coco2014_Artifact/
├── fewshot1000/
├── DUTS/
├── ESDIDefects/
└── COD10K1024/

Metric

Evaluation uses the PySegMetric_EvalToolkit.

Datasets & Checkpoints

Resource Link
📦 ConceptSeg-Benchmark Dataset Download on HuggingFace
🤖 ConceptSeg-R1-7B Weights Download on HuggingFace

Acknowledgements

We reference the excellent open-source repos SAM 3, VLM-R1 and LENS. Thanks to their authors for the valuable contributions to the community.

Citation

If you find this work useful, please consider starring ⭐ and citing the repo!

@misc{zhao2026conceptseg,
      title={ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning}, 
      author={Yuan Zhao and Youwei Pang and Jiaming Zuo and Wei Ji and Kailai Zhou and Bin Fan and Yunkang Cao and Lihe Zhang and Xiaofeng Liu and Huchuan Lu and Weisi Lin and Dacheng Tao and Xiaoqi Zhao},
      year={2026},
      eprint={2605.20385},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.20385}, 
}