
GROOT Condiment Handover Model - Step 2000

Model Card Summary

  • Checkpoint: Step 2000 (Final checkpoint)
  • Base Model: nvidia/GR00T-N1.5-3B
  • Task: Condiment handover on ASGARD so101_follower robot
  • Training Status: Completed successfully
  • Final Loss: ~0.006

Model Details

Model Architecture

This is a fine-tuned NVIDIA GR00T N1.5-3B model specifically trained for condiment handover tasks.

  • Model Type: GR00T (Generalist Robot 00 Technology)
  • Policy Type: GR00T N1.5-3B
  • Robot Embodiment: asgard_so101 (single arm, 6 degrees of freedom)
  • Action Dimensions: 6 (joint positions + gripper)
  • Observation: Dual camera RGB (640×480×3 each)

Training Components

Frozen (Not Trained):

  • ❌ LLM (tune_llm=false) - Language model kept frozen
  • ❌ Vision Encoder (tune_visual=false) - Visual features frozen

Trainable Components:

  • ✅ Diffusion Transformer (tune_diffusion_model=true) - Action generation
  • ✅ Projector (tune_projector=true) - Vision-language to action mapping
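
For reference, the same switches can be written as a plain configuration dict. This is only an illustration of the flags listed above; the actual fine-tuning entry point may take them as CLI arguments or a config file, and the dict name is hypothetical.

# Component switches used for this checkpoint (illustrative dict;
# how they are passed depends on your GR00T fine-tuning script).
finetune_components = {
    "tune_llm": False,             # language model frozen
    "tune_visual": False,          # vision encoder frozen
    "tune_projector": True,        # vision-language -to-action projector trained
    "tune_diffusion_model": True,  # diffusion transformer (action head) trained
}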

Training Strategy

  • Approach: Full fine-tuning (no LoRA)
  • Rationale: 4× H100 GPUs with 320GB total VRAM allow full parameter updates
  • Precision: bf16 (mixed precision training)

Training Details

Dataset Information

| Parameter | Value | Description |
|---|---|---|
| Dataset Repository | asgard-robot/asgard_training_data_condiment | Hugging Face dataset |
| Dataset Version | v3.0 | LeRobot format tag |
| Total Episodes | 40 | Number of demonstrations |
| Total Frames | 31,522 | Total training samples |
| Avg Frames/Episode | ~788 | Average trajectory length |
| Episode Duration | ~26 seconds | At 30 FPS |
| Robot Type | so101_follower | Single-arm 6 DOF |
| Task | Condiment handover | Primary objective |
| Format | LeRobot v3.0 | Parquet + MP4 videos (AV1 codec) |
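
To inspect the data locally, it can be loaded with LeRobot's LeRobotDataset. This is a minimal sketch; the import path assumes a LeRobot release where the class lives under lerobot.common.datasets (newer releases may expose it under lerobot.datasets instead).

# Minimal sketch: load the training data from the Hugging Face Hub.
# Import path assumption: lerobot.common.datasets; newer releases may
# use lerobot.datasets.lerobot_dataset instead.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("asgard-robot/asgard_training_data_condiment")
print(len(dataset))  # total frames; 31,522 for this dataset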

Training Hyperparameters

| Parameter | Value | Justification |
|---|---|---|
| Total Training Steps | 2,000 | Full training cycle |
| Number of Epochs | ~32 | Effective epochs (2,000 steps × 512 batch ÷ 31,522 frames) |
| Checkpoints Saved | 5 | Steps 400, 800, 1200, 1600, 2000 |
| Learning Rate | 1e-4 | GROOT recommended value |
| Weight Decay | 1e-5 | L2 regularization |
| Gradient Clip Norm | 1.0 | Training stability |
| Warmup Ratio | 0.05 | Gradual learning-rate ramp |
| Batch Size (per GPU) | 128 | Maximum VRAM utilization |
| Effective Batch Size | 512 | 128 × 4 GPUs |
| Num Workers | 16 | DataLoader parallel loading |
| Video Backend | torchcodec | AV1 codec decoder |
| Mixed Precision | bf16 | Memory-efficient training |
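
As a rough illustration, these values map onto Hugging Face TrainingArguments as shown below. The actual GR00T fine-tuning script wraps its own configuration, so treat this as a sketch of the settings rather than the exact invocation; output_dir is a placeholder.

from transformers import TrainingArguments

# Sketch of the hyperparameters above expressed as TrainingArguments.
# The real GR00T fine-tuning entry point may use an equivalent config
# object of its own; output_dir is a placeholder path.
training_args = TrainingArguments(
    output_dir="checkpoints/groot-condiment-handover",  # placeholder
    max_steps=2000,
    per_device_train_batch_size=128,  # × 4 GPUs -> effective batch size 512
    learning_rate=1e-4,
    weight_decay=1e-5,
    max_grad_norm=1.0,
    warmup_ratio=0.05,
    bf16=True,
    dataloader_num_workers=16,
    save_steps=400,                   # checkpoints at 400/800/1200/1600/2000
)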

Hardware Configuration

| Component | Specification | Utilization |
|---|---|---|
| GPUs | 4× NVIDIA H100 PCIe | All 4 GPUs used |
| VRAM per GPU | 80 GB | ~79.65 GB usable |
| Total VRAM | 320 GB | Peak usage ~60-70 GB per GPU |
| CPUs | 124 (AMD EPYC 9554, 64-core) | Data loading |
| System RAM | 708 GB | Adequate for data loading |
| Storage | 1.5 TB ephemeral | Checkpoint storage |
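
A quick environment check with PyTorch can confirm that a node matches this configuration (four CUDA devices with bf16 support); this is an illustrative sanity check, not part of the training pipeline.

import torch

# Sanity-check the training node against the table above:
# four H100s (~80 GB each) with bf16 support.
assert torch.cuda.device_count() == 4
assert torch.cuda.is_bf16_supported()
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(props.name, f"{props.total_memory / 1e9:.1f} GB")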

Usage

Load Model

from lerobot import Policy

# Load the step-2000 checkpoint from the Hugging Face Hub
policy = Policy.from_pretrained("asgard-robot/groot-condiment-handover")

Run Inference

# The model expects observations with:
# - observation.images.wrist1: RGB camera (640×480×3)
# - observation.images.realsense: RGB camera (640×480×3)
# - observation.state: 6D joint positions

action = policy(observation)
# Returns: 6D action space (joint positions + gripper)
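
A minimal, illustrative observation can be built with NumPy to match the keys and shapes above; the HWC image layout and dtypes are assumptions, and the exact batching/tensor conventions depend on the LeRobot/GR00T runtime in use.

import numpy as np

# Illustrative observation with dummy zero data (real deployments feed
# live camera frames and joint readings, possibly batched and converted
# to torch tensors before calling the policy).
observation = {
    "observation.images.wrist1": np.zeros((480, 640, 3), dtype=np.uint8),     # 640×480 RGB (HWC assumed)
    "observation.images.realsense": np.zeros((480, 640, 3), dtype=np.uint8),  # 640×480 RGB (HWC assumed)
    "observation.state": np.zeros(6, dtype=np.float32),                       # 6D joint positions
}

action = policy(observation)  # 6-D joint-position + gripper targets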

Action Space

The model outputs actions for 6 degrees of freedom:

  1. shoulder_pan.pos
  2. shoulder_lift.pos
  3. elbow_flex.pos
  4. wrist_flex.pos
  5. wrist_roll.pos
  6. gripper.pos
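
For readability, the returned vector can be unpacked into named targets using the ordering above:

# Unpack the 6-D action vector into named joint/gripper targets,
# following the order listed above.
joint_names = [
    "shoulder_pan.pos",
    "shoulder_lift.pos",
    "elbow_flex.pos",
    "wrist_flex.pos",
    "wrist_roll.pos",
    "gripper.pos",
]
action_dict = dict(zip(joint_names, action))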

Citation

@software{groot_condiment_model_2024,
  author = {ASGARD Team},
  title = {GROOT Condiment Handover Model - Step 2000},
  model = {asgard-robot/groot-condiment-handover},
  year = {2024},
  month = {October},
  checkpoint = {2000},
  base_model = {nvidia/GR00T-N1.5-3B},
  dataset = {asgard-robot/asgard_training_data_condiment},
  training_hardware = {4× NVIDIA H100 PCIe GPUs}
}

Acknowledgments

  • Base Model: NVIDIA GR00T N1.5-3B
  • Framework: LeRobot (ASGARD teleop control branch)
  • Dataset: ASGARD Robot Datasets
  • Hardware: Shadeform H100 Multi-GPU Cluster