# Cosmos-Reason2-8B-GGUF
GGUF quantizations of nvidia/Cosmos-Reason2-8B for use with llama.cpp and compatible tools.
## About the Model
NVIDIA Cosmos Reason 2 is an open, 8B-parameter reasoning vision-language model (VLM) for physical AI and robotics. It is post-trained from Qwen3-VL-8B-Instruct and understands space, time, and fundamental physics.
Key capabilities:
- Physical AI reasoning with spatio-temporal understanding
- Object detection with 2D/3D point localization and bounding boxes
- Long-context understanding up to 256K input tokens
- Video analytics, data curation, and robot planning
For full details, see the original model card.
## Quantization Details
| File | Quant | Size |
|---|---|---|
| Cosmos-Reason2-8B-F16.gguf | F16 | 16 GB |
| Cosmos-Reason2-8B-Q8_0.gguf | Q8_0 | 8.2 GB |
| Cosmos-Reason2-8B-Q4_K_M.gguf | Q4_K_M | 4.7 GB |
| mmproj-Cosmos-Reason2-8B-F16.gguf | F16 | 1.1 GB |
Note: the vision encoder (`mmproj`) is kept at F16 precision.
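As a rough guide for picking a quant, the file size plus the F16 `mmproj` approximates the weight memory required; KV cache and runtime buffers come on top. A minimal sketch of a fit check using the sizes from the table above (the 1.2× overhead factor is an assumption, not a measured value for this model):

```python
# Rough memory-fit check for the quants listed above.
# The 1.2x overhead factor for KV cache / runtime buffers is an
# assumption, not a measured value for this model.
QUANTS_GB = {"F16": 16.0, "Q8_0": 8.2, "Q4_K_M": 4.7}
MMPROJ_GB = 1.1  # vision encoder, kept at F16


def fits(quant: str, vram_gb: float, overhead: float = 1.2) -> bool:
    """Return True if the quant plus mmproj likely fits in vram_gb."""
    return (QUANTS_GB[quant] + MMPROJ_GB) * overhead <= vram_gb


for q in QUANTS_GB:
    print(q, fits(q, vram_gb=12.0))
```

For example, on a 12 GB GPU this estimate admits Q8_0 and Q4_K_M but not F16.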
## How to Use
Pick a quantization and launch `llama-server` directly from the Hub:

```shell
llama-server -hf Kbenkhaled/Cosmos-Reason2-8B-GGUF:Q8_0
llama-server -hf Kbenkhaled/Cosmos-Reason2-8B-GGUF:F16
llama-server -hf Kbenkhaled/Cosmos-Reason2-8B-GGUF:Q4_K_M
```
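Once running, `llama-server` exposes an OpenAI-compatible HTTP API. A hedged sketch of a multimodal chat request against it; the host/port (`localhost:8080`, llama-server's default) and the image URL are placeholders to adjust for your setup:

```python
import json
import urllib.request

# Build an OpenAI-style chat-completions payload mixing text and an
# image. The image URL below is a placeholder, not a real asset.
def build_request(prompt: str, image_url: str) -> dict:
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 512,
    }


payload = build_request(
    "Describe the physical interactions in this scene.",
    "https://example.com/scene.jpg",
)

# Uncomment to send against a running llama-server instance:
# req = urllib.request.Request(
#     "http://localhost:8080/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload)[:40])
```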