metadata
title: ASI V2.5 ZeroGPU Demo
emoji: 🚀
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit

🚀 ASI V2.5: ZeroGPU H200 Performance Demo

ASI V2.5 tested on a real NVIDIA H200 GPU!

This Space demonstrates the Adaptive Structured Intelligence V2.5 attention mechanism running on ZeroGPU H200 with 70GB VRAM - finally showing the real speedups ASI was designed for!

✨ Features

  • 🚀 ZeroGPU H200: NVIDIA H200 with 70GB VRAM (FREE with Pro!)
  • 🔥 Real GPU Performance: Actual CUDA optimizations and tensor cores
  • 📈 Long Sequences: Test up to 8192 tokens (impossible on CPU)
  • ⚡ Mixed Precision: FP16 optimization for H200
  • 🎯 Adaptive Attention: Linear O(L) for long sequences, Exact O(L²) for short
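
As a rough illustration of the adaptive switch in the last bullet, the NumPy sketch below uses exact softmax attention under a length threshold and a kernelized linear attention above it. It is not the ASI V2.5 implementation: the `elu(x) + 1` feature map, the function names, and the default `threshold=512` are all assumptions made for this example.

```python
import numpy as np


def exact_attention(q, k, v):
    """Softmax attention; builds the full L x L score matrix, so O(L^2)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def feature_map(x):
    """elu(x) + 1, a positive feature map (assumed here, not taken from ASI)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Kernelized attention; never forms the L x L matrix, so O(L) in length."""
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                # (d, d_v) summary of all keys and values
    norm = q_f @ k_f.sum(axis=0)  # (L,) per-query normalizer
    return (q_f @ kv) / norm[:, None]


def adaptive_attention(q, k, v, threshold=512):
    """Exact attention for short inputs, linear attention for long ones."""
    if q.shape[0] <= threshold:
        return "exact", exact_attention(q, k, v)
    return "linear", linear_attention(q, k, v)
```

Both branches return an output of the same shape, so a caller can switch on sequence length without changing the surrounding code.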

🎯 Performance Expectations

| Sequence Length | Expected Speedup | Attention Type |
|-----------------|------------------|----------------|
| 1024 tokens     | 1.2-1.5x         | Linear         |
| 2048 tokens     | 1.5-2.0x         | Linear         |
| 4096 tokens     | 2.0-2.5x         | Linear         |
| 8192 tokens     | 2.5x+            | Linear         |
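
To see why the expected speedup grows with sequence length, here is a back-of-the-envelope cost model. It counts multiply-adds for the attention step only and assumes a head dimension and feature dimension of 64 (illustrative values, not taken from this Space). These are theoretical operation counts, not timings, and they do not reproduce the end-to-end figures in the table; only the trend (the gap widens as L grows) carries over.

```python
def exact_cost(seq_len, head_dim):
    """Multiply-adds for Q @ K^T plus weights @ V: two L x L x d products."""
    return 2 * seq_len * seq_len * head_dim


def linear_cost(seq_len, head_dim, feature_dim):
    """Multiply-adds for K^T @ V plus Q @ (K^T V): two L x f x d products."""
    return 2 * seq_len * feature_dim * head_dim


def theoretical_ratio(seq_len, head_dim=64, feature_dim=64):
    """Exact cost divided by linear cost; simplifies to seq_len / feature_dim."""
    return exact_cost(seq_len, head_dim) / linear_cost(seq_len, head_dim, feature_dim)


def score_matrix_bytes(seq_len, bytes_per_element=2):
    """Memory for one L x L score matrix (2 bytes per element in FP16)."""
    return seq_len * seq_len * bytes_per_element


for n in (1024, 2048, 4096, 8192):
    mib = score_matrix_bytes(n) / 2**20
    print(f"L={n}: op-count ratio {theoretical_ratio(n):.0f}x, "
          f"score matrix {mib:.0f} MiB per head")
```

Under these assumptions the op-count ratio is simply L / feature_dim, and a single FP16 score matrix at 8192 tokens already takes 128 MiB per head.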

🔧 How to Use

  1. Configure ASI parameters (threshold, feature_dim, etc.)
  2. Set sequence lengths to test (try long ones!)
  3. Click "Run ZeroGPU ASI V2.5 Test"
  4. See REAL GPU speedups!
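
The steps above boil down to parsing a list of sequence lengths, timing a baseline against a candidate, and reporting the ratio. The sketch below shows one way to do that with the standard library. All names (`parse_seq_lengths`, `time_call`, `run_benchmark`) are hypothetical and do not come from this Space's `app.py`.

```python
import time


def parse_seq_lengths(text):
    """Turn a string such as "512, 1024" into a sorted list of unique ints."""
    lengths = {int(part) for part in text.split(",") if part.strip()}
    if any(n <= 0 for n in lengths):
        raise ValueError("sequence lengths must be positive")
    return sorted(lengths)


def time_call(fn, arg, repeats=3):
    """Best wall-clock time over several runs of fn(arg)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        # On a GPU, synchronize the device before reading the clock
        # (for example with torch.cuda.synchronize()); omitted in this sketch.
        best = min(best, time.perf_counter() - start)
    return best


def run_benchmark(baseline, candidate, seq_lengths, repeats=3):
    """Time both callables at each length and compute baseline / candidate."""
    rows = []
    for n in seq_lengths:
        t_base = time_call(baseline, n, repeats)
        t_cand = time_call(candidate, n, repeats)
        rows.append({
            "seq_len": n,
            "baseline_s": t_base,
            "candidate_s": t_cand,
            "speedup": t_base / max(t_cand, 1e-12),  # guard against zero
        })
    return rows
```

Taking the best of several runs reduces noise from warm-up and scheduling, which matters when the expected speedups are as small as 1.2x.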

💡 Why ZeroGPU?

  • ✅ FREE with HuggingFace Pro ($9/month)
  • ✅ 70GB VRAM - handles very long sequences
  • ✅ NVIDIA H200 - latest architecture
  • ✅ 8x quota with Pro priority

🏆 Technical Details

  • ASI V2.5: Adaptive attention switching
  • CUDA Optimized: Tensor cores + mixed precision
  • Memory Efficient: 70GB VRAM utilization
  • Production Ready: Real deployment example
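
For readers new to ZeroGPU, the usual deployment pattern is to wrap the GPU-bound function in the `spaces.GPU` decorator, so a GPU is attached only while that function runs. The sketch below shows the pattern with a local fallback. The `run_asi_test` entry point, its return value, and the `gpu_task` alias are hypothetical and are not taken from this Space's `app.py`.

```python
try:
    import spaces  # available on Hugging Face ZeroGPU Spaces
    gpu_task = spaces.GPU
except ImportError:
    # Local fallback (assumption): run the function unchanged without ZeroGPU.
    def gpu_task(fn):
        return fn


def pick_device():
    """Return "cuda" when PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@gpu_task
def run_asi_test(seq_lengths_text):
    """Hypothetical entry point: parse the lengths and report the device."""
    seq_lengths = [int(p) for p in seq_lengths_text.split(",") if p.strip()]
    # A real handler would build the model and time attention here.
    return {"device": pick_device(), "seq_lengths": seq_lengths}
```

Keeping the fallback means the same file can be run on a laptop for debugging before it is pushed to the Space.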

📊 Links


Finally - ASI V2.5 running on a real GPU with the speedups it was designed for! 🚀