
title: ASI V2.5 ZeroGPU Demo
emoji: 🚀
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit

🚀 ASI V2.5: ZeroGPU H200 Performance Demo

ASI V2.5 tested on a real GPU: NVIDIA H200!

This Space runs the Adaptive Structured Intelligence (ASI) V2.5 attention mechanism on a ZeroGPU NVIDIA H200 with 70GB of VRAM, so the speedups ASI was designed for can finally be measured on real GPU hardware.

✨ Features

🚀 ZeroGPU H200: NVIDIA H200 with 70GB VRAM (included with Pro at no extra cost)
🔥 Real GPU Performance: Actual CUDA optimizations and tensor cores
📈 Long Sequences: Test up to 8192 tokens (impractically slow on CPU)
⚡ Mixed Precision: FP16 optimization for H200
🎯 Adaptive Attention: Linear O(L) for long sequences, Exact O(L²) for short
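
The adaptive switching described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the ASI V2.5 implementation: the `elu(x) + 1` feature map, the default `threshold=512`, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def exact_attention(q, k, v):
    """Softmax attention: builds the full L x L score matrix, O(L^2)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def feature_map(x):
    """elu(x) + 1, a positive feature map (assumed here, not ASI's actual map)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    """Kernelized attention: never forms the L x L matrix, O(L)."""
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v           # (d, d_v) summary built in one pass over the keys
    z = phi_k.sum(axis=0)      # (d,) normalizer
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def adaptive_attention(q, k, v, threshold=512):
    """Exact attention up to `threshold` tokens, linear attention beyond it."""
    if q.shape[0] <= threshold:
        return exact_attention(q, k, v)
    return linear_attention(q, k, v)
```

Inputs are single-head arrays of shape `(L, d)`. The linear path produces an approximation of softmax attention, not the same output, which is why the exact path is kept for short sequences where the quadratic cost is small.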

🎯 Performance Expectations

Sequence Length	Expected Speedup	Attention Type
1024 tokens	1.2-1.5x	Linear
2048 tokens	1.5-2.0x	Linear
4096 tokens	2.0-2.5x	Linear
8192 tokens	2.5x+	Linear
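
The speedups in this table are ratios of baseline runtime to ASI runtime. A minimal sketch of how such a ratio could be measured is shown below; the helper names are hypothetical and this is not the benchmark code used by the Space.

```python
import time

def median_runtime(fn, repeats=5):
    """Median wall-clock time of fn() over several runs, in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

def speedup(baseline_seconds, candidate_seconds):
    """Speedup = baseline time / candidate time; above 1.0 the candidate is faster."""
    if candidate_seconds <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_seconds / candidate_seconds
```

When timing GPU code, the device has to be synchronized before each clock read (for example with `torch.cuda.synchronize()` in PyTorch), otherwise only the kernel launch is measured.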

🔧 How to Use

Configure ASI parameters (threshold, feature_dim, etc.)
Set sequence lengths to test (try long ones!)
Click "Run ZeroGPU ASI V2.5 Test"
See REAL GPU speedups!
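
As an illustration of the first two steps, the sketch below shows one way the parameters and a comma-separated list of sequence lengths could be validated before a run. `DemoConfig` and `parse_seq_lengths` are hypothetical names, and the comments on `threshold` and `feature_dim` are assumptions about what those parameters mean; the actual `app.py` may handle input differently.

```python
from dataclasses import dataclass, field

@dataclass
class DemoConfig:
    threshold: int = 512    # assumed: length above which linear attention is used
    feature_dim: int = 64   # assumed: size of the linear-attention feature map
    seq_lengths: list = field(default_factory=list)

def parse_seq_lengths(text, max_len=8192):
    """Parse '512, 1024,2048' into sorted unique ints, rejecting bad values."""
    lengths = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        value = int(part)  # raises ValueError on non-numeric input
        if not 0 < value <= max_len:
            raise ValueError(f"sequence length {value} outside 1..{max_len}")
        lengths.add(value)
    return sorted(lengths)

config = DemoConfig(seq_lengths=parse_seq_lengths("1024, 2048, 4096, 8192"))
```

The `max_len=8192` default mirrors the longest sequence listed under Features.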

💡 Why ZeroGPU?

✅ Included with Hugging Face Pro ($9/month)
✅ 70GB VRAM - handles very long sequences
✅ NVIDIA H200 - latest architecture
✅ 8x usage quota and queue priority with Pro

🏆 Technical Details

ASI V2.5: Adaptive attention switching
CUDA Optimized: Tensor cores + mixed precision
Memory Efficient: 70GB VRAM utilization
Production Ready: Real deployment example
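
To make the memory argument concrete, here is a back-of-the-envelope sketch comparing the FP16 score matrix of exact attention with the fixed-size summary kept by linear attention. The function names and the default head and batch counts are assumptions for the example, and fused kernels that never materialize the full score matrix are ignored.

```python
def score_matrix_bytes(seq_len, num_heads=1, batch=1, bytes_per_elem=2):
    """Bytes for the L x L attention scores in exact attention (FP16 = 2 bytes)."""
    return batch * num_heads * seq_len * seq_len * bytes_per_elem

def linear_state_bytes(feature_dim, head_dim, num_heads=1, batch=1, bytes_per_elem=2):
    """Bytes for the feature_dim x head_dim summary kept by linear attention."""
    return batch * num_heads * feature_dim * head_dim * bytes_per_elem

for length in (1024, 2048, 4096, 8192):
    mib = score_matrix_bytes(length) / 2**20
    print(f"{length:>5} tokens: {mib:6.0f} MiB per head for exact scores")
```

The exact-attention figure quadruples each time the sequence length doubles, reaching 128 MiB per head per batch element at 8192 tokens, while the linear-attention summary does not depend on sequence length at all.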

📊 Links

🤗 Model: khopilot/asi-v25-longformer-core
🐙 GitHub: khopilot/asi-v25-longformer-core
📦 PyPI: asi-v25-longformer

Finally - ASI V2.5 running on real GPU with the speedups it was designed for! 🚀