Apply for community grant: Academic project (gpu and storage)
#2, opened by fuvty
Cache-to-Cache (C2C) enables Large Language Models to communicate directly through their KV-Caches, bypassing text generation. By projecting and fusing KV-Caches between models, C2C achieves 8.5–10.5% higher accuracy than individual models and 3.0–5.0% better performance than text-based communication, with a 2.0× speedup in latency. Thank you so much for your help and support!
The work has attracted considerable attention on X: https://x.com/jiqizhixin/status/1985219136000299215