Apply for community grant: Academic project (gpu and storage)

#2
by fuvty - opened

Cache-to-Cache (C2C) enables Large Language Models to communicate directly through their KV-Caches, bypassing text generation. By projecting and fusing KV-Caches between models, C2C achieves 8.5–10.5% higher accuracy than individual models and 3.0–5.0% better performance than text-based communication, with a 2.0× speedup in latency. Thank you so much for your help and support!
It has received a lot of attention on X: https://x.com/jiqizhixin/status/1985219136000299215
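
For readers unfamiliar with the idea, here is a minimal sketch of what "projecting and fusing KV-Caches between models" could look like, assuming per-layer linear projections and a learned sigmoid gate. The module and parameter names (`KVCacheFuser`, `src_dim`, `tgt_dim`) are illustrative only and are not taken from the C2C codebase.

```python
import torch
import torch.nn as nn

class KVCacheFuser(nn.Module):
    """Illustrative sketch: project a sharer model's KV-cache into the
    receiver model's KV space, then fuse the two with a learned gate.
    This is an assumption-based example, not the C2C implementation."""

    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        self.k_proj = nn.Linear(src_dim, tgt_dim)   # align key dimensions
        self.v_proj = nn.Linear(src_dim, tgt_dim)   # align value dimensions
        self.gate = nn.Linear(2 * tgt_dim, 1)       # per-token fusion weight

    def forward(self, src_k, src_v, tgt_k, tgt_v):
        # src_*: [batch, seq, src_dim]; tgt_*: [batch, seq, tgt_dim]
        k_mapped = self.k_proj(src_k)
        v_mapped = self.v_proj(src_v)
        # Gate decides, per token, how much of the projected cache to keep.
        g = torch.sigmoid(self.gate(torch.cat([k_mapped, tgt_k], dim=-1)))
        fused_k = g * k_mapped + (1 - g) * tgt_k
        fused_v = g * v_mapped + (1 - g) * tgt_v
        # The fused cache would replace the receiver's own KV-cache for that layer.
        return fused_k, fused_v
```

In this sketch the receiver model attends over the fused cache instead of generating and re-reading text from the sharer, which is where the claimed latency savings would come from.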
