nics-efc's Collections

Papers from the NICS-EFFALG Team

 - R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large
  Model Token Routing
  Paper • 2505.21600 • Published • 70
 - Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models
  with Flow Matching
  Paper • 2412.17153 • Published • 39
 - Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 38
 - DiTFastAttn: Attention Compression for Diffusion Transformer Models
  Paper • 2406.08552 • Published • 25
 - Can LLMs Learn by Teaching? A Preliminary Study
  Paper • 2406.14629 • Published • 21
 - MoA: Mixture of Sparse Attention for Automatic Large Language Model
  Compression
  Paper • 2406.14909 • Published • 16
 - A Survey on Efficient Inference for Large Language Models
  Paper • 2404.14294 • Published • 3
 - ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers
  for Image and Video Generation
  Paper • 2406.02540 • Published • 3
 - MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with
  Metric-Decoupled Mixed Precision Quantization
  Paper • 2405.17873 • Published • 3
 - FrameFusion: Combining Similarity and Importance for Video Token
  Reduction on Large Visual Language Models
  Paper • 2501.01986 • Published • 1
 - Evaluating Quantized Large Language Models
  Paper • 2402.18158 • Published
 - Cache-to-Cache: Direct Semantic Communication Between Large Language
  Models
  Paper • 2510.03215 • Published • 92