GarrickLin's Collections

 - HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
   Paper • 2411.02959 • Published • 70
 - "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
   Paper • 2411.02355 • Published • 51
 - CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
   Paper • 2410.23090 • Published • 55
 - RARe: Retrieval Augmented Retrieval with In-Context Examples
   Paper • 2410.20088 • Published • 4
 - Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
   Paper • 2410.17243 • Published • 93
 - Can Knowledge Editing Really Correct Hallucinations?
   Paper • 2410.16251 • Published • 55
 - LOGO -- Long cOntext aliGnment via efficient preference Optimization
   Paper • 2410.18533 • Published • 43
 - Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
   Paper • 2410.18693 • Published • 42
 - MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms
   Paper • 2410.18977 • Published • 15
 - Balancing Pipeline Parallelism with Vocabulary Parallelism
   Paper • 2411.05288 • Published • 20
 - M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
   Paper • 2411.04952 • Published • 30
 - OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
   Paper • 2411.04905 • Published • 127
 - Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
   Paper • 2412.04424 • Published • 63
 - VisionZip: Longer is Better but Not Necessary in Vision Language Models
   Paper • 2412.04467 • Published • 118
 - PaliGemma 2: A Family of Versatile VLMs for Transfer
   Paper • 2412.03555 • Published • 133
 - Imagine360: Immersive 360 Video Generation from Perspective Anchor
   Paper • 2412.03552 • Published • 29
 - RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
   Paper • 2412.13746 • Published • 9
 - Are Your LLMs Capable of Stable Reasoning?
   Paper • 2412.13147 • Published • 94
 - OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain
   Paper • 2412.13018 • Published • 41
 - VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
   Paper • 2412.10704 • Published • 16
 - SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner
   Paper • 2412.10533 • Published • 5
 - When to Speak, When to Abstain: Contrastive Decoding with Abstention
   Paper • 2412.12527 • Published • 4
 - RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
   Paper • 2412.11919 • Published • 36
 - Smaller Language Models Are Better Instruction Evolvers
   Paper • 2412.11231 • Published • 28
 - VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
   Paper • 2412.11279 • Published • 13
 - SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video
   Paper • 2412.09982 • Published • 7
 - TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning
   Paper • 2412.10447 • Published • 5
 - Whisper-GPT: A Hybrid Representation Audio Large Language Model
   Paper • 2412.11449 • Published • 4
 - GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
   Paper • 2412.11863 • Published • 4
 - Reliable, Reproducible, and Really Fast Leaderboards with Evalica
   Paper • 2412.11314 • Published • 2
 - GenEx: Generating an Explorable World
   Paper • 2412.09624 • Published • 97
 -
   Paper • 2412.18653 • Published • 84
 - Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
   Paper • 2412.18319 • Published • 39
 - BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
   Paper • 2501.03226 • Published • 44
 - SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering?
   Paper • 2502.13233 • Published • 15
 - SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
   Paper • 2502.18137 • Published • 58
 - GCC: Generative Color Constancy via Diffusing a Color Checker
   Paper • 2502.17435 • Published • 28
 - Self-rewarding correction for mathematical reasoning
   Paper • 2502.19613 • Published • 83