When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs Paper • 2509.16633 • Published Sep 20 • 1
COFAR: Commonsense and Factual Reasoning in Image Search Paper • 2210.08554 • Published Oct 16, 2022 • 1
Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering Paper • 2306.16713 • Published Jun 29, 2023 • 1
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant Paper • 2410.19144 • Published Oct 24, 2024 • 1
Mind the (Language) Gap: Towards Probing Numerical and Cross-Lingual Limits of LVLMs Paper • 2508.17334 • Published Aug 24 • 2
Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification Paper • 2211.12926 • Published Nov 23, 2022