Abhirama Subramanyam

abhiram4572

http://abhiram4572.github.io

AI & ML interests

Multimodal deep learning

Recent Activity

authored a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

upvoted a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

commented on a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs



authored a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

Paper • 2509.16633 • Published Sep 20 • 1

upvoted a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

Paper • 2509.16633 • Published Sep 20 • 1

commented on a paper about 1 month ago

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

Paper • 2509.16633 • Published Sep 20 • 1

upvoted 4 papers about 1 month ago

COFAR: Commonsense and Factual Reasoning in Image Search

Paper • 2210.08554 • Published Oct 16, 2022 • 1

Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering

Paper • 2306.16713 • Published Jun 29, 2023 • 1

Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant

Paper • 2410.19144 • Published Oct 24, 2024 • 1

Mind the (Language) Gap: Towards Probing Numerical and Cross-Lingual Limits of LVLMs

Paper • 2508.17334 • Published Aug 24 • 2

New activity in DIALab/MMCricBench about 2 months ago

[bot] Conversion to Parquet

#1 opened 2 months ago by parquet-converter

authored 4 papers 2 months ago

Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant

Paper • 2410.19144 • Published Oct 24, 2024 • 1

Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering

Paper • 2306.16713 • Published Jun 29, 2023 • 1

COFAR: Commonsense and Factual Reasoning in Image Search

Paper • 2210.08554 • Published Oct 16, 2022 • 1

Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

Paper • 2211.12926 • Published Nov 23, 2022

upvoted a paper 2 months ago

Audiopedia: Audio QA with Knowledge

Paper • 2412.20619 • Published Dec 29, 2024 • 1

authored 2 papers 2 months ago

Mind the (Language) Gap: Towards Probing Numerical and Cross-Lingual Limits of LVLMs

Paper • 2508.17334 • Published Aug 24 • 2

Audiopedia: Audio QA with Knowledge

Paper • 2412.20619 • Published Dec 29, 2024 • 1

New activity in DIALab/MMCricBench 2 months ago

Improve dataset card: Add paper link and task category

#2 opened 2 months ago by nielsr

updated a dataset 2 months ago

DIALab/MMCricBench

Viewer • Updated Aug 27 • 3k • 3.95k • 3

liked a dataset 2 months ago

DIALab/MMCricBench

Viewer • Updated Aug 27 • 3k • 3.95k • 3

published a dataset 2 months ago

DIALab/MMCricBench

Viewer • Updated Aug 27 • 3k • 3.95k • 3

upvoted an article 8 months ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • Published Feb 20 • 309
