MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces Paper • 2510.08783 • Published 14 days ago • 4 • 2
Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs Paper • 2510.07429 • Published 15 days ago • 3 • 2
The Photographer Eye: Teaching Multimodal Large Language Models to See and Critique like Photographers Paper • 2509.18582 • Published about 1 month ago • 2 • 1
mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning Paper • 2508.10137 • Published Aug 13 • 2 • 2
Lizard: An Efficient Linearization Framework for Large Language Models Paper • 2507.09025 • Published Jul 11 • 18 • 1
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published Jul 9 • 22 • 1
MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos Paper • 2506.12623 • Published Jun 14 • 2 • 2
Forecasting Time Series with LLMs via Patch-Based Prompting and Decomposition Paper • 2506.12953 • Published Jun 15 • 2 • 2
LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles Paper • 2506.06561 • Published Jun 6 • 2 • 2
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents Paper • 2506.01344 • Published Jun 2 • 5 • 2
A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models Paper • 2505.19286 • Published May 25 • 3 • 2
Understanding Generative AI Capabilities in Everyday Image Editing Tasks Paper • 2505.16181 • Published May 22 • 24 • 2
Document Attribution: Examining Citation Relationships using Large Language Models Paper • 2505.06324 • Published May 9 • 3 • 2
InfoVids: Reimagining the Viewer Experience with Alternative Visualization-Presenter Relationships Paper • 2505.03164 • Published May 6 • 6 • 1
CORG: Generating Answers from Complex, Interrelated Contexts Paper • 2505.00023 • Published Apr 25 • 9 • 1