Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023
Let's Think Frame by Frame: Evaluating Video Chain of Thought with Video Infilling and Prediction Paper • 2305.13903 • Published May 23, 2023
Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings Paper • 2305.02317 • Published May 3, 2023
WikiWhy: Answering and Explaining Cause-and-Effect Questions Paper • 2210.12152 • Published Oct 21, 2022
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis Paper • 2210.05035 • Published Oct 10, 2022
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation Paper • 2406.08656 • Published Jun 12, 2024
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts Paper • 2406.16851 • Published Jun 24, 2024