GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation Paper • 2311.07562 • Published Nov 13, 2023
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations Paper • 2303.17839 • Published Mar 31, 2023
Learning Concise and Descriptive Attributes for Visual Recognition Paper • 2308.03685 • Published Aug 7, 2023