Changli Tang

Changli · TCL606

AI & ML interests

Speech signal processing; video understanding; multi-modal LLM

Recent Activity

new activity 11 days ago

tsinghua-ee/video-SALMONN-2_plus_7B:Question about the base model for LoRA adapter

updated a dataset 26 days ago

tsinghua-ee/AVUTBenchmark

updated a model 26 days ago

tsinghua-ee/video-SALMONN-2_plus_72B

authored 3 papers 3 months ago

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Paper • 2310.05863 • Published Oct 9, 2023 • 2

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

Paper • 2410.06682 • Published Oct 9, 2024

video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models

Paper • 2506.15220 • Published Jun 18, 2025 • 1

authored a paper 8 months ago

video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Paper • 2502.11775 • Published Feb 17, 2025 • 9

authored a paper over 1 year ago

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Paper • 2406.15704 • Published Jun 22, 2024 • 6

authored a paper about 2 years ago

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Paper • 2310.13289 • Published Oct 20, 2023 • 17