video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
Official model release of video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
Results
Model tree for tsinghua-ee/video-SALMONN-2
Base model: Qwen/Qwen2-7B