Generate a video with text highlighted as it's spoken
Generate detailed music descriptions from audio clips
Highlight sound sources in images using audio
Demo of the OSUM project from the ASLP Lab at Northwestern Polytechnical University