Papers
arxiv:2510.05875

LARA-Gen: Enabling Continuous Emotion Control for Music Generation Models via Latent Affective Representation Alignment

Published on Oct 7
Authors:
,
,
,
,
,
,

Abstract

LARA-Gen framework enables continuous emotion control in text-to-music generation by aligning internal states with an external model and using a valence-arousal space.

AI-generated summary

Recent advances in text-to-music models have enabled coherent music generation from text prompts, yet fine-grained emotional control remains unresolved. We introduce LARA-Gen, a framework for continuous emotion control that aligns the internal hidden states with an external music understanding model through Latent Affective Representation Alignment (LARA), enabling effective training. In addition, we design an emotion control module based on a continuous valence-arousal space, disentangling emotional attributes from textual content and bypassing the bottlenecks of text-based prompting. Furthermore, we establish a benchmark with a curated test set and a robust Emotion Predictor, facilitating objective evaluation of emotional controllability in music generation. Extensive experiments demonstrate that LARA-Gen achieves continuous, fine-grained control of emotion and significantly outperforms baselines in both emotion adherence and music quality. Generated samples are available at https://nieeim.github.io/LARA-Gen/.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.05875 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2510.05875 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.05875 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.