	---
	base_model:
	- aoi-ot/VibeVoice-Large
	tags:
	- text-to-speech
	- tts
	- lora
	- vibevoice
	datasets:
	- mozilla-foundation/common_voice_17_0
	language:
	- hu
	---
	# VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17
	This is a LoRA fine-tune of the VibeVoice 7B (Large) model on a Hungarian audio dataset.
	For this particular test I used the train split of the Hungarian config of the Common Voice 17.0 dataset.
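
For reference, here is a minimal sketch of how that split could be loaded with the Hugging Face `datasets` library. It assumes the dataset is still hosted on the Hub under the ID listed in this card's metadata, that you have accepted its terms and are logged in, and that your `datasets` version can load it. The function name `load_hungarian_train` is hypothetical, and this is not necessarily how the trainer itself reads the data.

```python
# Identifiers taken from this model card's metadata.
DATASET_ID = "mozilla-foundation/common_voice_17_0"
CONFIG = "hu"      # Hungarian config
SPLIT = "train"    # the split used for this fine-tune


def load_hungarian_train():
    """Return the Hungarian train split (hypothetical helper).

    Requires `pip install datasets` and access to the dataset on the Hub.
    """
    # Imported lazily so the constants above can be used without the library.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, CONFIG, split=SPLIT)
```

Calling `load_hungarian_train()` starts the download, so expect it to take a while on the first run.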

	I fine-tuned the model with the [VibeVoice-finetuning code base](https://github.com/voicepowered-ai/VibeVoice-finetuning).

	Thank you to [JPGallegoar](https://github.com/jpgallegoar-vpai) for that amazing VibeVoice trainer!

	## Inference
	To use the LoRA model, you can use [my modified fork](https://github.com/cseti007/VibeVoice)
	until [this PR](https://github.com/vibevoice-community/VibeVoice/pull/6)
	is merged into the main branch of the [VibeVoice Community repository](https://github.com/vibevoice-community/VibeVoice).
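
Before running inference you need the LoRA files on disk. The sketch below fetches this repository with `snapshot_download` from `huggingface_hub`. The helper name `download_lora` and the default folder name are hypothetical, and the `assets/*` filter only assumes that the demo audio lives under `assets/`, as the example URLs in this card suggest. How the downloaded folder is passed to the fork is not shown here, so check the fork's own instructions for that step.

```python
REPO_ID = "Cseti/VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17"


def download_lora(local_dir="vibevoice_hu_lora"):
    """Download this repository and return the local folder path (hypothetical helper).

    Requires `pip install huggingface_hub`.
    """
    # Imported lazily so the constant above can be used without the library.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        ignore_patterns=["assets/*"],  # skip the demo audio samples
    )
```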

	## Examples

	Voice without LoRA
	<div style="display: flex; gap: 20px;">
	<audio controls src="https://huggingface.co/Cseti/VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17/resolve/main/assets/synth_s42_nolora-1.wav"></audio>
	<audio controls src="https://huggingface.co/Cseti/VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17/resolve/main/assets/synth_s98765_nolora-1.wav"></audio>
	</div>


	Voice WITH LoRA
	<div style="display: flex; gap: 20px;">
	<audio controls src="https://huggingface.co/Cseti/VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17/resolve/main/assets/synth_hu-lora_srand3.wav"></audio>
	<audio controls src="https://huggingface.co/Cseti/VibeVoice_7B_Diffusion-head-LoRA_Hungarian-CV17/resolve/main/assets/synth_s42_hu-lora-1.wav"></audio>
	</div>