# DiariZen
DiariZen is a speaker diarization toolkit driven by AudioZen and Pyannote 3.1.
This repository provides the pre-trained DiariZen model described in *BUT System for the MLC-SLM Challenge*. The EEND component is built on WavLM-Large and Conformer layers. The model was pre-trained on far-field, single-channel audio from a diverse set of public datasets: AMI, AISHELL-4, AliMeeting, NOTSOFAR-1, MSDWild, DIHARD3, RAMC, and VoxConverse. Structured pruning was then applied at 80% sparsity, and the pruned model was fine-tuned on MLC-SLM data. Please ensure non-commercial usage of this model, in accordance with the CC BY-NC 4.0 license.
```python
from diarizen.pipelines.inference import DiariZenPipeline

# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-large-s80-mlc")
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav')

# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
```

```python
# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
    "BUT-FIT/diarizen-wavlm-large-s80-mlc",
    rttm_out_dir='.'
)
# apply diarization pipeline; the RTTM file is named after the session
diar_results = diar_pipeline('audio.wav', sess_name='session_name')
```
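If you want to post-process the saved output without the pipeline, a minimal sketch for reading it back follows. It assumes the file uses the standard NIST RTTM layout (`SPEAKER <session> <chan> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>`); the helper name `read_rttm` is illustrative, not part of the DiariZen API.

```python
def read_rttm(path):
    """Parse a standard RTTM file into (start, end, speaker) tuples.

    Assumes NIST RTTM columns:
    SPEAKER <session> <chan> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>
    """
    segments = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            # skip blank lines and any non-SPEAKER records
            if not fields or fields[0] != "SPEAKER":
                continue
            onset, duration = float(fields[3]), float(fields[4])
            segments.append((onset, onset + duration, fields[7]))
    return segments
```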
DER (%) of the Pyannote baseline and DiariZen, evaluated with no collar.

| Dataset | Pyannote | DiariZen |
|---|---|---|
| English-American | 20.18 | 15.88 | 
| English-Australian | 13.76 | 10.82 | 
| English-British | 18.85 | 12.07 | 
| English-Filipino | 13.19 | 10.28 | 
| English-Indian | 8.19 | 6.04 | 
| French | 22.62 | 17.33 | 
| German | 22.33 | 16.35 | 
| Italian | 10.64 | 8.85 | 
| Japanese | 26.46 | 17.81 | 
| Korean | 23.25 | 16.36 | 
| Portuguese | 17.60 | 14.77 | 
| Russian | 11.37 | 9.99 | 
| Spanish | 12.92 | 10.82 | 
| Thai | 10.90 | 10.62 | 
| Vietnamese | 14.64 | 12.69 | 
| Average | 16.44 | 12.71 | 
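For context, the DER values above follow the standard definition: the total duration of false-alarm, missed-speech, and speaker-confusion errors, divided by the total reference speech duration (here with no forgiveness collar around segment boundaries):

$$
\mathrm{DER} = \frac{T_{\text{false alarm}} + T_{\text{missed}} + T_{\text{confusion}}}{T_{\text{total speech}}}
$$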
If you find this work helpful, please consider citing:
```bibtex
@article{polok2025but,
  title={BUT System for the MLC-SLM Challenge},
  author={Polok, Alexander and Han, Jiangyu and Klement, Dominik and Cornell, Samuele and {\v{C}}ernock{\`y}, Jan and Burget, Luk{\'a}{\v{s}}},
  journal={arXiv preprint arXiv:2506.13414},
  year={2025}
}
```