ESPnet2 ASR model
espnet/shihlun_asr_whisper_medium_finetuned_chime4
This model was trained by Shih-Lun Wu (slseanwu) using the chime4 recipe in espnet.
Demo: How to use in ESPnet2
cd espnet
pip install -e .
cd egs2/chime4/asr1
train_set=tr05_multi_noisy_si284 # tr05_multi_noisy (original training data) or tr05_multi_noisy_si284 (add si284 data)
valid_set=dt05_multi_isolated_1ch_track
test_sets="dt05_real_isolated_1ch_track dt05_simu_isolated_1ch_track et05_real_isolated_1ch_track et05_simu_isolated_1ch_track"
asr_tag=whisper_medium_finetune_lr1e-5_adamw_wd1e-2_3epochs
asr_config=conf/tuning/train_asr_whisper_full.yaml
inference_config=conf/decode_asr_whisper_noctc_greedy.yaml
./asr.sh \
    --skip_data_prep false \
    --skip_train true \
    --skip_eval false \
    --lang en \
    --ngpu 1 \
    --nj 4 \
    --stage 1 \
    --stop_stage 13 \
    --gpu_inference true \
    --inference_nj 1 \
    --token_type whisper_multilingual \
    --feats_normalize '' \
    --max_wav_duration 30 \
    --feats_type raw \
    --use_lm false \
    --cleaner whisper_en \
    --asr_tag "${asr_tag}" \
    --asr_config "${asr_config}" \
    --inference_config "${inference_config}" \
    --inference_asr_model valid.acc.ave.pth \
    --train_set "${train_set}" \
    --valid_set "${valid_set}" \
    --test_sets "${test_sets}" "$@"
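The dataset labels in the RESULTS tables appear to combine the decoding config name, the checkpoint name, and the test set. The sketch below rebuilds those labels from the settings above. It is illustrative only: the `exp/` prefix and the `asr_exp` and `inference_tag` variable names are assumptions, not something this card states.

```shell
# Values copied from the demo settings above.
asr_tag=whisper_medium_finetune_lr1e-5_adamw_wd1e-2_3epochs
inference_config=conf/decode_asr_whisper_noctc_greedy.yaml
inference_asr_model=valid.acc.ave.pth

# Drop the directory and .yaml suffix from the config name,
# and the file extension from the checkpoint name.
inference_tag="$(basename "${inference_config}" .yaml)_asr_model_${inference_asr_model%.*}"

# Assumption: experiments are written under exp/asr_<asr_tag>.
asr_exp="exp/asr_${asr_tag}"

# Print the expected decode directory for each test set.
for dset in dt05_real_isolated_1ch_track dt05_simu_isolated_1ch_track \
            et05_real_isolated_1ch_track et05_simu_isolated_1ch_track; do
    echo "${asr_exp}/${inference_tag}/${dset}"
done
```

Pointing `inference_config` at a different decoding config changes the first part of the label, which would explain why the tables contain both `greedy` and `beam20` rows.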
RESULTS
Environments
- date: Tue Jan 10 04:15:30 CST 2023
- python version: 3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]
- espnet version: espnet 202211
- pytorch version: pytorch 1.12.1
- Git hash: d89be931dcc8f61437ac49cbe39a773f2054c50c
- Commit date: Mon Jan 9 11:06:45 2023 -0600
asr_whisper_medium_finetune_lr1e-5_adamw_wd1e-2_3epochs
WER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err | 
|---|---|---|---|---|---|---|---|---|
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 24791 | 97.8 | 1.7 | 0.5 | 0.3 | 2.5 | 24.5 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 24792 | 96.1 | 3.0 | 0.9 | 0.5 | 4.4 | 35.6 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 19341 | 96.4 | 2.9 | 0.7 | 0.5 | 4.1 | 33.0 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 19344 | 93.4 | 5.0 | 1.7 | 0.8 | 7.4 | 41.8 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 24791 | 97.7 | 1.8 | 0.5 | 0.4 | 2.8 | 25.5 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 24792 | 96.0 | 3.3 | 0.8 | 0.7 | 4.8 | 36.0 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 19341 | 96.1 | 3.3 | 0.6 | 0.7 | 4.6 | 34.9 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 19344 | 92.9 | 5.8 | 1.3 | 1.2 | 8.3 | 43.2 | 
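
In these tables all rates are percentages, and Err should equal Sub + Del + Ins up to rounding, since each column is rounded to one decimal independently. A small sanity check on the four beam20 WER rows, with the numbers copied from the table (the short row labels and the 0.15 tolerance are choices made for this sketch):

```shell
# Columns: label, Sub, Del, Ins, reported Err (beam20 WER rows).
rows='dt05_real 1.7 0.5 0.3 2.5
dt05_simu 3.0 0.9 0.5 4.4
et05_real 2.9 0.7 0.5 4.1
et05_simu 5.0 1.7 0.8 7.4'

# Sum the three error types and compare with the reported Err,
# allowing a small tolerance for per-column rounding.
report=$(echo "${rows}" | awk '{
    sum = $2 + $3 + $4
    diff = sum - $5
    if (diff < 0) diff = -diff
    printf "%s sum=%.1f reported=%.1f %s\n", $1, sum, $5, (diff < 0.15 ? "ok" : "mismatch")
}')
echo "${report}"
```

The same check can be applied to the greedy rows and to the CER table by swapping in their values.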
CER
| dataset | Snt | Chr | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 141889 | 99.1 | 0.3 | 0.5 | 0.3 | 1.2 | 24.5 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 141900 | 98.2 | 0.8 | 1.0 | 0.5 | 2.3 | 35.6 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 110558 | 98.5 | 0.7 | 0.8 | 0.5 | 1.9 | 33.0 | 
| decode_asr_whisper_noctc_beam20_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 110572 | 96.5 | 1.6 | 1.9 | 0.8 | 4.3 | 41.8 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_real_isolated_1ch_track | 1640 | 141889 | 99.1 | 0.4 | 0.5 | 0.5 | 1.3 | 25.5 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/dt05_simu_isolated_1ch_track | 1640 | 141900 | 98.2 | 0.9 | 0.9 | 0.6 | 2.4 | 36.0 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_real_isolated_1ch_track | 1320 | 110558 | 98.4 | 0.9 | 0.7 | 0.6 | 2.2 | 34.9 | 
| decode_asr_whisper_noctc_greedy_asr_model_valid.acc.ave/et05_simu_isolated_1ch_track | 1320 | 110572 | 96.3 | 2.0 | 1.7 | 1.2 | 4.9 | 43.2 | 