---
license: mit
language:
- en
datasets:
- openslr/librispeech_asr
base_model:
- facebook/hubert-base-ls960
---

# Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach

**Paper:** https://arxiv.org/abs/2410.00025

Presented at EMNLP 2024.

This branch contains the HuBERT model fine-tuned with phoneme classification on the LibriSpeech train-clean-100 subset.

See the companion repository: https://github.com/bootphon/spokenlm-phoneme.

Use it like this:

```python
from phonslm import HuBERTPhoneme

model = HuBERTPhoneme.from_pretrained("coml/hubert-phoneme-classification")
```