AuriStream100M_1Pred_BigAudioDataset_500k

AuriStream is a speech language model by Greta Tuckute and Klemen Kotar.

This model predicts cochlear tokens, the discrete audio tokens produced by a tokenizer such as WavCochCausalV8192.

Model Details

| Parameter | Value |
|---|---|
| Parameters | ~0.09B |
| Layers | 12 |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Vocab Size | 8192 |
| Prediction Steps | 1 |
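
As a rough sanity check on the ~0.09B figure, the parameter count can be estimated from the listed hyperparameters alone. The sketch below assumes a standard GPT-style decoder: four hidden-by-hidden attention projections and a two-layer MLP of 4x width per block, a single tied token embedding, and no biases, layer norms, or positional parameters. The actual AuriStream architecture may differ, and `estimate_transformer_params` is an illustrative name, not part of the model code. The number of attention heads does not affect the count under these assumptions.

```python
def estimate_transformer_params(n_layers, hidden_size, vocab_size, tied_embeddings=True):
    """Rough parameter estimate for a GPT-style decoder (assumed architecture)."""
    attention = 4 * hidden_size * hidden_size  # Q, K, V and output projections
    mlp = 2 * hidden_size * (4 * hidden_size)  # up- and down-projection, 4x width assumed
    blocks = n_layers * (attention + mlp)
    embedding = vocab_size * hidden_size
    if not tied_embeddings:
        embedding *= 2  # separate input embedding and output head
    return blocks + embedding


# Values taken from the Model Details listing.
estimate = estimate_transformer_params(n_layers=12, hidden_size=768, vocab_size=8192)
print(f"~{estimate / 1e9:.2f}B parameters")
```

Under these assumptions the estimate comes to roughly 91M parameters (about 98M with an untied output head), which is consistent with the ~0.09B reported above.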

Usage

from transformers import AutoModel, AutoConfig

# trust_remote_code=True is required because the model class is custom code loaded from the Hub
model = AutoModel.from_pretrained(
    "TuKoResearch/AuriStream100M_1Pred_BigAudioDataset_500k",
    trust_remote_code=True,
)

# Or load only the config first
config = AutoConfig.from_pretrained(
    "TuKoResearch/AuriStream100M_1Pred_BigAudioDataset_500k",
    trust_remote_code=True,
)
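
Once the model is loaded, the values under Model Details can be checked against the checkpoint itself. The following is a minimal sketch, assuming the remote-code class behaves like a standard PyTorch module (that is, it exposes `parameters()`). `count_parameters` and `describe_checkpoint` are illustrative helpers, not part of the AuriStream code, and the config field names printed depend on the custom config class.

```python
REPO_ID = "TuKoResearch/AuriStream100M_1Pred_BigAudioDataset_500k"


def count_parameters(model):
    """Total number of elements across all parameter tensors of the model."""
    return sum(p.numel() for p in model.parameters())


def describe_checkpoint(repo_id=REPO_ID):
    """Download the checkpoint, then print its config and parameter count."""
    # Imported lazily so count_parameters can be used without transformers installed.
    from transformers import AutoConfig, AutoModel

    config = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
    print(config)  # field names depend on the custom config class
    print(f"{count_parameters(model) / 1e6:.1f}M parameters")
    return config, model
```

Calling `describe_checkpoint()` downloads the weights and executes the repository's custom code, so run it only in an environment where that is acceptable.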

Base Model Code

This checkpoint uses shared model code from TuKoResearch/AuriStream-base.

Tokenizer

This model uses cochlear tokens from WavCochCausalV8192.
