groxaxo

a8d6ec0 verified about 1 month ago

metadata

license: mit
base_model:
  - openai/whisper-large-v3
tags:
  - faster
  - whisper
  - faster-whisper
  - ct2
  - large
  - v3
  - int8

faster-whisper-large-v3-int8-ct2

This repository contains the OpenAI Whisper Large v3 model converted to the CTranslate2 format with int8 quantization.

The converted model runs faster and uses less memory at inference time than the original: each int8 weight occupies one byte, a quarter of the size of a float32 weight. The trade-off is a small loss in accuracy. CTranslate2 is an inference engine for Transformer models developed by the OpenNMT project.
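To make the memory and accuracy trade-off concrete, here is a minimal sketch of symmetric int8 quantization in plain Python. It illustrates the general idea only; the exact scheme CTranslate2 applies internally may differ, and the function names and sample weights are hypothetical.

```python
def quantize_int8(row):
    """Map a row of floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(x) for x in row)
    # Assumption for this sketch: one scale per row, chosen so the largest
    # magnitude maps to 127. An all-zero row falls back to a scale of 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(x / scale))) for x in row]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.013, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 codes:", q)
print("max absolute error:", max_error)
```

Each stored value fits in one byte instead of four, and the reconstruction error stays within half a quantization step, which is why the accuracy impact is usually small.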

Model Details

Base Model: openai/whisper-large-v3
Format: CTranslate2
Quantization: int8

How to Use

You can use this model with the faster-whisper library.

First, install faster-whisper:

pip install faster-whisper

Then, you can use the model in your Python code:

from faster_whisper import WhisperModel

model_path = "groxaxo/faster-whisper-large-v3-int8-ct2"

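# Pick one of the options below. The stored weights are int8; if you request
# a different compute_type, CTranslate2 converts the weights at load time.

# Run on GPU with FP16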
# model = WhisperModel(model_path, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")

# or run on CPU with INT8
model = WhisperModel(model_path, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Replace "audio.mp3" with the path to your audio file.
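The `segments` value returned by `transcribe` is a generator, so the transcription runs as you iterate over it. As one possible next step, the sketch below turns the segments into SubRip (SRT) subtitle text. The helper names `srt_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are hypothetical and not part of faster-whisper; the helpers only assume each segment has `start`, `end`, and `text` attributes, as in the example above.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from objects exposing .start, .end and .text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(segment.start)} --> {srt_timestamp(segment.end)}\n"
            f"{segment.text.strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_path="groxaxo/faster-whisper-large-v3-int8-ct2"):
    """Transcribe a file on CPU with int8 and return the result as SRT text."""
    # Imported here so the formatting helpers work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_path, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, beam_size=5)
    return segments_to_srt(segments)
```

Calling `transcribe_to_srt("audio.mp3")` returns a string that can be written to a `.srt` file.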

Conversion

The model was converted using the ct2-transformers-converter tool from the CTranslate2 project.

ct2-transformers-converter --model openai/whisper-large-v3 --output_dir faster-whisper-large-v3-int8-ct2 --quantization int8 --copy_files tokenizer.json preprocessor_config.json
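If you want to script the conversion, for example to produce variants with other quantization types, the command above can be assembled and run from Python. This is a minimal sketch: `build_convert_command` and `run_conversion` are hypothetical helper names, and it assumes `ct2-transformers-converter` is on your PATH.

```python
import subprocess


def build_convert_command(
    model="openai/whisper-large-v3",
    output_dir="faster-whisper-large-v3-int8-ct2",
    quantization="int8",
    copy_files=("tokenizer.json", "preprocessor_config.json"),
):
    """Return the converter command as an argument list (no shell quoting needed)."""
    return [
        "ct2-transformers-converter",
        "--model", model,
        "--output_dir", output_dir,
        "--quantization", quantization,
        "--copy_files", *copy_files,
    ]


def run_conversion(**kwargs):
    """Run the conversion and raise CalledProcessError if the converter fails."""
    subprocess.run(build_convert_command(**kwargs), check=True)


# With the defaults, this prints the same command shown above.
print(" ".join(build_convert_command()))
```

For instance, `run_conversion(quantization="int8_float16", output_dir="whisper-int8-float16")` would request a different quantization type; the output directory name here is only an example.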

Disclaimer

Quantization can slightly reduce the model's accuracy. int8 quantization generally offers a good balance of speed and accuracy, but you should evaluate the model on your own data to confirm that it meets your requirements.
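One common way to run such an evaluation is to compare transcripts against reference text using word error rate (WER). The sketch below is a minimal pure-Python version; `word_error_rate` is a hypothetical helper, and it only lowercases and splits on whitespace, whereas a real evaluation would usually also normalize punctuation and numbers.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Dynamic-programming edit distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the cat sat on mat"
print("WER:", word_error_rate(reference, hypothesis))
```

Running the same audio through this model and through a higher-precision variant, then comparing their WER on your own reference transcripts, gives a direct measure of what the quantization costs for your task.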