yueyulin committed · Commit ab1404a · verified · 1 Parent(s): 7e15c8e

Update README.md

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -13,3 +13,10 @@ pipeline_tag: automatic-speech-recognition
 
  RWKV ASR adds an audio modality to the RWKV7 model while leaving the RWKV7 base model unaltered. A 0.1B RWKV model is trained to map the whisper-large-v3 encoder's latents into RWKV7's latent space, and RWKV7 then converts the speech into text according to the text instruction.
  This design keeps all the abilities of the LLM and makes it easy to add more functions to the model, such as speech-to-speech, speech translation, etc. You name it!
+
+
+ # Usage
+ Sample inference code:
+ https://github.com/yynil/RWKVTTS/blob/respark/model/test/test_asr_whisper.py
+ 1. Download whisper-large-v3. Although only the encoder part is needed, it is still easiest to load it from the full model directory. Suppose it is stored in /home/yueyulin/models/whisper-large-v3/
+ 2.