# BELLE-2
# Belle-whisper-large-v3-turbo-zh
## Welcome

If you find this model helpful, please like it and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune.

Belle-whisper-large-v3-turbo-zh is a fine-tuned version of whisper-large-v3-turbo with enhanced Chinese speech recognition capabilities. It demonstrates a 24-64% relative improvement over whisper-large-v3-turbo on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST.

As with Belle-whisper-large-v3-zh-punct, the punctuation marks come from the punc_ct-transformer_cn-en-common-vocab471067-large model and were added to the training datasets.

## Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------:|:-----------:|
| Belle-whisper-large-v3-turbo-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |

If you want to fine-tune the model on your own datasets, please refer to the GitHub repo https://github.com/shuaijiang/Whisper-Finetune.

## CER (%) ↓

| Model | Language Tag | AISHELL-1 test (↓) | AISHELL-2 test (↓) | WenetSpeech test_net (↓) | WenetSpeech test_meeting (↓) | HKUST dev (↓) |
|:----------------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v3 | Chinese | 8.085 | 5.475 | 11.72 | 20.15 | 28.597 |
| whisper-large-v3-turbo | Chinese | 8.639 | 6.014 | 13.507 | 20.313 | 37.324 |
| Belle-whisper-large-v3-turbo-zh | Chinese | 3.070 | 4.114 | 10.230 | 13.357 | 18.944 |

It is worth noting that, compared to whisper-large-v3 and whisper-large-v3-turbo, Belle-whisper-large-v3-turbo-zh improves significantly on every benchmark.

Please cite our paper and GitHub repo when using our code, data, or model.
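The 24-64% range quoted above can be reproduced directly from the CER table. A quick sketch in plain Python (the dictionary keys are my abbreviations of the benchmark column names):

```python
# CER (%) of the baseline and the fine-tuned model, from the table above.
turbo = {"aishell1": 8.639, "aishell2": 6.014, "wenet_net": 13.507,
         "wenet_meeting": 20.313, "hkust": 37.324}
belle = {"aishell1": 3.070, "aishell2": 4.114, "wenet_net": 10.230,
         "wenet_meeting": 13.357, "hkust": 18.944}

# Relative improvement = (baseline - fine-tuned) / baseline, in percent.
improvement = {k: 100 * (turbo[k] - belle[k]) / turbo[k] for k in turbo}
for name, pct in sorted(improvement.items(), key=lambda kv: kv[1]):
    print(f"{name}: {pct:.1f}%")
# The spread runs from about 24% (WenetSpeech test_net) to about 64% (AISHELL-1 test).
```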
# Belle-whisper-large-v3-zh
## Welcome

If you find this model helpful, please like it and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune.

Belle-whisper-large-v3-zh is a fine-tuned version of whisper-large-v3 with enhanced Chinese speech recognition capabilities. It demonstrates a 24-65% relative improvement over whisper-large-v3 on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST.

## Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------:|:-----------:|
| Belle-whisper-large-v3-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |

If you want to fine-tune the model on your own datasets, please refer to the GitHub repo https://github.com/shuaijiang/Whisper-Finetune.

## CER (%) ↓

| Model | Language Tag | AISHELL-1 test (↓) | AISHELL-2 test (↓) | WenetSpeech test_net (↓) | WenetSpeech test_meeting (↓) | HKUST dev (↓) |
|:----------------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v3 | Chinese | 8.085 | 5.475 | 11.72 | 20.15 | 28.597 |
| Belle-whisper-large-v2-zh | Chinese | 2.549 | 3.746 | 8.503 | 14.598 | 16.289 |
| Belle-whisper-large-v3-zh | Chinese | 2.781 | 3.786 | 8.865 | 11.246 | 16.440 |

It is worth noting that, compared to Belle-whisper-large-v2-zh, Belle-whisper-large-v3-zh improves significantly in complex acoustic scenes (such as WenetSpeech test_meeting).

Please cite our paper and GitHub repo when using our code, data, or model.
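A minimal inference sketch using the Hugging Face `transformers` pipeline (this usage pattern is an assumption on my part; see the Whisper-Finetune repo for the authors' supported workflow, and note that `audio.wav` is a hypothetical 16 kHz recording):

```python
def build_transcriber(model_id: str = "BELLE-2/Belle-whisper-large-v3-zh"):
    """Build an ASR pipeline that forces Chinese transcription."""
    from transformers import pipeline  # deferred: requires `transformers` and `torch`

    transcriber = pipeline("automatic-speech-recognition", model=model_id)
    # Force the decoder to transcribe Chinese instead of auto-detecting the language.
    transcriber.model.config.forced_decoder_ids = (
        transcriber.tokenizer.get_decoder_prompt_ids(language="zh", task="transcribe")
    )
    return transcriber

# Usage (downloads the model from the Hugging Face Hub):
#   transcription = build_transcriber()("audio.wav")  # hypothetical 16 kHz audio file
#   print(transcription["text"])
```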
# Belle-whisper-large-v3-zh-punct
## Welcome

If you find this model helpful, please like it and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune.

Belle-whisper-large-v3-zh-punct is a fine-tuned version of Belle-whisper-large-v3-zh that adds Chinese punctuation capabilities while maintaining comparable recognition performance: it performs on par with Belle-whisper-large-v3-zh on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST. The punctuation marks come from the punc_ct-transformer_cn-en-common-vocab471067-large model and were added to the training datasets.

## Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------:|:-----------:|
| Belle-whisper-large-v3-zh-punct | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | LoRA fine-tuning |

To incorporate punctuation marks without compromising recognition performance, LoRA fine-tuning was employed. If you want to fine-tune the model on your own datasets, please refer to the GitHub repo https://github.com/shuaijiang/Whisper-Finetune.

## CER (%) ↓

| Model | Language Tag | AISHELL-1 test (↓) | AISHELL-2 test (↓) | WenetSpeech test_net (↓) | WenetSpeech test_meeting (↓) | HKUST dev (↓) |
|:----------------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v3 | Chinese | 8.085 | 5.475 | 11.72 | 20.15 | 28.597 |
| Belle-whisper-large-v3-zh | Chinese | 2.781 | 3.786 | 8.865 | 11.246 | 16.440 |
| Belle-whisper-large-v3-zh-punct | Chinese | 2.945 | 3.808 | 8.998 | 10.973 | 17.196 |

It is worth noting that, compared to Belle-whisper-large-v3-zh, Belle-whisper-large-v3-zh-punct even improves slightly in complex acoustic scenes (such as WenetSpeech test_meeting). The punctuation marks in its output are removed before computing the CER.

Please cite our paper and GitHub repo when using our code, data, or model.
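The scoring convention above (punctuation is stripped before computing CER) can be sketched as a small routine. This is a minimal illustration of the metric, not the authors' evaluation script:

```python
import unicodedata

def strip_punct(text: str) -> str:
    """Remove punctuation (ASCII and CJK) before scoring."""
    return "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance / reference length, punctuation removed."""
    ref, hyp = strip_punct(ref), strip_punct(hyp)
    # Standard dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution
        prev = cur
    return prev[-1] / len(ref)

print(cer("今天天气很好。", "今天天气真好"))  # 1 substitution over 6 characters, ~0.167
```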
# Belle-whisper-large-v2-zh
## Welcome

If you find this model helpful, please like it and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune.

Belle-whisper-large-v2-zh is a fine-tuned version of whisper-large-v2 with enhanced Chinese speech recognition capabilities. It demonstrates a 30-70% relative improvement over whisper-large-v2 on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST.

## Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------:|:-----------:|
| Belle-whisper-large-v2-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |

If you want to fine-tune the model on your own datasets, please refer to the GitHub repo https://github.com/shuaijiang/Whisper-Finetune.

## CER (%) ↓

| Model | Language Tag | AISHELL-1 test (↓) | AISHELL-2 test (↓) | WenetSpeech test_net (↓) | WenetSpeech test_meeting (↓) | HKUST dev (↓) |
|:----------------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v2 | Chinese | 8.818 | 6.183 | 12.343 | 26.413 | 31.917 |
| Belle-whisper-large-v2-zh | Chinese | 2.549 | 3.746 | 8.503 | 14.598 | 16.289 |

Please cite our paper and GitHub repo when using our code, data, or model.
# Belle-distilwhisper-large-v2-zh
## Welcome

If you find this model helpful, please like it and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune.

Belle-distilwhisper-large-v2-zh is a fine-tuned version of distilwhisper-large-v2 with enhanced Chinese speech recognition capabilities. Like distilwhisper-large-v2, it is 5.8 times faster than whisper-large-v2 and has 51% fewer parameters. Despite the smaller size, it achieves a relative improvement of -3% to 35% over whisper-large-v2 on Chinese ASR benchmarks. Note that the original distilwhisper-large-v2 cannot transcribe Chinese (it only outputs English).

## Fine-tuning

| Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or PEFT) |
|:----------------:|:-------:|:----------------------------------------:|:-----------:|
| Belle-distilwhisper-large-v2-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |

If you want to fine-tune the model on your own datasets, please refer to the GitHub repo https://github.com/shuaijiang/Whisper-Finetune.

## CER (%) ↓

| Model | Parameters (M) | Language Tag | AISHELL-1 test (↓) | AISHELL-2 test (↓) | WenetSpeech test_net (↓) | WenetSpeech test_meeting (↓) | HKUST dev (↓) |
|:----------------:|:-------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|:-------:|
| whisper-large-v2 | 1550 | Chinese | 8.818 | 6.183 | 12.343 | 26.413 | 31.917 |
| distilwhisper-large-v2 | 756 | Chinese | - | - | - | - | - |
| Belle-distilwhisper-large-v2-zh | 756 | Chinese | 5.958 | 6.477 | 12.786 | 17.039 | 20.771 |

Please cite our paper and GitHub repo when using our code, data, or model.
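The 51% parameter reduction quoted above follows directly from the parameter counts in the table:

```python
# Parameter counts in millions, taken from the CER table above.
whisper_large_v2_params = 1550
belle_distil_params = 756

# Relative reduction = (baseline - distilled) / baseline, in percent.
reduction = 100 * (whisper_large_v2_params - belle_distil_params) / whisper_large_v2_params
print(f"{reduction:.0f}% fewer parameters")  # → "51% fewer parameters"
```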