anuragshas
wav2vec2-large-xlsr-53-telugu
# Wav2Vec2-Large-XLSR-53-Telugu

Fine-tuned facebook/wav2vec2-large-xlsr-53 on Telugu using the OpenSLR SLR66 dataset. When using this model, make sure that your speech input is sampled at 16kHz.

## Usage

The model can be used directly (without a language model); a minimal inference sketch is given after the training notes below.

**Test Result**: 44.98%

## Training

70% of the OpenSLR Telugu dataset was used for training. The Training Data Preparation notebook can be found here.
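A minimal sketch of the direct (no language model) usage described above, using the standard transformers CTC API. The repository id is taken from this page; the file name `sample.wav` and the resampling step are placeholders to adapt to your own recording:

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("anuragshas/wav2vec2-large-xlsr-53-telugu")
model = Wav2Vec2ForCTC.from_pretrained("anuragshas/wav2vec2-large-xlsr-53-telugu")

# Placeholder clip; replace with your own mono recording
speech_array, sampling_rate = torchaudio.load("sample.wav")

# The model expects 16 kHz input, so resample from whatever rate the file has
resampler = torchaudio.transforms.Resample(sampling_rate, 16_000)
speech = resampler(speech_array).squeeze().numpy()

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)

# Greedy CTC decoding: pick the most likely token per frame, then collapse
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
print("Prediction:", processor.batch_decode(predicted_ids))
```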
whisper-large-v2-bn
whisper-small-bn
wav2vec2-large-xls-r-300m-bg
# Whisper Large V2 Ka
This model is a fine-tuned version of openai/whisper-large-v2 on the mozilla-foundation/common_voice_11_0 ka dataset. It achieves the following results on the evaluation set:
- Loss: 0.1187
- Wer: 31.8548

## Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to training arguments follows at the end of this card):
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1000

## Training results

| Training Loss | Epoch | Step | Validation Loss | Wer     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.0413        | 2.06  | 200  | 0.0712          | 36.6296 |
| 0.006         | 5.04  | 400  | 0.0899          | 33.7467 |
| 0.0008        | 8.02  | 600  | 0.1039          | 32.2311 |
| 0.0002        | 11.01 | 800  | 0.1141          | 31.9290 |
| 0.0001        | 13.06 | 1000 | 0.1187          | 31.8548 |

## Framework versions

- Transformers 4.26.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.7.1.dev0
- Tokenizers 0.13.2
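As a reference point, the hyperparameters listed above map onto transformers' Seq2SeqTrainingArguments roughly as sketched here. The output_dir is a hypothetical placeholder, and the evaluation cadence is inferred from the 200-step intervals in the results table:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-ka",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1000,
    seed=42,
    evaluation_strategy="steps",
    eval_steps=200,  # matches the 200-step cadence of the results table
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default
# optimizer, so it needs no explicit configuration here.
```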