# r-f/wav2vec-english-speech-emotion-recognition

## Speech Emotion Recognition by Fine-Tuning Wav2Vec 2.0

The model is a fine-tuned version of jonatasgrosman/wav2vec2-large-xlsr-53-english for a Speech Emotion Recognition (SER) task.

Several datasets were used to fine-tune the original model:

- Surrey Audio-Visual Expressed Emotion (SAVEE): 480 audio files from 4 male actors
- Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): 1440 audio files from 24 professional actors (12 female, 12 male)
- Toronto Emotional Speech Set (TESS): 2800 audio files from 2 female actors

7 labels/emotions were used as classification labels.

It achieves the following results on the evaluation set:

- Loss: 0.104075
- Accuracy: 0.97463

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- eval_steps: 500
- seed: 42
- gradient_accumulation_steps: 2
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- num_epochs: 4
- max_steps: 7500
- save_steps: 1500

### Training results

| Step | Training Loss | Validation Loss | Accuracy |
| ---- | ------------- | --------------- | -------- |
| 500  | 1.8124 | 1.365212 | 0.486258 |
| 1000 | 0.8872 | 0.773145 | 0.79704  |
| 1500 | 0.7035 | 0.574954 | 0.852008 |
| 2000 | 0.6879 | 1.286738 | 0.775899 |
| 2500 | 0.6498 | 0.697455 | 0.832981 |
| 3000 | 0.5696 | 0.33724  | 0.892178 |
| 3500 | 0.4218 | 0.307072 | 0.911205 |
| 4000 | 0.3088 | 0.374443 | 0.930233 |
| 4500 | 0.2688 | 0.260444 | 0.936575 |
| 5000 | 0.2973 | 0.302985 | 0.92389  |
| 5500 | 0.1765 | 0.165439 | 0.961945 |
| 6000 | 0.1475 | 0.170199 | 0.961945 |
| 6500 | 0.1274 | 0.15531  | 0.966173 |
| 7000 | 0.0699 | 0.103882 | 0.976744 |
| 7500 | 0.083  | 0.104075 | 0.97463  |
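The hyperparameters and dataset sizes above imply a concrete training schedule. The sketch below is a quick sanity check of that arithmetic; it is illustrative only and not part of the original card. (In the Hugging Face Trainer, a positive `max_steps` takes precedence over `num_train_epochs`, so the 7500-step budget is what actually determines the run length.)

```python
# Sanity check of the training schedule implied by the hyperparameters above.
# Illustrative arithmetic only; the train/eval split is not specified in the card.
train_batch_size = 4
gradient_accumulation_steps = 2
max_steps = 7500

# One optimizer step consumes train_batch_size * gradient_accumulation_steps samples.
effective_batch_size = train_batch_size * gradient_accumulation_steps
samples_seen = effective_batch_size * max_steps

# Combined size of the three datasets listed above.
total_files = 480 + 1440 + 2800  # SAVEE + RAVDESS + TESS

print(f"effective batch size: {effective_batch_size}")  # 8
print(f"samples seen over max_steps: {samples_seen}")   # 60000
print(f"total audio files: {total_files}")              # 4720
```

At roughly 60,000 samples over a 4,720-file pool (before any train/eval split), the 7500-step budget amounts to about 12-13 passes over the data, which is why `max_steps` rather than `num_epochs: 4` governs here.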
# ModernBERT-large-zeroshot-v1

This model is a fine-tuned ModernBERT-large for Natural Language Inference (NLI). It was trained on the MoritzLaurer/synthetic_zeroshot_mixtral_v0.1 dataset and is designed to carry out zero-shot classification.

- Model Type: ModernBERT-large (BERT variant)
- Task: Zero-shot Classification
- Languages: English
- Dataset: MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
- Fine-Tuning: Fine-tuned for zero-shot classification

Evaluation metrics:

- Training Loss: measures the model's fit to the training data.
- Validation Loss: measures the model's generalization to unseen data.
- Accuracy: the percentage of correct predictions over all examples.
- F1 Score: a balanced metric between precision and recall.

Model details:

- Model Name: ModernBERT-large-zeroshot-v1
- Hugging Face Repo: r-f/ModernBERT-large-zeroshot-v1
- License: MIT (or another applicable license)
- Date: 23-12-2024

Training configuration:

- Model: ModernBERT (large variant)
- Framework: PyTorch
- Batch Size: 32
- Learning Rate: 2e-5
- Optimizer: AdamW
- Hardware: RTX 4090

Acknowledgements:

- The model was trained on the MoritzLaurer/synthetic_zeroshot_mixtral_v0.1 dataset, and the training script was adapted from MoritzLaurer/zeroshot-classifier.
- Special thanks to the Hugging Face community and all contributors to the transformers library.

This model is licensed under the MIT License. See the LICENSE file for more details.
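NLI-based zero-shot classification works by turning each candidate label into a hypothesis (e.g. "This example is about politics.") and picking the label whose hypothesis the model most strongly entails. The sketch below illustrates only that selection logic; the entailment scores and the hypothesis template are placeholder assumptions, not outputs of this model.

```python
# Minimal illustration of NLI-based zero-shot classification.
# The entailment scores here are made-up placeholders; in practice each score
# comes from running the fine-tuned model on a (premise, hypothesis) pair.
def zero_shot_classify(entailment_scores):
    """Return the candidate label whose hypothesis is most strongly entailed."""
    return max(entailment_scores, key=entailment_scores.get)

premise = "Angela Merkel is a politician in Germany and leader of the CDU."
candidate_labels = ["politics", "economy", "entertainment"]

# One hypothesis per candidate label, using a common zero-shot template.
hypotheses = [f"This example is about {label}." for label in candidate_labels]

placeholder_scores = {"politics": 0.97, "economy": 0.02, "entertainment": 0.01}
print(zero_shot_classify(placeholder_scores))  # politics
```

With transformers installed, the same flow is handled end-to-end by `pipeline("zero-shot-classification", model="r-f/ModernBERT-large-zeroshot-v1")`, which builds the hypotheses, runs the NLI head, and returns per-label scores.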