CuriousMonkey7


HumAware-VAD

WORK IN PROGRESS [WIP]

HumAware-VAD: Humming-Aware Voice Activity Detection

📌 Overview

HumAware-VAD is a fine-tuned version of the Silero-VAD model, trained to distinguish humming from actual speech. Standard Voice Activity Detection (VAD) models, including Silero-VAD, often misclassify humming as speech, leading to inaccurate speech segmentation. HumAware-VAD improves upon this by leveraging a custom dataset (HumSpeechBlend) to enhance speech detection accuracy in the presence of humming.

🎯 Purpose

The primary goal of HumAware-VAD is to:
- Reduce false positives where humming is mistakenly detected as speech.
- Enhance speech segmentation accuracy in real-world applications.
- Improve VAD performance for tasks involving music, background noise, and vocal sounds.

🗂️ Model Details

- Base Model: Silero-VAD
- Fine-tuning Dataset: HumSpeechBlend
- Format: JIT (TorchScript)
- Framework: PyTorch
- Inference Speed: Real-time

📄 Citation

If you use this model, please cite it accordingly.
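To illustrate why humming false positives hurt speech segmentation, here is a minimal sketch of how per-frame speech probabilities are typically turned into segments. It is not the model's actual post-processing: the function name `probs_to_segments`, the 0.5 threshold, the minimum segment length, and both probability lists are assumptions made up for this example.

```python
from typing import List, Sequence, Tuple


def probs_to_segments(
    probs: Sequence[float],
    threshold: float = 0.5,      # assumed decision threshold
    min_speech_frames: int = 2,  # assumed minimum segment length, in frames
) -> List[Tuple[int, int]]:
    """Group consecutive frames scored >= threshold into (start, end) frame
    ranges, with `end` exclusive. Hypothetical helper for illustration only."""
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a speech run begins
        elif p < threshold and start is not None:
            if i - start >= min_speech_frames:
                segments.append((start, i))  # close a long-enough run
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(probs) - start >= min_speech_frames:
        segments.append((start, len(probs)))
    return segments


# Invented scores: frames 1-3 contain speech, frames 5-7 contain humming.
baseline_probs = [0.1, 0.9, 0.9, 0.8, 0.1, 0.7, 0.8, 0.9, 0.1]  # humming scored as speech
humaware_probs = [0.1, 0.9, 0.9, 0.8, 0.1, 0.2, 0.1, 0.2, 0.1]  # humming suppressed

print(probs_to_segments(baseline_probs))  # [(1, 4), (5, 8)]
print(probs_to_segments(humaware_probs))  # [(1, 4)]
```

In this toy input, the scores that rate humming highly produce a second, spurious segment, while the scores that suppress humming leave only the real speech segment.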

license:mit
