LLMYourWay

jordand

3 models

whisper-d-v1a

WhisperD is a fine-tuned version of whisper-large-v2 that can transcribe multi-speaker, conversational speech. It was used to generate synthetic transcriptions for training Parakeet. Diarization is performed implicitly by the model: "[S1]", "[S2]", etc. denote speaker identity. WhisperD is often able to transcribe non-speech events, e.g. "(coughs)" or "(laughs)", and its outputs include disfluencies. More details can be found in the WhisperD blog post.

Caution: this model has only been tested on segments up to 30 seconds in length. It may be unable to handle conditioning on previous text, as this was not included during fine-tuning. If a pipeline or codebase relies on that feature to transcribe audio longer than 30 seconds, generation quality may be poor.

license:mit
345
36
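Because WhisperD marks speaker turns inline with "[S1]"-style tags, its transcripts are easy to post-process. A minimal sketch of splitting such a transcript into per-speaker turns — the `parse_whisperd` helper and the sample transcript below are illustrative assumptions, not part of WhisperD's own tooling:

```python
import re

def parse_whisperd(text):
    """Split a WhisperD-style transcript into (speaker, utterance) pairs.

    Speaker tags like "[S1]" and "[S2]" delimit turns; non-speech
    events such as "(laughs)" stay attached to the surrounding text.
    """
    # Split on speaker tags, keeping the tags themselves as delimiters.
    parts = re.split(r"(\[S\d+\])", text)
    turns = []
    speaker = None
    for part in parts:
        tag = re.fullmatch(r"\[S(\d+)\]", part)
        if tag:
            # A tag updates the current speaker for subsequent text.
            speaker = f"S{tag.group(1)}"
        elif part.strip() and speaker:
            turns.append((speaker, part.strip()))
    return turns

turns = parse_whisperd("[S1] Hello there. (laughs) [S2] Hi! [S1] How are you?")
# → [('S1', 'Hello there. (laughs)'), ('S2', 'Hi!'), ('S1', 'How are you?')]
```

A chunker that feeds the model segments no longer than 30 seconds would pair naturally with this, given the caution above.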

fish-s1-dac-min

license:cc-by-nc-sa-4.0
5
2

echo-tts-no-speaker

license:cc-by-nc-sa-4.0
2
3
© 2026 LLMYourWay. All rights reserved.