LLMYourWay

gpt-omni

2 models

mini-omni2

license:mit

Mini Omni

Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming

Mini-Omni is an open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

✅ Real-time speech-to-speech conversation; no extra ASR or TTS models required.
✅ Talking while thinking, with the ability to generate text and audio at the same time.
✅ "Audio-to-Text" and "Audio-to-Audio" batch inference to further boost performance.

NOTE: please refer to the code repository for more details.

Setup: create a new conda environment and install the required packages.

NOTE: you need to run Streamlit locally with PyAudio installed.
NOTE: you need to unmute first. Gradio does not seem able to play the audio stream instantly, so the latency feels a bit longer.

Acknowledgements:
- Qwen2 as the LLM backbone.
- litGPT for training and inference.
- whisper for audio encoding.
- snac for audio decoding.
- CosyVoice for generating synthetic speech.
- OpenOrca and MOSS for alignment.
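The setup step above ("create a new conda environment and install the required packages") might look like the following sketch. The environment name, Python version, and requirements file name are assumptions, not taken from the project; check the mini-omni repository for the exact commands.

```shell
# Hedged sketch — "mini-omni", python=3.10, and requirements.txt
# are assumptions; see the code repository for the actual steps.
conda create -n mini-omni python=3.10 -y
conda activate mini-omni

# Run from the root of the cloned repository:
pip install -r requirements.txt

# PyAudio is needed to run the Streamlit demo locally
# (microphone input for real-time speech conversation).
pip install pyaudio
```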

license:mit
12,000+ AI Models Tracked & Updated Daily
© 2026 LLMYourWay. All rights reserved.
Data updated every 4 hours