lex-au

21 models

Orpheus-3b-FT-Q4_K_M.gguf

This is a quantised version of canopylabs/orpheus-3b-0.1-ft. Orpheus is a high-performance Text-to-Speech model fine-tuned for natural, emotional speech synthesis. This repository hosts the 4-bit quantised version of the 3B parameter model, optimised for efficiency while maintaining high-quality output.

Orpheus-3b-FT-Q4_K_M is a 3 billion parameter Text-to-Speech model that converts text input into natural-sounding speech, with support for multiple voices and emotional expressions. The model has been quantised to 4-bit (Q4_K_M) format for efficient inference, making it accessible on consumer hardware.

Key features:
- 8 distinct voice options with different characteristics
- Support for emotion tags such as laughter and sighs
- Optimised for CUDA acceleration on RTX GPUs
- Produces high-quality 24kHz mono audio
- Fine-tuned for conversational naturalness

This model is designed to be used with an LLM inference server that connects to the Orpheus-FastAPI frontend, which provides both a web UI and OpenAI-compatible API endpoints. The quantised model can be loaded into any of these inference servers:
- GPUStack - GPU-optimised LLM inference server (my pick); supports LAN/WAN tensor-split parallelisation
- LM Studio - load the GGUF model and start the local server
- llama.cpp server - run with the appropriate model parameters
- Any other OpenAI-API-compatible server

Setup:
1. Download this quantised model from lex-au's Orpheus-FastAPI collection.
2. Load the model in your preferred inference server and start the server.
3. Configure the FastAPI server to connect to your inference server by setting the `ORPHEUS_API_URL` environment variable.
4. Follow the complete installation and setup instructions in the repository README.
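The setup steps above can be sketched from the client side. In this minimal sketch, the payload field names, default URL, and port are illustrative assumptions, not values documented in the repository README:

```python
import os

def build_speech_request(text: str, voice: str = "tara") -> dict:
    """Assemble an OpenAI-style text-to-speech payload (field names are illustrative)."""
    return {
        "model": "Orpheus-3b-FT-Q4_K_M",
        "input": text,
        "voice": voice,
        "response_format": "wav",  # the model produces 24kHz mono WAV audio
    }

# The FastAPI frontend locates the inference server via the ORPHEUS_API_URL
# environment variable; the fallback URL here is a placeholder, not the
# project's documented default.
api_url = os.environ.get("ORPHEUS_API_URL", "http://127.0.0.1:5005/v1/completions")
payload = build_speech_request("Hello from Orpheus!", voice="leo")
```

From here, a client would POST `payload` to the frontend; consult the repository README for the actual endpoint paths.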
Listen to the model in action with different voices and emotions. The model supports 8 different voices:
- `tara`: Female, conversational, clear
- `leah`: Female, warm, gentle
- `jess`: Female, energetic, youthful
- `leo`: Male, authoritative, deep
- `dan`: Male, friendly, casual
- `mia`: Female, professional, articulate
- `zac`: Male, enthusiastic, dynamic
- `zoe`: Female, calm, soothing

You can add expressiveness to speech by inserting tags:
- `<laugh>`, `<chuckle>`: for laughter sounds
- `<sigh>`: for sighing sounds
- `<cough>`, `<sniffle>`: for subtle interruptions
- `<groan>`, `<yawn>`, `<gasp>`: for additional emotional expression

Technical specifications:
- Architecture: specialised token-to-audio sequence model
- Parameters: ~3 billion
- Quantisation: 4-bit (GGUF Q4_K_M format)
- Audio sample rate: 24kHz
- Input: text with optional voice selection and emotion tags
- Output: high-quality WAV audio
- Language: English
- Hardware requirements: CUDA-compatible GPU (recommended: RTX series)
- Integration method: external LLM inference server + Orpheus-FastAPI frontend

Limitations:
- Currently supports English text only
- Best performance achieved on CUDA-compatible GPUs
- Generation speed depends on GPU capability

This model is available under the Apache License 2.0. The original Orpheus model was created by Canopy Labs. This repository contains a quantised version optimised for use with the Orpheus-FastAPI server. If you use this quantised model in your research or applications, please cite:
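When assembling prompts programmatically, a small helper guards against typoed voice names or emotion tags. The tag vocabulary below follows the original Orpheus release documentation and should be treated as illustrative, not as an official API:

```python
# Voice and emotion-tag vocabularies for the English fine-tune; illustrative only.
VOICES = {"tara", "leah", "jess", "leo", "dan", "mia", "zac", "zoe"}
EMOTION_TAGS = {"laugh", "chuckle", "sigh", "cough", "sniffle", "groan", "yawn", "gasp"}

def tag(text: str, emotion: str) -> str:
    """Append an inline emotion tag such as <sigh> to a chunk of text."""
    if emotion not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion tag: {emotion}")
    return f"{text} <{emotion}>"

line = tag("Well, that took forever", "sigh")  # "Well, that took forever <sigh>"
```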

license:apache-2.0 • 1,150 downloads • 6 likes

Orpheus-3b-FT-Q8_0.gguf

This is a quantised version of canopylabs/orpheus-3b-0.1-ft. Orpheus is a high-performance Text-to-Speech model fine-tuned for natural, emotional speech synthesis. This repository hosts the 8-bit quantised version of the 3B parameter model, optimised for efficiency while maintaining high-quality output.

Orpheus-3b-FT-Q8_0 is a 3 billion parameter Text-to-Speech model that converts text input into natural-sounding speech, with support for multiple voices and emotional expressions. The model has been quantised to 8-bit (Q8_0) format for efficient inference, making it accessible on consumer hardware.

Key features:
- 8 distinct voice options with different characteristics
- Support for emotion tags such as laughter and sighs
- Optimised for CUDA acceleration on RTX GPUs
- Produces high-quality 24kHz mono audio
- Fine-tuned for conversational naturalness

This model is designed to be used with an LLM inference server that connects to the Orpheus-FastAPI frontend, which provides both a web UI and OpenAI-compatible API endpoints. The quantised model can be loaded into any of these inference servers:
- GPUStack - GPU-optimised LLM inference server (my pick); supports LAN/WAN tensor-split parallelisation
- LM Studio - load the GGUF model and start the local server
- llama.cpp server - run with the appropriate model parameters
- Any other OpenAI-API-compatible server

Setup:
1. Download this quantised model from lex-au's Orpheus-FastAPI collection.
2. Load the model in your preferred inference server and start the server.
3. Configure the FastAPI server to connect to your inference server by setting the `ORPHEUS_API_URL` environment variable.
4. Follow the complete installation and setup instructions in the repository README.
Listen to the model in action with different voices and emotions. The model supports 8 different voices:
- `tara`: Female, conversational, clear
- `leah`: Female, warm, gentle
- `jess`: Female, energetic, youthful
- `leo`: Male, authoritative, deep
- `dan`: Male, friendly, casual
- `mia`: Female, professional, articulate
- `zac`: Male, enthusiastic, dynamic
- `zoe`: Female, calm, soothing

You can add expressiveness to speech by inserting tags:
- `<laugh>`, `<chuckle>`: for laughter sounds
- `<sigh>`: for sighing sounds
- `<cough>`, `<sniffle>`: for subtle interruptions
- `<groan>`, `<yawn>`, `<gasp>`: for additional emotional expression

Technical specifications:
- Architecture: specialised token-to-audio sequence model
- Parameters: ~3 billion
- Quantisation: 8-bit (GGUF Q8_0 format)
- Audio sample rate: 24kHz
- Input: text with optional voice selection and emotion tags
- Output: high-quality WAV audio
- Language: English
- Hardware requirements: CUDA-compatible GPU (recommended: RTX series)
- Integration method: external LLM inference server + Orpheus-FastAPI frontend

Limitations:
- Currently supports English text only
- Best performance achieved on CUDA-compatible GPUs
- Generation speed depends on GPU capability

This model is available under the Apache License 2.0. The original Orpheus model was created by Canopy Labs. This repository contains a quantised version optimised for use with the Orpheus-FastAPI server. If you use this quantised model in your research or applications, please cite:
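The practical difference between the Q8_0 and Q4_K_M builds of this model is file size versus fidelity: Q8_0 stores roughly 8.5 bits per weight (8-bit values plus per-block scales), while Q4_K_M averages closer to 4.85. A back-of-the-envelope size estimate, assuming those typical llama.cpp bit-widths and ignoring GGUF metadata overhead:

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model-file size: parameters x bits per weight, ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e9

q8_size = estimate_gguf_size_gb(3e9, 8.5)   # ~3.2 GB for the Q8_0 build
q4_size = estimate_gguf_size_gb(3e9, 4.85)  # ~1.8 GB for the Q4_K_M build
```

The bits-per-weight figures are approximations; actual GGUF files also embed tokenizer and metadata blocks, so real sizes run slightly larger.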

license:apache-2.0 • 1,136 downloads • 28 likes

Orpheus-3b-German-FT-Q8_0.gguf

This is a quantised version of canopylabs/3b-de-ft-researchrelease. Orpheus is a high-performance Text-to-Speech model fine-tuned for natural, emotional speech synthesis. This repository hosts the 8-bit quantised version of the 3B parameter model, optimised for efficiency while maintaining high-quality output.

Orpheus-3b-German-FT-Q8_0 is a 3 billion parameter Text-to-Speech model that converts text input into natural-sounding speech, with support for multiple voices and emotional expressions. The model has been quantised to 8-bit (Q8_0) format for efficient inference, making it accessible on consumer hardware.

Key features:
- 3 distinct voice options with different characteristics
- Support for emotion tags such as laughter and sighs
- Optimised for CUDA acceleration on RTX GPUs
- Produces high-quality 24kHz mono audio
- Fine-tuned for conversational naturalness

This model is designed to be used with an LLM inference server that connects to the Orpheus-FastAPI frontend, which provides both a web UI and OpenAI-compatible API endpoints. The quantised model can be loaded into any of these inference servers:
- GPUStack - GPU-optimised LLM inference server (my pick); supports LAN/WAN tensor-split parallelisation
- LM Studio - load the GGUF model and start the local server
- llama.cpp server - run with the appropriate model parameters
- Any other OpenAI-API-compatible server

Setup:
1. Download this quantised model from lex-au's Orpheus-FastAPI collection.
2. Load the model in your preferred inference server and start the server.
3. Configure the FastAPI server to connect to your inference server by setting the `ORPHEUS_API_URL` environment variable.
4. Follow the complete installation and setup instructions in the repository README.
The model supports 3 different voices:
- `Jana`: Female, German, clear
- `Thomas`: Male, German, authoritative
- `Max`: Male, German, energetic

You can add expressiveness to speech by inserting tags:
- `<laugh>`, `<chuckle>`: for laughter sounds
- `<sigh>`: for sighing sounds
- `<cough>`, `<sniffle>`: for subtle interruptions
- `<groan>`, `<yawn>`, `<gasp>`: for additional emotional expression

Technical specifications:
- Architecture: specialised token-to-audio sequence model
- Parameters: ~3 billion
- Quantisation: 8-bit (GGUF Q8_0 format)
- Audio sample rate: 24kHz
- Input: text with optional voice selection and emotion tags
- Output: high-quality WAV audio
- Language: German
- Hardware requirements: CUDA-compatible GPU (recommended: RTX series)
- Integration method: external LLM inference server + Orpheus-FastAPI frontend

Limitations:
- Best performance achieved on CUDA-compatible GPUs
- Generation speed depends on GPU capability

This model is available under the Apache License 2.0. The original Orpheus model was created by Canopy Labs. This repository contains a quantised version optimised for use with the Orpheus-FastAPI server. If you use this quantised model in your research or applications, please cite:

license:apache-2.0 • 295 downloads • 9 likes

Orpheus-3b-FT-Q2_K.gguf

license:apache-2.0 • 166 downloads • 6 likes

Orpheus-3b-Italian_Spanish-FT-Q8_0.gguf

license:apache-2.0 • 127 downloads • 2 likes

Orpheus-3b-French-FT-Q8_0.gguf

This is a quantised version of canopylabs/3b-fr-ft-researchrelease. Orpheus is a high-performance Text-to-Speech model fine-tuned for natural, emotional speech synthesis. This repository hosts the 8-bit quantised version of the 3B parameter model, optimised for efficiency while maintaining high-quality output.

Orpheus-3b-French-FT-Q8_0 is a 3 billion parameter Text-to-Speech model that converts text input into natural-sounding speech, with support for multiple voices and emotional expressions. The model has been quantised to 8-bit (Q8_0) format for efficient inference, making it accessible on consumer hardware.

Key features:
- 3 distinct voice options with different characteristics
- Support for emotion tags such as laughter and sighs
- Optimised for CUDA acceleration on RTX GPUs
- Produces high-quality 24kHz mono audio
- Fine-tuned for conversational naturalness

This model is designed to be used with an LLM inference server that connects to the Orpheus-FastAPI frontend, which provides both a web UI and OpenAI-compatible API endpoints. The quantised model can be loaded into any of these inference servers:
- GPUStack - GPU-optimised LLM inference server (my pick); supports LAN/WAN tensor-split parallelisation
- LM Studio - load the GGUF model and start the local server
- llama.cpp server - run with the appropriate model parameters
- Any other OpenAI-API-compatible server

Setup:
1. Download this quantised model from lex-au's Orpheus-FastAPI collection.
2. Load the model in your preferred inference server and start the server.
3. Configure the FastAPI server to connect to your inference server by setting the `ORPHEUS_API_URL` environment variable.
4. Follow the complete installation and setup instructions in the repository README.
The model supports 3 different voices:
- `Pierre`: Male, French, sophisticated
- `Amelie`: Female, French, elegant
- `Marie`: Female, French, spirited

You can add expressiveness to speech by inserting tags:
- `<laugh>`, `<chuckle>`: for laughter sounds
- `<sigh>`: for sighing sounds
- `<cough>`, `<sniffle>`: for subtle interruptions
- `<groan>`, `<yawn>`, `<gasp>`: for additional emotional expression

Technical specifications:
- Architecture: specialised token-to-audio sequence model
- Parameters: ~3 billion
- Quantisation: 8-bit (GGUF Q8_0 format)
- Audio sample rate: 24kHz
- Input: text with optional voice selection and emotion tags
- Output: high-quality WAV audio
- Language: French
- Hardware requirements: CUDA-compatible GPU (recommended: RTX series)
- Integration method: external LLM inference server + Orpheus-FastAPI frontend

Limitations:
- Best performance achieved on CUDA-compatible GPUs
- Generation speed depends on GPU capability

This model is available under the Apache License 2.0. The original Orpheus model was created by Canopy Labs. This repository contains a quantised version optimised for use with the Orpheus-FastAPI server. If you use this quantised model in your research or applications, please cite:

license:apache-2.0 • 107 downloads • 3 likes

Orpheus-3b-Korean-FT-Q8_0.gguf

license:apache-2.0 • 106 downloads • 2 likes

Orpheus-3b-Chinese-FT-Q8_0.gguf

This is a quantised version of canopylabs/3b-zh-ft-researchrelease. Orpheus is a high-performance Text-to-Speech model fine-tuned for natural, emotional speech synthesis. This repository hosts the 8-bit quantised version of the 3B parameter model, optimised for efficiency while maintaining high-quality output.

Orpheus-3b-Chinese-FT-Q8_0 is a 3 billion parameter Text-to-Speech model that converts text input into natural-sounding speech, with support for multiple voices and emotional expressions. The model has been quantised to 8-bit (Q8_0) format for efficient inference, making it accessible on consumer hardware.

Key features:
- 2 distinct voice options with different characteristics
- Support for emotion tags such as laughter and sighs
- Optimised for CUDA acceleration on RTX GPUs
- Produces high-quality 24kHz mono audio
- Fine-tuned for conversational naturalness

This model is designed to be used with an LLM inference server that connects to the Orpheus-FastAPI frontend, which provides both a web UI and OpenAI-compatible API endpoints. The quantised model can be loaded into any of these inference servers:
- GPUStack - GPU-optimised LLM inference server (my pick); supports LAN/WAN tensor-split parallelisation
- LM Studio - load the GGUF model and start the local server
- llama.cpp server - run with the appropriate model parameters
- Any other OpenAI-API-compatible server

Setup:
1. Download this quantised model from lex-au's Orpheus-FastAPI collection.
2. Load the model in your preferred inference server and start the server.
3. Configure the FastAPI server to connect to your inference server by setting the `ORPHEUS_API_URL` environment variable.
4. Follow the complete installation and setup instructions in the repository README.
The model supports 2 different voices:
- `长乐`: Female, Mandarin, gentle
- `白芷`: Female, Mandarin, clear

You can add expressiveness to speech by inserting tags:
- `<laugh>`, `<chuckle>`: for laughter sounds
- `<sigh>`: for sighing sounds
- `<cough>`, `<sniffle>`: for subtle interruptions
- `<groan>`, `<yawn>`, `<gasp>`: for additional emotional expression

Technical specifications:
- Architecture: specialised token-to-audio sequence model
- Parameters: ~3 billion
- Quantisation: 8-bit (GGUF Q8_0 format)
- Audio sample rate: 24kHz
- Input: text with optional voice selection and emotion tags
- Output: high-quality WAV audio
- Language: Mandarin
- Hardware requirements: CUDA-compatible GPU (recommended: RTX series)
- Integration method: external LLM inference server + Orpheus-FastAPI frontend

Limitations:
- Best performance achieved on CUDA-compatible GPUs
- Generation speed depends on GPU capability

This model is available under the Apache License 2.0. The original Orpheus model was created by Canopy Labs. This repository contains a quantised version optimised for use with the Orpheus-FastAPI server. If you use this quantised model in your research or applications, please cite:

license:apache-2.0 • 74 downloads • 3 likes

Google.Gemma-3-4b-it-GGUF

72 downloads • 1 like

Vocalis-Q4_K_M.gguf

base_model:meta-llama/Meta-Llama-3-8B-Instruct • 48 downloads • 3 likes

Orpheus-3b-Hindi-FT-Q8_0.gguf

license:apache-2.0 • 43 downloads • 2 likes

Orpheus-3b-Kaya-Q2_K.gguf

llama • 37 downloads • 0 likes

Orpheus-3b-Kaya-Q8_0.gguf

This is a fine-tuned version of the pretrained model canopylabs/orpheus-3b-0.1-pretrained, trained on a custom voice dataset and quantised to GGUF Q8_0 format for fast, efficient inference.

- Model type: Text-to-Speech (TTS)
- Architecture: token-to-audio language model
- Parameters: ~3 billion
- Quantisation: 8-bit GGUF (Q8_0)
- Sampling rate: 24kHz mono
- Training epochs: 1
- Training dataset: lex-au/Orpheus-3b-Kaya
- Language: English

This model is designed for use with Orpheus-FastAPI, an OpenAI-compatible inference server for text-to-speech generation.

Compatible inference servers; you can load this model into:
- GPUStack
- LM Studio
- llama.cpp
- Any other GGUF-compatible OpenAI-style server

License: Apache License 2.0, free for research and commercial use.

Credits:
- Original model by: Canopy Labs
- Fine-tuned, quantised, and API-wrapped by: lex-au via Unsloth and Hugging Face's TRL library
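Since the card specifies 24kHz mono output, the raw data rate of generated WAV audio is easy to derive, which helps when budgeting storage or streaming bandwidth. The 16-bit sample width below is an assumption; the card states only the sample rate and channel count:

```python
def wav_bytes_per_second(sample_rate: int = 24000, bits_per_sample: int = 16, channels: int = 1) -> int:
    """Raw PCM data rate in bytes/second for uncompressed WAV audio."""
    return sample_rate * bits_per_sample // 8 * channels

rate = wav_bytes_per_second()   # 48,000 bytes/s at 24kHz 16-bit mono
minute_mb = rate * 60 / 1e6     # about 2.9 MB per minute of generated speech
```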

llama • 30 downloads • 0 likes

Orpheus-3b-Kaya-FP16.gguf

llama • 28 downloads • 0 likes

Google.Gemma-3-27b-pt-GGUF

23 downloads • 0 likes

Google.Gemma-3-1b-it-GGUF

23 downloads • 0 likes

Orpheus-3b-Kaya-Q4_K_M.gguf

llama • 21 downloads • 0 likes

Orpheus-3b-Kaya-Q6_K.gguf

llama • 17 downloads • 0 likes

shuttle-3.5-Q8_0-GGUF

license:apache-2.0 • 16 downloads • 0 likes

Vocalis-FP16.gguf

base_model:meta-llama/Meta-Llama-3-8B-Instruct • 13 downloads • 0 likes

Vocalis-Q8_0.gguf

🧠 Model Card: LLaMA 3 8B Instruct – Conversational Roleplay Enhanced

Model name: `lex-au/Vocalis-Q8_0.gguf`
Base model: Meta LLaMA 3 8B Instruct
Fine-tuned by: Lex (lex-au)
Model type: causal decoder transformer
License: non-commercial (see Meta LLaMA 3 license)

`Vocalis-Q8_0.gguf` is a fine-tuned and quantised variant of Meta's LLaMA 3 8B Instruct model, purpose-built for immersive roleplaying and low-latency conversational AI in the Vocalis web app. The model tracks spatial and temporal context across turns, enabling natural, persistent narrative flow and highly dynamic character interactions. Designed for low-resource environments, it balances responsiveness with rich generative quality, making it ideal for real-time applications.

- Purpose: enable vivid storytelling and contextually grounded interactions for roleplay-based use cases
- Focus areas:
  - Spatial awareness (tracking objects, locations, movement)
  - Temporal sequencing (event continuity, time-of-day, memory within a thread)
  - Expressive language generation with adaptable tone
- Fine-tuning strategy: LoRA-based adaptation on multi-turn dialogues, fantasy/sci-fi roleplay transcripts, and narrative world-state tracking prompts
- Deployment goal: minimal latency and a low memory footprint, optimised for running inside Vocalis

Use cases:
- Text-based RPG narration
- Interactive fiction engines
- AI-driven NPCs for browser games
- Persistent-world simulation bots
- Web-deployed immersive chat agents

Strengths:
- Maintains spatial and narrative continuity across turns
- Generates richly descriptive, tone-sensitive prose
- Adapts easily to character tone and dynamic environments
- Ideal for first-person, second-person, or multi-speaker scenarios

Limitations:
- Context window constraints may cause memory loss over extended turns
- Can over-describe or hallucinate detail in ambiguous settings
- Not suited to task-based or factual Q&A applications
- No long-term memory or external knowledge integration

Example:

User: You enter the crumbling temple. Where's the light coming from?

Assistant: As you step into the ancient temple, a soft, ethereal glow emanates from within its walls. The light seems to seep through the weathered stone, casting an otherworldly illumination on the dusty artifacts and worn carvings that line the passageway. Upon closer inspection, you notice that the light is coming from a series of intricate, glowing symbols etched into the ceiling above. These mystical runes pulse with a gentle, blue-green luminescence, their soft radiance casting an enchanting atmosphere throughout the temple's interior. The air inside the temple feels heavy with age and mystery, yet the soothing light from the celestial symbols imbues the space with a sense of peace and ancient wisdom. As you venture deeper into the labyrinthine structure, the glow seems to grow stronger, guiding you towards secrets that have lain dormant for centuries within the sacred halls.

Technical details:
- Format: GGUF
- Quantisation: Q8_0
- Optimised for: real-time inference (Vocalis-compatible)
- Recommended frontends: OpenWebUI, KoboldCPP, LM Studio, or custom web clients (e.g., Flask/Vite apps)

> 🧪 This model was fine-tuned specifically for the Vocalis webapp project, a low-latency, voice-enabled AI assistant platform.

The base model used for fine-tuning is Meta's LLaMA 3 8B Instruct. For licensing, refer to Meta's license terms. This repository contains a quantised version optimised for low-latency use in the Vocalis real-time webapp. If you use this model in your research or application, please cite:
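Because Vocalis targets OpenAI-style frontends, multi-turn roleplay context is carried in the standard chat-messages array; the model's spatial and temporal tracking depends on earlier turns staying in that array. The helper below sketches how a client might accumulate turns. The message shape is the generic OpenAI chat format, not a Vocalis-specific API:

```python
def build_roleplay_messages(system_prompt: str, turns: list) -> list:
    """Fold (user, assistant) turn pairs into an OpenAI-style chat message list."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in turns:
        messages.append({"role": "user", "content": user_text})
        if assistant_text is not None:  # the latest turn has no reply yet
            messages.append({"role": "assistant", "content": assistant_text})
    return messages

history = build_roleplay_messages(
    "You are the narrator of a fantasy adventure; track locations and time of day.",
    [("You enter the crumbling temple. Where's the light coming from?", None)],
)
```

Keeping the full `history` in each request is what lets the model maintain narrative continuity until the context window fills, which is the memory-loss limitation noted above.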

base_model:meta-llama/Meta-Llama-3-8B-Instruct • 8 downloads • 0 likes