eryx-swahili-tts-v1

by Engeryx · License: apache-2.0 · Audio Model · 2B params · New · 0 downloads · Early-stage
Edge AI: Mobile · Laptop · Server (5GB+ RAM)
Quick Summary

Swahili text-to-speech model built on XTTS-v2. The repository ships fine-tuned Swahili speaker embeddings (`gpt_cond_latent` and `speaker_embedding`) that are loaded into an XTTS-v2 checkpoint to synthesize Swahili speech.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 2GB+ RAM

Code Examples

Usage (Python, PyTorch)
import torch
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts
from huggingface_hub import hf_hub_download

# Download speaker embeddings
embedding_path = hf_hub_download(
    repo_id="EryxLabs/eryx-swahili-tts-v1",
    filename="swahili_speaker.pt"
)

# Load XTTS-v2 model
model_path = "path/to/xtts_v2"  # or download from coqui
config = XttsConfig()
config.load_json(f"{model_path}/config.json")
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir=model_path, eval=True)

# Load Swahili speaker embeddings
embeddings = torch.load(embedding_path)
gpt_cond_latent = embeddings['gpt_cond_latent']
speaker_embedding = embeddings['speaker_embedding']

# Synthesize Swahili text
# Note: Use 'en' for language since XTTS-v2 doesn't support 'sw' directly
out = model.inference(
    text="Habari yako, mimi ni msaidizi wa Kiswahili.",
    language="en",  # Swahili uses Latin script, works with English tokenizer
    gpt_cond_latent=gpt_cond_latent,
    speaker_embedding=speaker_embedding,
)

# Save audio
import torchaudio
torchaudio.save("output.wav", torch.tensor(out["wav"]).unsqueeze(0), 24000)
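
XTTS-v2 tends to be more reliable on short inputs than on long paragraphs, so longer Swahili passages are often split into sentence-sized chunks and synthesized in a loop. A minimal sketch of such a splitter (the helper name `split_sentences` and the 200-character cap are assumptions, not part of this repository):

```python
import re

def split_sentences(text, max_chars=200):
    """Split text on sentence boundaries (. ! ?), then pack sentences
    into chunks of at most max_chars characters. A single sentence
    longer than max_chars is kept whole rather than cut mid-word."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the cap
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to `model.inference(...)` as in the snippet above, and the resulting `out["wav"]` arrays concatenated before saving.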

Deploy This Model

Production-ready deployment in minutes

Together.ai

Instant API access to this model

Fastest API

Production-ready inference API. Start free, scale to millions.

Try Free API

Replicate

One-click model deployment

Easiest Setup

Run models in the cloud with simple API. No DevOps required.

Deploy Now

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.