zipvoice-vietnamese-english
> English first; the Vietnamese translation is included below ⬇️
🔊 Overview

This repository contains a fine-tuned version of ZipVoice, a lightweight, efficient speech generation model developed by the K2-FSA team.
| Item | Description |
|------|-------------|
| Base Model | k2-fsa/ZipVoice |
| Fine-tuning Datasets | `vietbud500`, `vlsp2020vinai100h` for Vietnamese; `libritts` for English |
| Language | English / Vietnamese |
| Framework | PyTorch |
| License | Apache License 2.0 |
Follow the same installation and setup steps as the official ZipVoice repository.
Once the environment is ready, simply replace the pretrained checkpoint with the one provided in this repository.
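The replacement step can overwrite an existing pretrained checkpoint, so it may be worth keeping a backup. The following is a minimal sketch, not part of ZipVoice: `install_checkpoint` is a hypothetical helper name, and it assumes the checkpoint is a single file copied into place.

```bash
# Hypothetical helper (not part of ZipVoice): copy a fine-tuned checkpoint
# into place, keeping a one-time backup of any file already at the destination.
install_checkpoint() {
    local src="$1" dest="$2"

    # Refuse to continue if the source checkpoint is missing or empty.
    if [ ! -s "$src" ]; then
        echo "error: checkpoint not found or empty: $src" >&2
        return 1
    fi

    mkdir -p "$(dirname "$dest")"

    # Preserve the original checkpoint once so it can be restored later.
    if [ -e "$dest" ] && [ ! -e "$dest.bak" ]; then
        cp "$dest" "$dest.bak"
    fi

    cp "$src" "$dest"
}
```

With the paths used in this README, the call would be `install_checkpoint /path/to/your/finetunedmodel.pt checkpoints/zipvoicebase.pt`.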
```bash
# Follow all steps from the official ZipVoice repo
# ...

# Then replace the checkpoint
mkdir -p checkpoints
cp /path/to/your/finetunedmodel.pt checkpoints/zipvoicebase.pt

# Run inference
python3 ./egs/ttsinference.py \
    --config ./egs/configs/ttsinfer.yaml \
    --checkpoint ./checkpoints/zipvoicebase.pt \
    --text "Hello! This is the fine-tuned Vietnamese-English ZipVoice model." \
    --output ./audio/tts1.wav \
    --lang vi \
    --tokenizer espeak