A 10.4M-parameter generative audio model that restores degraded vocals in any situation and runs 10.5x faster than real time on an iPhone 12 CPU. It outperforms all open-source models in subjective quality and matches commercial models on singing voice restoration.
Technical Report: [arXiv:2510.21659](https://arxiv.org/abs/2510.21659)
Extreme Degradation Bench: [smulelabs/ExtremeDegradationBench](https://huggingface.co/datasets/smulelabs/ExtremeDegradationBench)