Parakeet TDT 0.6B V2

Downloads
Hugging Face
1.9M
1.3K
License
CC BY 4.0
Updated
11/3/2025
by
nvidia

Parakeet TDT 0.6B V2 is an automatic speech recognition (ASR) model based on the Transducer architecture. It is built with the NeMo library and supports English. Its training data includes datasets such as nvidia/Granary and nvidia/nemo-asr-set-3.0. The model carries tags such as FastConformer, Conformer, and hf-asr-leaderboard. Example audio samples include Librispeech sample 1 and Librispeech sample 2.
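Because the model is built with NeMo, transcription would typically go through NeMo's ASR collection. The following is a minimal sketch, not an official recipe: the checkpoint identifier nvidia/parakeet-tdt-0.6b-v2 is an assumption inferred from the model name and publisher shown on this page, the helper name transcribe_files is hypothetical, and the input is assumed to be audio files in a format NeMo accepts (for example 16 kHz mono WAV).

```python
# Assumed Hugging Face checkpoint id (inferred from the page title and publisher).
MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"


def transcribe_files(paths, model_name=MODEL_NAME):
    """Hypothetical helper: transcribe audio files and return plain text strings."""
    # Imported lazily so this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    # Downloads the checkpoint on first use, then loads it from the local cache.
    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    results = model.transcribe(list(paths))

    # Recent NeMo versions return hypothesis objects with a .text attribute;
    # older versions return plain strings. Handle both.
    return [r.text if hasattr(r, "text") else str(r) for r in results]


if __name__ == "__main__":
    import sys

    audio_paths = sys.argv[1:]
    for path, text in zip(audio_paths, transcribe_files(audio_paths)):
        print(f"{path}: {text}")
```

Saved as a script, this could be run with one or more audio file paths as arguments. It requires the NeMo toolkit with its ASR extras installed, and the first call downloads the model weights.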

Audio Model
OTHER
0.6B params

Quick Info

Released
4/15/2025
Framework
NeMo

Resources