whisper-small
2.7M
478
1K
Small context
101 languages
license:apache-2.0
by
openai
Audio Model
OTHER
High
2.7M downloads
Battle-tested
Edge AI:
Mobile
Laptop
Server
Unknown
Mobile
Laptop
Server
Quick Summary
--- language: - en - zh - de - es - ru - ko - fr - ja - pt - tr - pl - ca - nl - ar - sv - it - id - hi - fi - vi - he - uk - el - ms - cs - ro - da - hu - ta -...
Code Examples
text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>text
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")python
model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="english", task="transcribe")Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Transcriptionpythontransformers
>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> # load model and processor
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
>>> model.config.forced_decoder_ids = None
>>> # load dummy dataset and read audio files
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> input_features = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt").input_features
>>> # generate token ids
>>> predicted_ids = model.generate(input_features)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
['<|startoftranscript|><|en|><|transcribe|><|notimestamps|> Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.<|endoftext|>']
>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
[' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.']Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737Evaluationpythontransformers
>>> from datasets import load_dataset
>>> from transformers import WhisperForConditionalGeneration, WhisperProcessor
>>> import torch
>>> from evaluate import load
>>> librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
>>> processor = WhisperProcessor.from_pretrained("openai/whisper-small")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to("cuda")
>>> def map_to_pred(batch):
>>> audio = batch["audio"]
>>> input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
>>> batch["reference"] = processor.tokenizer._normalize(batch['text'])
>>>
>>> with torch.no_grad():
>>> predicted_ids = model.generate(input_features.to("cuda"))[0]
>>> transcription = processor.decode(predicted_ids)
>>> batch["prediction"] = processor.tokenizer._normalize(transcription)
>>> return batch
>>> result = librispeech_test_clean.map(map_to_pred)
>>> wer = load("wer")
>>> print(100 * wer.compute(references=result["reference"], predictions=result["prediction"]))
3.432213777886737pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]pythontransformers
>>> import torch
>>> from transformers import pipeline
>>> from datasets import load_dataset
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> pipe = pipeline(
>>> "automatic-speech-recognition",
>>> model="openai/whisper-small",
>>> chunk_length_s=30,
>>> device=device,
>>> )
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> sample = ds[0]["audio"]
>>> prediction = pipe(sample.copy(), batch_size=8)["text"]
" Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
>>> # we can also return timestamps for the predictions
>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
[{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'timestamp': (0.0, 5.44)}]Deploy This Model
Production-ready deployment in minutes
Together.ai
Instant API access to this model
Production-ready inference API. Start free, scale to millions.
Try Free APIReplicate
One-click model deployment
Run models in the cloud with simple API. No DevOps required.
Deploy NowDisclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.