granite-speech-4.0-1b-GGUF
by cstr · License: Apache-2.0
Audio Model · 1B params · 85 downloads
Early-stage · Edge AI: Mobile, Laptop, Server · 3GB+ RAM
Quick Summary
GGUF quantizations of IBM's Granite Speech 4.0 1B speech-to-text model, packaged for the CrispASR runtime: transcription, speech translation, and word-level timestamps.
Device Compatibility
- Mobile: 4-6GB RAM
- Laptop: 16GB RAM
- Server: GPU
- Minimum recommended: 1GB+ RAM
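The RAM figures above can be sanity-checked with a back-of-envelope size estimate: file size ≈ parameter count × bits-per-weight ÷ 8. The bits-per-weight figures below are rough approximations for common GGUF quant levels, not measured from these files.

```shell
# Rough size estimate for a 1B-parameter model at common GGUF quant levels.
# Bits-per-weight values are approximate (quants carry per-block scales).
params=1000000000
for q in "f16 16.0" "q8_0 8.5" "q5_0 5.5" "q4_k 4.5"; do
  set -- $q
  # size in MiB = params * bits / 8 / 2^20
  mib=$(awk -v p="$params" -v b="$2" 'BEGIN { printf "%.0f", p*b/8/1048576 }')
  echo "$1: ~${mib} MiB"
done
```

This is why the q4_k file fits comfortably in the 1GB+ minimum while f16 needs roughly 2GB before runtime overhead.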
Code Examples
Use with CrispASR (no Python at runtime)

```bash
# Build crispasr (one-time)
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc) --target whisper-cli
# Auto-download the recommended quant on first use:
./build/bin/crispasr --backend granite -m auto -f my_audio.wav
# Or point at a local file:
./build/bin/crispasr --backend granite \
-m granite-speech-4.0-1b-q4_k.gguf \
-f my_audio.wav
# Speech translation to German via the runtime-tokenized prompt path:
./build/bin/crispasr --backend granite \
-m granite-speech-4.0-1b-q4_k.gguf \
-f my_audio.wav --translate -tl de
# → "und so meine amerikaner, fragen sie nicht, was ihr land für sie
# tun kann, fragen sie, was sie für ihr land tun können."
# Word-level timestamps via the canary CTC aligner second pass:
./build/bin/crispasr --backend granite \
-m granite-speech-4.0-1b-q4_k.gguf \
-f my_audio.wav -am canary-ctc-aligner-q5_0.gguf -osrt -ml 1
```
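The `-osrt` run above writes SubRip (SRT) captions, and `-ml 1` caps each caption at one word, so the start timestamp of each cue is effectively a word-level timestamp. A quick way to pull index, start time, and text out of such a file (the sample data below is illustrative, not actual model output):

```shell
# Extract cue index, start time, and text from an SRT file.
# Sample input is illustrative; real output comes from crispasr -osrt.
cat > sample.srt <<'EOF'
1
00:00:00,120 --> 00:00:00,480
ask

2
00:00:00,480 --> 00:00:00,900
not
EOF
# Paragraph mode (RS=""): each cue is one record, one field per line.
awk 'BEGIN{RS=""; FS="\n"} {split($2, t, " --> "); print $1": "t[1]" "$3}' sample.srt
```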
Convert and quantize yourself

```bash
# 1. Download the base model from HF
hf download ibm-granite/granite-speech-4.0-1b --local-dir ./granite-speech-4.0-1b
# 2. Convert to F16 GGUF (the new converter writes both vocab and merges)
python models/convert-granite-speech-to-gguf.py \
--input ./granite-speech-4.0-1b \
--output granite-speech-4.0-1b-f16.gguf
# 3. Quantize
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q8_0.gguf q8_0
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q5_0.gguf q5_0
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q4_k.gguf q4_k
```
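Every GGUF file begins with the 4-byte ASCII magic `GGUF`, so a cheap integrity check after conversion or download is to read those bytes. A minimal sketch (the demo file below is synthetic; substitute the quantized filenames from the steps above):

```shell
# Check the 4-byte GGUF magic at the start of a file.
check_gguf() {
  [ "$(head -c 4 "$1")" = "GGUF" ] && echo "ok: $1" || echo "not GGUF: $1"
}

# demo with a synthetic file:
printf 'GGUF' > demo.gguf
check_gguf demo.gguf
# → ok: demo.gguf
```

This catches truncated or HTML-error-page downloads before you hand the file to the runtime.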