granite-speech-4.0-1b-GGUF

Audio model · 1B params · license: apache-2.0 · by cstr · 85 downloads · Early-stage

Edge AI: Mobile · Laptop · Server · 3GB+ RAM
Quick Summary

GGUF conversion of IBM's granite-speech-4.0-1b, a 1B-parameter speech model. With CrispASR it runs speech-to-text transcription, speech translation, and word-level timestamping, with no Python required at runtime.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 1GB+ RAM
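How the RAM figures relate to quantization can be sketched with a back-of-the-envelope estimate: quantized model size ≈ parameter count × bits per weight / 8. The bits-per-weight values below are approximate averages for these quant formats, not measured numbers for this model:

```shell
#!/bin/sh
# Rough model size per quant: params x bits-per-weight / 8.
# Bits-per-weight values are approximate format averages (assumption),
# and actual RAM use adds runtime overhead on top of the file size.
PARAMS=1000000000   # ~1B parameters
for entry in "q8_0 8.5" "q5_0 5.5" "q4_k 4.5"; do
  set -- $entry
  awk -v q="$1" -v b="$2" -v p="$PARAMS" \
    'BEGIN { printf "%s: ~%.2f GB\n", q, p * b / 8 / 1e9 }'
done
```

This puts even the q8_0 file around 1 GB, which is consistent with the model fitting on 4-6GB mobile devices once runtime overhead is included.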

Code Examples

Use with CrispASR (no Python at runtime)

```bash
# Build crispasr (one-time)
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc) --target whisper-cli

# Auto-download the recommended quant on first use:
./build/bin/crispasr --backend granite -m auto -f my_audio.wav

# Or point at a local file:
./build/bin/crispasr --backend granite \
    -m granite-speech-4.0-1b-q4_k.gguf \
    -f my_audio.wav

# Speech translation to German via the runtime-tokenized prompt path:
./build/bin/crispasr --backend granite \
    -m granite-speech-4.0-1b-q4_k.gguf \
    -f my_audio.wav --translate -tl de
# → "und so meine amerikaner, fragen sie nicht, was ihr land für sie
#    tun kann, fragen sie, was sie für ihr land tun können."

# Word-level timestamps via the canary CTC aligner second pass:
./build/bin/crispasr --backend granite \
    -m granite-speech-4.0-1b-q4_k.gguf \
    -f my_audio.wav -am canary-ctc-aligner-q5_0.gguf -osrt -ml 1
```
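The commands above all take a WAV file. A quick header check before transcribing can catch obviously broken inputs; this sketch assumes a canonical RIFF/WAVE layout (sample rate as a 32-bit little-endian integer at byte offset 24) and a little-endian host — check CrispASR's docs for the formats it actually accepts:

```shell
#!/bin/sh
# Sanity-check a WAV file before handing it to crispasr:
# verify the RIFF magic and read the sample rate from the header.
check_wav() {
  f=$1
  [ "$(head -c 4 "$f")" = "RIFF" ] || { echo "$f: not a RIFF/WAV file" >&2; return 1; }
  # Canonical WAV: sample rate is a 32-bit LE integer at byte offset 24.
  # od reads host-endian u4, so this assumes a little-endian machine.
  rate=$(od -An -t u4 -j 24 -N 4 "$f" | tr -d ' \n')
  echo "$f: ${rate} Hz"
}

if [ -f my_audio.wav ]; then check_wav my_audio.wav; fi
```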
Convert and quantize it yourself

```bash
# 1. Download the base model from HF
hf download ibm-granite/granite-speech-4.0-1b --local-dir ./granite-speech-4.0-1b

# 2. Convert to F16 GGUF (the new converter writes both vocab and merges)
python models/convert-granite-speech-to-gguf.py \
    --input ./granite-speech-4.0-1b \
    --output granite-speech-4.0-1b-f16.gguf

# 3. Quantize
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q8_0.gguf q8_0
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q5_0.gguf q5_0
./build/bin/crispasr-quantize granite-speech-4.0-1b-f16.gguf granite-speech-4.0-1b-q4_k.gguf q4_k
```
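A minimal sanity check on the quantized outputs, relying only on the fact that the GGUF container begins with the ASCII magic `GGUF`:

```shell
#!/bin/sh
# Every valid GGUF file starts with the 4-byte ASCII magic "GGUF".
check_gguf() {
  [ "$(head -c 4 "$1")" = "GGUF" ] && echo "$1: ok" || echo "$1: not a GGUF file"
}

for f in ./*.gguf; do
  if [ -f "$f" ]; then check_gguf "$f"; fi
done
```

This only validates the container magic, not the tensor contents; a full check is loading the file with crispasr itself.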
