granite-4.0-1b-speech-ONNX

License: apache-2.0 · by onnx-community
Audio model · 1B params · 5K downloads · New · Early-stage
Edge AI: runs on mobile, laptop, or server (3GB+ RAM)
Quick Summary

A 1B-parameter speech model from the Granite family, converted to ONNX by onnx-community for in-browser and Node.js use with Transformers.js. It takes audio plus a text prompt and generates text, e.g. speech transcription.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU recommended
Minimum: 1GB+ RAM
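Beyond RAM, browser deployments also depend on which execution backend is available. A minimal, hypothetical sketch of picking a Transformers.js device at runtime (the `"webgpu"` and `"wasm"` device names follow Transformers.js conventions; `pickDevice` itself is an illustrative helper, not part of the library):

```javascript
// Prefer WebGPU when the environment exposes it; otherwise fall back
// to the WASM (CPU) backend. In Node.js, `navigator.gpu` is typically absent.
function pickDevice() {
  const hasWebGPU = typeof navigator !== "undefined" && "gpu" in navigator;
  return hasWebGPU ? "webgpu" : "wasm";
}

console.log(pickDevice());
```

The result can be passed as the `device` option to `from_pretrained` in the example below.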

Code Examples

JavaScript (Transformers.js):
import {
  AutoProcessor,
  GraniteSpeechForConditionalGeneration,
  read_audio,
  TextStreamer,
} from "@huggingface/transformers";

const model_id = "onnx-community/granite-4.0-1b-speech-ONNX";
// Load the processor and the quantized model
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await GraniteSpeechForConditionalGeneration.from_pretrained(
  model_id,
  {
    dtype: {
      embed_tokens: "q4", // "fp32", "fp16", "q8"
      audio_encoder: "q4", // "fp32", "fp16", "q8", "q4", "q4f16"
      decoder_model_merged: "q4", // "q8", "q4", "q4f16"
    },
    device: "webgpu",
  },
);

// Read the audio and resample it to 16 kHz
const audio = await read_audio("https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav", 16000);
const messages = [
  {
    role: "user",
    content: "<|audio|>can you transcribe the speech into a written format?",
  },
];
const text = processor.apply_chat_template(messages, {
  add_generation_prompt: true, // append the assistant turn so the model responds
  tokenize: false,
});

// Combine the prompt text and the audio into model inputs
const inputs = await processor(text, audio);
const generated_ids = await model.generate({
  ...inputs,
  max_new_tokens: 256,
  streamer: new TextStreamer(processor.tokenizer, {
    skip_prompt: true,
    // callback_function: (text) => { /* Do something with the streamed output */ },
  }),
});

// Decode only the newly generated tokens (slice off the echoed prompt)
const generated_texts = processor.batch_decode(
  generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]),
  { skip_special_tokens: true },
);
console.log(generated_texts[0]);
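The `slice(null, [inputs.input_ids.dims.at(-1), null])` call above works because `generate()` returns each sequence as the prompt tokens followed by the newly generated tokens; slicing from the prompt length onward keeps only the answer. A plain-array sketch of the same idea (the `stripPrompt` helper and the token values are hypothetical):

```javascript
// generate() output per sequence: [prompt tokens..., new tokens...].
// Keeping indices from promptLength onward drops the echoed prompt.
function stripPrompt(generatedIds, promptLength) {
  return generatedIds.map((seq) => seq.slice(promptLength));
}

const promptLength = 3; // e.g. inputs.input_ids.dims.at(-1)
const generated = [[11, 12, 13, 901, 902, 903]]; // one sequence
console.log(stripPrompt(generated, promptLength)); // [ [ 901, 902, 903 ] ]
```

Without this step, `batch_decode` would echo the prompt text back along with the transcription.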
