AI21-Jamba2-Mini
Language Model by ai21labs
License: apache-2.0
348 downloads
Early-stage
Edge AI: Mobile, Laptop, Server
Quick Summary
Jamba2 Mini is a compact language model from AI21 Labs with a hybrid Mamba-Transformer architecture, released under the Apache-2.0 license.
Code Examples
Quickstart (vLLM)

vllm serve "ai21labs/AI21-Jamba2-Mini" \
  --mamba-ssm-cache-dtype float32 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --enable-prefix-caching \
  --quantization experts_int8

Run with Transformers
pip install "transformers>=4.54.0"
pip install flash-attn --no-build-isolation
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/AI21-Jamba2-Mini",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ai21labs/AI21-Jamba2-Mini")
messages = [
    {
        "role": "system",
        "content": (
            "You are an HR Policy Assistant. "
            "Answer employee questions using only the provided policy documents. "
            "If the answer isn't in the documents, say so clearly. "
            "Be concise and cite the specific policy section when possible."
        ),
    },
    {
        "role": "user",
        "content": (
            "Context documents: {retrieved_chunks}. "
            "Employee question: {user_question}. "
            "Answer:"
        ),
    },
]
prompts = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(prompts, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, do_sample=True, temperature=0.6)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
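In the example above, {retrieved_chunks} and {user_question} are placeholders to fill at request time, before applying the chat template. A minimal sketch; the policy text, section numbers, and question below are hypothetical, invented purely for illustration:

```python
# Hypothetical retrieved policy snippets and employee question (illustration only).
retrieved_chunks = "\n".join([
    "Section 4.2: Employees accrue 1.5 vacation days per month.",
    "Section 4.3: Up to 10 unused vacation days may carry over per year.",
])
user_question = "How many vacation days can I carry over?"

# Fill the user turn's template; the result replaces the user message's
# "content" value before calling tokenizer.apply_chat_template.
user_content = (
    "Context documents: {retrieved_chunks}. "
    "Employee question: {user_question}. "
    "Answer:"
).format(retrieved_chunks=retrieved_chunks, user_question=user_question)

print(user_content)
```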
Deploy This Model

Production-ready deployment in minutes.

Together.ai: instant API access to this model. Production-ready inference API; start free, scale to millions.

Replicate: one-click model deployment. Run models in the cloud with a simple API, no DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.
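The vLLM quickstart above serves an OpenAI-compatible HTTP API; the host, port (8000), and route below are vLLM defaults, not stated on this page. A minimal stdlib client sketch:

```python
import json
import urllib.request

def ask_jamba(prompt, base_url="http://localhost:8000/v1"):
    """Query a locally served Jamba model via vLLM's OpenAI-compatible API."""
    payload = {
        "model": "ai21labs/AI21-Jamba2-Mini",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Parse the standard chat-completions response shape.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Usage (with the server from the quickstart running): `ask_jamba("Summarize our vacation policy in one sentence.")`.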