AI21-Jamba2-Mini

by ai21labs
License: apache-2.0
Language Model
Tags: OTHER, New, Early-stage
348 downloads
Edge AI (Mobile, Laptop, Server): Unknown
Quick Summary

A language model published by ai21labs under the Apache 2.0 license. It can be served with vLLM or run locally with Transformers.

Code Examples

Quickstart (vLLM)
vllm serve "ai21labs/AI21-Jamba2-Mini" --mamba-ssm-cache-dtype float32 --enable-auto-tool-choice --tool-call-parser hermes --enable-prefix-caching --quantization experts_int8
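Once the server started by the command above is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The following is a minimal sketch using only the Python standard library. It assumes the server listens on vLLM's default address, http://localhost:8000; the helper names build_chat_payload and ask, the max_tokens value, and the sample question are illustrative and not part of the model card.

```python
import json
import urllib.request

MODEL_ID = "ai21labs/AI21-Jamba2-Mini"
# Assumption: vLLM's default host and port; change this if the server
# was started with different --host/--port values.
BASE_URL = "http://localhost:8000/v1"


def build_chat_payload(question, temperature=0.6, max_tokens=256):
    """Build the JSON body for a chat completion request (hypothetical helper)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def ask(question):
    """POST the payload to the running server and return the reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires the vLLM server to be running):
# print(ask("Summarize the parental leave policy in one sentence."))
```

Keeping the payload construction separate from the network call makes it easy to inspect or log the request before sending it.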
Run with Transformers (install dependencies)
pip install "transformers>=4.54.0"
pip install flash-attn --no-build-isolation
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
Run with Transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/AI21-Jamba2-Mini",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("ai21labs/AI21-Jamba2-Mini")

messages = [
    {
        "role": "system",
        # Adjacent string literals are concatenated into a single prompt string.
        "content": (
            "You are an HR Policy Assistant. "
            "Answer employee questions using only the provided policy documents. "
            "If the answer isn't in the documents, say so clearly. "
            "Be concise and cite the specific policy section when possible."
        ),
    },
    {
        "role": "user",
        # {retrieved_chunks} and {user_question} are placeholders to fill in before use.
        "content": (
            "Context documents: {retrieved_chunks}. "
            "Employee question: {user_question}. "
            "Answer:"
        ),
    },
]

# Render the chat template to a prompt string, then tokenize it.
prompts = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(prompts, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, do_sample=True, temperature=0.6)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
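The user message in the example contains the placeholders {retrieved_chunks} and {user_question}, which need to be replaced with real text before the chat template is applied. One way to do this is sketched below; the function name fill_user_prompt, the choice to join chunks with blank lines, and the sample data are assumptions made for illustration.

```python
import re

# Same wording as the user message in the example, written on one line.
USER_TEMPLATE = (
    "Context documents: {retrieved_chunks}. "
    "Employee question: {user_question}. "
    "Answer:"
)


def fill_user_prompt(template, retrieved_chunks, user_question):
    """Substitute both placeholders in a single pass (hypothetical helper)."""
    values = {
        # Assumption: chunks are separated by a blank line in the prompt.
        "retrieved_chunks": "\n\n".join(chunk.strip() for chunk in retrieved_chunks),
        "user_question": user_question.strip(),
    }
    # A single regex pass avoids str.format, so any other braces in the
    # template are left alone, and inserted text is never re-scanned.
    return re.sub(
        r"\{(retrieved_chunks|user_question)\}",
        lambda match: values[match.group(1)],
        template,
    )


chunks = ["Section 4.2: Employees accrue 1.5 vacation days per month."]
user_content = fill_user_prompt(
    USER_TEMPLATE, chunks, "How many vacation days do I accrue each month?"
)
print(user_content)
```

The resulting string can then be used as the "content" of the user message before calling apply_chat_template.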
