sarvam-105b-FP8-Dynamic
by inference-optimization
Language Model · OTHER · 105B params · New · 14 downloads · Early-stage
Edge AI: 235GB+ RAM
Quick Summary
sarvam-105b-FP8-Dynamic is a 105B-parameter language model quantized to FP8 with dynamic (runtime) activation scaling, roughly halving the memory footprint of a BF16 checkpoint for faster, cheaper inference.
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 98GB+ RAM
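The 98GB+ minimum is consistent with FP8 storing one byte per weight: 105 billion parameters come to about 98 GiB of weights alone (runtime overhead such as the KV cache comes on top). A quick back-of-envelope check:

```python
params = 105_000_000_000   # 105B parameters
fp8_bytes = params * 1     # FP8: 1 byte per weight
bf16_bytes = params * 2    # BF16: 2 bytes per weight
gib = 2**30

# FP8 weights land just under 98 GiB, matching the stated minimum;
# the unquantized BF16 model would need roughly twice that.
print(f"FP8 weights:  {fp8_bytes / gib:.1f} GiB")
print(f"BF16 weights: {bf16_bytes / gib:.1f} GiB")
```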
Code Examples
Deployment (vLLM)
Install a nightly vLLM build without pulling in its pinned dependencies:

```shell
uv pip install -U git+https://github.com/vllm-project/vllm.git \
    --extra-index-url https://wheels.vllm.ai/nightly \
    --no-deps \
    --no-cache
```

Install llm-compressor (the toolkit used to produce FP8-Dynamic checkpoints) and an up-to-date torchvision:

```shell
uv pip install git+https://github.com/vllm-project/llm-compressor.git
uv pip install --upgrade torchvision --break-system-packages --no-cache
```
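"FP8-Dynamic" means activation scales are computed per tensor at runtime rather than from an offline calibration pass. As a rough illustration (not the model's actual kernels), here is a NumPy sketch of symmetric dynamic quantization to the FP8 E4M3 range, using an integer grid as a simplified stand-in for true FP8 rounding:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fp8_dynamic_quant(x: np.ndarray):
    """Symmetric per-tensor dynamic quantization to the FP8 E4M3 range.

    The scale is derived from the tensor itself at runtime ("dynamic"),
    so no calibration data is needed for activations.
    """
    scale = np.max(np.abs(x)) / FP8_E4M3_MAX
    q = np.clip(np.round(x / scale), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequant(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original tensor.
    return q * scale

x = np.random.randn(4, 8).astype(np.float32)
q, s = fp8_dynamic_quant(x)
x_hat = dequant(q, s)
```

Real FP8 values are non-uniformly spaced floats, so hardware kernels differ in detail, but the scale-then-round structure is the same.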