Qwen3-Next-80B-A3B-Instruct-int4-AutoRound

license:apache-2.0
by Intel
Language Model
OTHER
80B params
New
194 downloads
Early-stage
Edge AI: Mobile, Laptop, Server
179GB+ RAM
Quick Summary

This is an int4 quantization of Qwen/Qwen3-Next-80B-A3B-Instruct, using group size 128 and symmetric quantization, generated by intel/auto-round.
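
To give a feel for what int4 with group size 128 means for storage, here is a rough back-of-the-envelope sketch. It is not taken from the model card: the function name `estimate_int4_gb` is hypothetical, and it assumes that every parameter is quantized to 4 bits and that each group of 128 weights stores a single 16-bit scale (symmetric quantization needs no per-group zero-point in principle). Real checkpoints usually keep some layers in higher precision, so the result is best read as a lower bound on weight size, not a RAM requirement.

```bash
#!/usr/bin/env bash
# Rough weight-size estimate for symmetric int4 with a given group size.
# Assumptions (not from the model card): all parameters are quantized,
# and each group carries one 2-byte (16-bit) scale and no zero-point.
estimate_int4_gb() {
    local params=$1        # total parameter count
    local group_size=$2    # number of weights sharing one scale
    awk -v p="$params" -v g="$group_size" 'BEGIN {
        weight_bytes = p * 4 / 8      # 4 bits per weight
        scale_bytes  = (p / g) * 2    # one 2-byte scale per group
        printf "%.2f\n", (weight_bytes + scale_bytes) / 1e9
    }'
}

# 80B parameters, group size 128 (values from the summary above)
estimate_int4_gb 80000000000 128   # prints 41.25
```

The per-group scales add only about 3% on top of the packed 4-bit weights at this group size; activations, KV cache, and any unquantized layers come on top of this figure.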

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 75GB+ RAM

Code Examples

bash
curl --noproxy '*' http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "messages": [
            {"role": "user", "content": "Give me a short introduction to large language model."}
        ],
        "max_tokens": 1024
    }'

# "content":
#    "A large language model (LLM) is a type of artificial intelligence system trained on vast amounts of text data to understand, generate, and manipulate human language. These models use deep learning architectures—often based on the transformer network—to predict the next word in a sequence, enabling them to perform tasks like answering questions, writing essays, translating languages, and even coding. LLMs, such as GPT, Gemini, and Claude, learn patterns and relationships in language without explicit programming, allowing them to produce human-like responses across a wide range of topics. While powerful, they don’t “understand” language in the human sense and can sometimes generate plausible-sounding but incorrect or biased information.",
