Qwen3.5-27B-AWQ

by QuantTrio · Apache-2.0 license · 27B params · 15.2K downloads · community-tested

Edge AI: Mobile · Laptop · Server · 61GB+ RAM
Quick Summary

AWQ-quantized build of Qwen3.5-27B, a 27B-parameter model compressed for lower-memory inference.

Device Compatibility

Mobile: 4–6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 26GB+ RAM
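The minimum-recommended figure can be checked quickly on a Linux host. A small sketch that reads `/proc/meminfo` (Linux-only; the 26GB threshold is taken from the figures above):

```shell
# Compare total system RAM against the 26GB+ recommendation (Linux only).
req_kb=$((26 * 1024 * 1024))                          # 26GB expressed in kB
mem_kb=$(awk '/^MemTotal:/{print $2}' /proc/meminfo)  # total RAM in kB
if [ "$mem_kb" -ge "$req_kb" ]; then
  echo "OK: $((mem_kb / 1024 / 1024))GB total RAM"
else
  echo "Below recommendation: $((mem_kb / 1024 / 1024))GB total RAM"
fi
```

Note that this checks system RAM only; for the Server tier a GPU is still required.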

Code Examples

vLLM

```bash
pip install -U vllm --pre --index-url https://pypi.org/simple --extra-index-url https://wheels.vllm.ai/nightly

# Note: important for old containers; otherwise you may hit a runtime error like:
# subprocess.CalledProcessError:
# Command '['ninja', '-v', '-C',
# '/root/.cache/flashinfer/0.6.3/90a/cached_ops/trtllm_comm',
# '-f', '/root/.cache/flashinfer/0.6.3/90a/cached_ops/trtllm_comm/build.ninja']'
# returned non-zero exit status 1.
rm -rf ~/.cache/flashinfer

# Upgrade transformers so that applications can properly execute tool calls
pip install -U "transformers @ git+https://github.com/huggingface/transformers.git@f2ba019"

# Patch modeling_rope_utils.py line 651 to fix a simple bug
TF_FILE="$(python -m pip show transformers | awk -F': ' '/^Location:/{print $2}')/transformers/modeling_rope_utils.py" && echo "$TF_FILE"
NEW_LINE='            ignore_keys_at_rope_validation = set(ignore_keys_at_rope_validation) | {"partial_rotary_factor"}' \
perl -i.bak -pe 'if ($. == 651) { $_ = $ENV{NEW_LINE} . "\n" }' "$TF_FILE"
```
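The in-place edit above is easy to get wrong, so it can help to rehearse the same perl line-replacement trick on a throwaway file first. This sketch uses the same `NEW_LINE` / `$.` mechanics and touches nothing in site-packages:

```shell
# Rehearse the line replacement on a scratch file: replace line 2,
# keeping a .bak backup exactly as the real command does.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
NEW_LINE='TWO' perl -i.bak -pe 'if ($. == 2) { $_ = $ENV{NEW_LINE} . "\n" }' "$tmp"
cat "$tmp"       # line 2 is now TWO
cat "$tmp.bak"   # the untouched backup still has: two
```

If line 651 of your installed `modeling_rope_utils.py` is not the line you expect (the pinned commit f2ba019 is assumed), inspect it with `sed -n '651p' "$TF_FILE"` before patching, and restore from the `.bak` backup if needed.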
SGLang

```bash
uv pip install 'git+https://github.com/sgl-project/sglang.git#subdirectory=python&egg=sglang[all]'
```
vLLM (via uv, nightly wheels)

```bash
uv pip install vllm --torch-backend=auto --extra-index-url https://wheels.vllm.ai/nightly
```
Transformers

```bash
pip install "transformers[serving] @ git+https://github.com/huggingface/transformers.git@main"
```
Then serve the model:

```bash
transformers serve --force-model Qwen/Qwen3.5-27B --port 8000 --continuous-batching
```
OpenAI-compatible client

```bash
pip install -U openai

# Set the following accordingly
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="EMPTY"
```
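With those variables exported, any OpenAI-compatible client can talk to the local server. A minimal smoke test with curl (the endpoint path and payload follow the standard OpenAI chat-completions convention; the model name assumes the `transformers serve` command above, and the request simply reports failure if the server is not up):

```shell
# Hedged smoke test: POST a one-message chat request to the local server.
payload='{"model": "Qwen/Qwen3.5-27B", "messages": [{"role": "user", "content": "Say hi"}]}'
curl -s "$OPENAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "server not reachable at $OPENAI_BASE_URL"
```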

Deploy This Model

Production-ready deployment in minutes

Together.ai (Fastest API)

Instant API access to this model. Production-ready inference API; start free, scale to millions.

Replicate (Easiest Setup)

One-click model deployment. Run models in the cloud with a simple API; no DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.