Qwen3.5-27B-AWQ

by QuantTrio · Apache-2.0 license · 27B params · 15.2K downloads · community-tested

Edge AI: Mobile · Laptop · Server · 61GB+ RAM
Quick Summary

AWQ-quantized build of Qwen3.5-27B, a 27B-parameter model compressed for lower-memory inference.

Device Compatibility

Mobile: 4–6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 26GB+ RAM
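The minimum-recommended figure can be checked quickly on a Linux host. A small sketch that reads `/proc/meminfo` (Linux-only; the 26GB threshold is taken from the figures above):

```shell
# Compare total system RAM against the 26GB+ recommendation (Linux only).
req_kb=$((26 * 1024 * 1024))                          # 26GB expressed in kB
mem_kb=$(awk '/^MemTotal:/{print $2}' /proc/meminfo)  # total RAM in kB
if [ "$mem_kb" -ge "$req_kb" ]; then
  echo "OK: $((mem_kb / 1024 / 1024))GB total RAM"
else
  echo "Below recommendation: $((mem_kb / 1024 / 1024))GB total RAM"
fi
```

Note that this checks system RAM only; for the Server tier a GPU is still required.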

Code Examples

vLLM

```bash
pip install -U vllm --pre --index-url https://pypi.org/simple --extra-index-url https://wheels.vllm.ai/nightly

# Note: important for old containers; otherwise you may hit a runtime error like:
# subprocess.CalledProcessError:
# Command '['ninja', '-v', '-C',
# '/root/.cache/flashinfer/0.6.3/90a/cached_ops/trtllm_comm',
# '-f', '/root/.cache/flashinfer/0.6.3/90a/cached_ops/trtllm_comm/build.ninja']'
# returned non-zero exit status 1.
rm -rf ~/.cache/flashinfer

# Upgrade transformers so that applications can properly execute tool calls
pip install -U "transformers @ git+https://github.com/huggingface/transformers.git@f2ba019"

# Patch modeling_rope_utils.py line 651 to fix a simple bug
TF_FILE="$(python -m pip show transformers | awk -F': ' '/^Location:/{print $2}')/transformers/modeling_rope_utils.py" && echo "$TF_FILE"
NEW_LINE='            ignore_keys_at_rope_validation = set(ignore_keys_at_rope_validation) | {"partial_rotary_factor"}' \
perl -i.bak -pe 'if ($. == 651) { $_ = $ENV{NEW_LINE} . "\n" }' "$TF_FILE"
```
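The in-place edit above is easy to get wrong, so it can help to rehearse the same perl line-replacement trick on a throwaway file first. This sketch uses the same `NEW_LINE` / `$.` mechanics and touches nothing in site-packages:

```shell
# Rehearse the line replacement on a scratch file: replace line 2,
# keeping a .bak backup exactly as the real command does.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
NEW_LINE='TWO' perl -i.bak -pe 'if ($. == 2) { $_ = $ENV{NEW_LINE} . "\n" }' "$tmp"
cat "$tmp"       # line 2 is now TWO
cat "$tmp.bak"   # the untouched backup still has: two
```

If line 651 of your installed `modeling_rope_utils.py` is not the line you expect (the pinned commit f2ba019 is assumed), inspect it with `sed -n '651p' "$TF_FILE"` before patching, and restore from the `.bak` backup if needed.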
SGLang

```bash
uv pip install 'git+https://github.com/sgl-project/sglang.git#subdirectory=python&egg=sglang[all]'
```
vLLM (via uv, nightly wheels)

```bash
uv pip install vllm --torch-backend=auto --extra-index-url https://wheels.vllm.ai/nightly
```
Transformers

```bash
pip install "transformers[serving] @ git+https://github.com/huggingface/transformers.git@main"
```
Then serve the model:

```bash
transformers serve --force-model Qwen/Qwen3.5-27B --port 8000 --continuous-batching
```
OpenAI-compatible client

```bash
pip install -U openai

# Set the following accordingly
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="EMPTY"
```
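With those variables exported, any OpenAI-compatible client can talk to the local server. A minimal smoke test with curl (the endpoint path and payload follow the standard OpenAI chat-completions convention; the model name assumes the `transformers serve` command above, and the request simply reports failure if the server is not up):

```shell
# Hedged smoke test: POST a one-message chat request to the local server.
payload='{"model": "Qwen/Qwen3.5-27B", "messages": [{"role": "user", "content": "Say hi"}]}'
curl -s "$OPENAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "server not reachable at $OPENAI_BASE_URL"
```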

Deploy This Model

Production-ready deployment in minutes

Together.ai (Fastest API)

Instant API access to this model. Production-ready inference API; start free, scale to millions.

Replicate (Easiest Setup)

One-click model deployment. Run models in the cloud with a simple API; no DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.