Qwen3.5-122B-A10B-REAP-20-GGUF

Name: Qwen3.5-122B-A10B-REAP-20-GGUF
Author: 0xSero

Tags: llama-cpp, Other, New, Early-stage
Parameters: 122B
Downloads: 5K

Edge AI: Mobile, Laptop, Server
273GB+ RAM

Quick Summary

A 122B-parameter model published by 0xSero in GGUF format for llama.cpp, tagged for general and multilingual use.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 114GB+ RAM
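
The RAM figures listed here can be sanity-checked with simple arithmetic: quantized weights take roughly parameters × bits per weight ÷ 8 bytes. The sketch below is a rough estimate under stated assumptions, not a measurement. The function name `estimate_weights_gb` is hypothetical, the bits-per-weight value is an input you supply, and KV cache and runtime overhead are ignored.

```python
def estimate_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough size of quantized weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 1e9


# Example: the 122B parameter count listed on this page at a nominal
# 4 bits per weight (an assumed value). Real quantization formats also
# store scale data, so actual files are larger than this lower bound.
print(f"{estimate_weights_gb(122e9, 4.0):.1f} GB")  # prints "61.0 GB"
```

If the published files have a different parameter count than the one listed here, pass that count instead. The estimate only scales the inputs it is given.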

Training Data Analysis

🔵 Good (6.0/10)

Quality assessment of the researched training datasets used by Qwen3.5-122B-A10B-REAP-20-GGUF.

Specialized For: general, multilingual

Training Datasets (1)

🔵 6/10: general, multilingual

Key Strengths

• Scale and Accessibility: 750GB of publicly available, filtered text
• Systematic Filtering: Documented heuristics enable reproducibility
• Language Diversity: Although the data is English-only, it captures diverse writing styles

Considerations

• English-Only: Limits multilingual applications
• Filtering Limitations: Offensive content and low-quality text remain despite filtering


Code Examples

How to Run (bash)

# Q4_K_M — fits in 64 GB, fastest
llama-server \
  -m Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf \
  -ngl 999 --flash-attn on -c 4096 \
  --port 8080 --host 0.0.0.0

# With speculative decoding for faster generation
llama-server \
  -m Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf \
  -ngl 999 --flash-attn on -c 4096 \
  --spec-type ngram-mod --spec-ngram-size-n 24 \
  --draft-min 48 --draft-max 64 \
  --port 8080 --host 0.0.0.0
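
Once `llama-server` is listening on port 8080 as in the commands above, it can be queried over HTTP through its OpenAI-compatible chat endpoint. The following is a minimal sketch using only the Python standard library. The helper names `build_chat_request` and `chat` are hypothetical, and the base URL assumes the server runs on the same machine.

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:8080",
                       max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for llama-server's OpenAI-compatible chat endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, **kwargs) -> str:
    """Send the prompt to a running llama-server and return the reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello!"))` would print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.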

Ollama (bash)

# Create a Modelfile
echo 'FROM ./Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf' > Modelfile
ollama create reap20 -f Modelfile
ollama run reap20
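
After `ollama create reap20` has registered the model, a program can also reach it through Ollama's local REST API instead of the interactive prompt. This is a small sketch assuming the Ollama daemon is running at its default address, `http://localhost:11434`. The function names are hypothetical.

```python
import json
import urllib.request

# Default address of a local Ollama daemon (assumed; adjust if configured otherwise).
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_ollama_payload(prompt: str, model: str = "reap20") -> dict:
    """Build a non-streaming chat payload for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a chunk stream
    }


def ollama_chat(prompt: str, model: str = "reap20") -> str:
    """Send the prompt to the local Ollama daemon and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_ollama_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["message"]["content"]
```

Calling `ollama_chat("Hello!")` would return the assistant's reply as a string once the daemon and the `reap20` model are available.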

llama.cpp (Python)

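from llama_cpp import Llama

# Load the Q4_K_M GGUF file from the current directory.
llm = Llama(
    model_path="Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=4096,       # context window in tokens
    flash_attn=True,  # enable flash attention
)

# Send a single-turn chat request and print the reply text.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])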
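
For long replies it can be more pleasant to print tokens as they arrive rather than wait for the full completion. The sketch below assumes the `llm` object from the example above and uses the `stream=True` option of `create_chat_completion`, which yields chunks carrying a `delta` instead of a full `message`. The helper name `iter_stream_text` is hypothetical.

```python
from typing import Iterable, Iterator


def iter_stream_text(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield the text fragments from a stream of chat-completion chunks."""
    for chunk in chunks:
        # Each chunk holds a partial "delta"; some deltas carry only a role
        # or are empty (for example the final chunk), so skip those.
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            yield text


# Usage with the `llm` object from the example above (not executed here):
#   stream = llm.create_chat_completion(
#       messages=[{"role": "user", "content": "Hello!"}],
#       max_tokens=512,
#       stream=True,
#   )
#   for piece in iter_stream_text(stream):
#       print(piece, end="", flush=True)
```

Keeping the chunk handling in a plain generator means it can be exercised with hand-written dictionaries, without loading the model.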
