Qwen3.5-122B-A10B-REAP-20-GGUF

by 0xSero
llama-cpp · Other · 122B params · 5K downloads
Edge AI: 273GB+ RAM
Quick Summary

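GGUF build of Qwen3.5-122B-A10B-REAP-20, published by 0xSero for llama.cpp-based runtimes such as llama-server, Ollama, and llama-cpp-python.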

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 114GB+ RAM

Training Data Analysis

🔵 Good (6.0/10)

Researched training datasets used by Qwen3.5-122B-A10B-REAP-20-GGUF with quality assessment

Specialized For

general
multilingual

Training Datasets (1)

c4 (🔵 6/10): general, multilingual
Key Strengths
  • Scale and Accessibility: 750GB of publicly available, filtered text
  • Systematic Filtering: Documented heuristics enable reproducibility
  • Stylistic Diversity: Although English-only, the corpus captures a wide range of writing styles
Considerations
  • English-Only: Limits multilingual applications
  • Filtering Limitations: Offensive content and low-quality text remain despite filtering


Code Examples

How to Run (bash)
# Q4_K_M — fits in 64 GB, fastest
llama-server \
  -m Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf \
  -ngl 999 --flash-attn on -c 4096 \
  --port 8080 --host 0.0.0.0

# With speculative decoding for faster generation
llama-server \
  -m Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf \
  -ngl 999 --flash-attn on -c 4096 \
  --spec-type ngram-mod --spec-ngram-size-n 24 \
  --draft-min 48 --draft-max 64 \
  --port 8080 --host 0.0.0.0
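
Once llama-server is running, it can be queried over HTTP. The following is a minimal sketch, assuming the server started by the commands above is reachable on localhost port 8080 and serves llama.cpp's OpenAI-compatible /v1/chat/completions route. The helper names build_chat_payload and chat are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: llama-server was started with --port 8080 as shown above.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_payload(prompt, max_tokens=512):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt, url=SERVER_URL, timeout=600):
    """Send one user message to the server and return the reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data["choices"][0]["message"]["content"]
```

With the server up, chat("Hello!") should return the assistant's reply as a string. The generous timeout is a guess to allow for slow first-token latency on a model of this size.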
Ollama (bash)
# Create a Modelfile
echo 'FROM ./Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf' > Modelfile
ollama create reap20 -f Modelfile
ollama run reap20
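
The model created above can also be called programmatically through Ollama's local REST API. This is a rough sketch, assuming Ollama is running on its default port (11434) and the model was registered under the name reap20 as in the commands above. The helper names build_ollama_request and ask_ollama are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default local port and the model
# was created as "reap20" with `ollama create reap20 -f Modelfile`.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_ollama_request(prompt, model="reap20"):
    """Build the JSON body for a single-turn, non-streaming chat call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def ask_ollama(prompt, url=OLLAMA_URL, timeout=600):
    """Send one user message to Ollama and return the reply text."""
    body = json.dumps(build_ollama_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data["message"]["content"]
```

Setting "stream" to False keeps the client simple: the whole reply arrives in one response instead of a sequence of partial JSON objects.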
Python (llama-cpp-python)
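from llama_cpp import Llama

# Load the GGUF file; n_gpu_layers=-1 offloads every layer to the GPU.
llm = Llama(
    model_path="Qwen3.5-122B-A10B-REAP-20-Q4_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,  # context window in tokens
    flash_attn=True,
)

# Run a single-turn chat completion and print the reply text.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])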
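
For long replies it may be preferable to print tokens as they arrive. The sketch below assumes create_chat_completion is called with stream=True, in which case it yields OpenAI-style chunks whose text sits under choices[0]["delta"]["content"]. The helper collect_stream is illustrative and not part of llama-cpp-python.

```python
def collect_stream(chunks, on_token=None):
    """Concatenate the text deltas from a streamed chat completion.

    `chunks` is any iterable of OpenAI-style streaming chunks.
    `on_token`, if given, is called with each text piece as it arrives.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:  # role-only and final chunks carry no content
            parts.append(text)
            if on_token is not None:
                on_token(text)
    return "".join(parts)


# Usage with the `llm` object from the example above (not run here):
# stream = llm.create_chat_completion(
#     messages=[{"role": "user", "content": "Hello!"}],
#     max_tokens=512,
#     stream=True,
# )
# reply = collect_stream(stream, on_token=lambda t: print(t, end="", flush=True))
```

Keeping the accumulation logic separate from the model call makes it easy to check against hand-written chunks without loading the model.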
