turboderp
Cat-Llama-3-70B-instruct
Llama-3-8B-Instruct-exl2
gemma-2-9b-it-exl2
GLM 4.6 Exl3 2.33bpw Opt
This is just a quick mix of the 2.25 bpw quant, with the attention, dense layers and shared experts kept at 4.0 bpw.
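As a rough sanity check, the headline bitrate of a mix like this is just the parameter-weighted average of the two precisions. The split in the sketch below (roughly 95% of the weights in the routed experts at 2.25 bpw, 5% in the attention/dense/shared-expert tensors at 4.0 bpw) is an illustrative assumption, not a measured breakdown of GLM 4.6; it simply shows how the average lands near the stated 2.33 bpw.

```python
# Weighted-average bits per weight for a mixed-precision quant.
# The 95% / 5% parameter split is an illustrative assumption, not the
# actual tensor breakdown of GLM 4.6.

def average_bpw(splits):
    """splits: list of (fraction_of_parameters, bits_per_weight)."""
    assert abs(sum(f for f, _ in splits) - 1.0) < 1e-6
    return sum(f * b for f, b in splits)

mix = [
    (0.95, 2.25),  # routed experts left at 2.25 bpw
    (0.05, 4.00),  # attention, dense layers, shared experts at 4.0 bpw
]
print(f"{average_bpw(mix):.2f} bpw")  # ~2.34 with these assumed fractions
```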
Devstral-2-123B-Instruct-2512-exl3
Mistral-Small-3.1-24B-Instruct-2503-exl3
GLM-4.5-Air-exl3
Mistral-Large-Instruct-2411-exl3
Qwen3-Next-80B-A3B-Instruct-exl3
⚠️ Requires ExLlamaV3 v0.0.7 (or v0.0.6 `dev` branch)
Available quants: 2.00, 3.00, 4.00, 5.00, 2.08, 2.27, 2.78, 3.14, 3.53, 4.06, 4.51 bits per weight
Qwen3-Next-80B-A3B-Thinking-exl3
⚠️ Requires ExLlamaV3 v0.0.7 (or v0.0.6 `dev` branch)
Available quants: 2.00, 3.00, 4.00, 5.00, 6.00, 2.08, 2.27, 2.78, 3.14, 3.53, 4.06, 4.51 bits per weight
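For repos like the two Qwen3-Next ones above, each listed bitrate is typically published as its own branch, so a single quant can be fetched with `huggingface_hub`. This is a hedged sketch: the revision string "4.00bpw" is an assumption about the branch naming, so check the repo's branch list, and note the ExLlamaV3 version requirement above still applies when loading.

```python
# Sketch: download one quant branch of an EXL3 repo with huggingface_hub.
# Assumes each bitrate lives on its own branch/revision; the exact revision
# name ("4.00bpw") is a guess -- verify it against the repo's branch list.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="turboderp/Qwen3-Next-80B-A3B-Instruct-exl3",
    revision="4.00bpw",  # assumed branch name for the 4.00 bpw quant
    local_dir="Qwen3-Next-80B-A3B-Instruct-exl3-4.00bpw",
)
print(local_path)
```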
Qwama-0.5B-Instruct
EXAONE-4.0-32B-exl3
Available quants: 2.00, 2.50, 3.00, 3.50, 4.00, 5.00, 6.00, 8.00 (H8) bits per weight
Qwen3-8B-exl3
ERNIE-4.5-300B-A47B-Base-PT-exl3
gemma-2-27b-it-exl2
gemma-3-27b-it-exl3
Qwen3-30B-A3B-exl3
ERNIE-4.5-300B-A47B-PT-exl3
Mistral-Large-Instruct-2407-123B-exl2
SmolLM3-3B-exl3
Available quants: 2.00, 2.50, 3.00, 3.50, 4.00, 5.00, 6.00, 8.00 (H8) bits per weight
Qwen3-32B-exl3
Llama-3.1-70B-Instruct-exl2
Llama-3.1-8B-Instruct-exl2
Grok-3-reasoning-gemma3-12B-distilled-HF-exl3
EXL3 quants of Grok-3-reasoning-gemma3-12B-distilled-HF
Llama-3.1-70B-Instruct-exl3
MiniMax-M2-exl3
⚠️ Requires ExLlamaV3 v0.0.12 (or v0.0.11 `dev` branch)
Available quants: 2.00, 3.00, 4.00, 2.04, 2.27, 3.04, 3.50, 4.03 bits per weight

| Quant    | KL-div | ppl   | HumanEval@1 |
|----------|--------|-------|-------------|
| 2.00 bpw | 0.400  | 10.92 | 80.5%       |
| 2.04 bpw | 0.297  | 10.23 | 87.1%       |
| 2.27 bpw | 0.252  | 9.78  | 88.4%       |
| 3.00 bpw | 0.141  | 8.99  | 87.8%       |
| 3.04 bpw | 0.117  | 8.73  | 87.2%       |
| 3.50 bpw | 0.094  | 8.78  | 88.4%       |
| 4.00 bpw | 0.087  | 8.58  | 89.6%       |
| 4.03 bpw | 0.077  | 8.61  | 87.8%       |
| original | -      | 8.51  | 87.2%¹      |
Apertus-70B-Instruct-2509-exl3
Available quants: 2.00, 2.50, 3.00, 3.50, 4.00, 5.00, 6.00 bits per weight

| Quant   | MMLU   | 95% CI    |
|---------|--------|-----------|
| 2.0 bpw | 58.90% | +/- 1.50% |
| 2.5 bpw | 64.20% | +/- 1.46% |
| 3.0 bpw | 67.00% | +/- 1.43% |
| 3.5 bpw | 67.70% | +/- 1.43% |
| 4.0 bpw | 69.40% | +/- 1.40% |
| 5.0 bpw | 70.30% | +/- 1.39% |
| 6.0 bpw | 69.60% | +/- 1.40% |
Qwen2.5-VL-7B-Instruct-exl2
c4ai-command-r7b-12-2024-exl3
Available quants: 2.00, 2.50, 3.00, 4.00, 5.00, 6.00, 8.00 (H8) bits per weight
c4ai-command-r-08-2024-exl3
Available quants: 2.00, 2.50, 3.00, 4.00, 5.00, 6.00, 8.00 (H8) bits per weight
Qwen3-VL-235B-A22B-Thinking-exl3
Mistral-7B-Instruct-v0.3-exl3
Mistral-Nemo-Instruct-12B-exl2
GLM-4.6V-exl3
command-r-plus-103B-exl2
Mixtral-8x7B-exl2
Llama-3.1-8B-Instruct-exl3
Llama-3.2-1B-Instruct-exl3
Qwen2.5-7B-Instruct-exl3
gemma-3-27b-it-exl2
Mistral-7B-instruct-v0.3-exl2
CodeLlama-34B-instruct-exl2
Apertus-8B-Instruct-2509-exl3
Available quants: 2.00, 2.50, 3.00, 3.50, 4.00, 5.00, 6.00, 8.00 (H8) bits per weight
dbrx-instruct-exl2
Llama-3.3-Nemotron-Super-49B-v1-exl3
Llama2-7B-chat-exl2
Llama-3.2-3B-Instruct-exl2
Qwen3-0.6B-exl3
Mixtral-8x7B-instruct-exl2
turbcat-instruct-72b
command-r-v01-35B-exl2
Llama-3-70B-exl2
gemma-3-12b-it-exl2
Qwama-0.5B-Instruct-exl2
Phi-3-mini-128k-instruct-exl2
turbcat-instruct-72b-exl2
Qwen3-235B-A22B-exl3
Available quants: 2.00, 2.25, 2.50, 3.00 bits per weight
Mistral-7B-instruct-exl2
gemma-4-26B-A4B-it-exl3
gemma-4-26B-A4B-exl3
llama3-turbcat-instruct-8b-exl2
Mistral-Nemo-Base-12B-exl2
dots.llm1.inst-exl3
CodeLlama-13B-instruct-exl2
Qwen2.5-14B-Instruct-exl3
c4ai-command-r-plus-08-2024-exl3
Available quants: 2.07, 2.50, 3.00, 4.00, 5.00 bits per weight