mradermacher
Crow-9B-Opus-4.6-Distill-Heretic_Qwen3.5-GGUF
DeepSeek-V2-Lite-GGUF
Lumimaid-v0.2-8B-Heretic-i1-GGUF
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-heretic-i1-GGUF
MN-12B-Mag-Mell-R1-GGUF
static quants of https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1

weighted/imatrix quants are available at https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 4.9 |  |
| GGUF | IQ3_XS | 5.4 |  |
| GGUF | Q3_K_S | 5.6 |  |
| GGUF | IQ3_S | 5.7 | beats Q3_K |
| GGUF | IQ3_M | 5.8 |  |
| GGUF | Q3_K_M | 6.2 | lower quality |
| GGUF | Q3_K_L | 6.7 |  |
| GGUF | IQ4_XS | 6.9 |  |
| GGUF | Q4_K_S | 7.2 | fast, recommended |
| GGUF | Q4_K_M | 7.6 | fast, recommended |
| GGUF | Q5_K_S | 8.6 |  |
| GGUF | Q5_K_M | 8.8 |  |
| GGUF | Q6_K | 10.2 | very good quality |
| GGUF | Q8_0 | 13.1 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
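For readers who want a concrete starting point, here is a minimal sketch (not from the model card) of pulling one quant from a repo like this and loading it with llama-cpp-python. The exact quant filename is an assumption; check the repo's file list for the file matching the quant type you want (e.g. the Q4_K_M entry recommended above).

```python
# Hedged sketch: download a single quant and run it locally with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="mradermacher/MN-12B-Mag-Mell-R1-GGUF",     # this repo
    filename="MN-12B-Mag-Mell-R1.Q4_K_M.gguf",          # assumed filename -- verify in the repo
)

llm = Llama(model_path=model_path, n_ctx=4096)          # plain llama.cpp binding
out = llm("Write a two-sentence greeting.", max_tokens=64)
print(out["choices"][0]["text"])
```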
GPT-OSS-Swallow-120B-RL-v0.1-i1-GGUF
OpenAI-gpt-oss-20B-INSTRUCT-Heretic-Uncensored-MXFP4-i1-GGUF
MARTIN-9B-i1-GGUF
GLM-4.7-Flash-ultra-heretic-i1-GGUF
Dirty-Muse-Writer-v01-Uncensored-Erotica-NSFW-i1-GGUF
Apertus-70B-Instruct-2509-heretic-v2-i1-GGUF
DeepSeek-R1-Distill-Qwen-7B-Uncensored-i1-GGUF
Llama-3.1-8B-Instruct-heretic-i1-GGUF
weighted/imatrix quants of https://huggingface.co/p-e-w/Llama-3.1-8B-Instruct-heretic

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Llama-3.1-8B-Instruct-heretic-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 2.1 | for the desperate |
| GGUF | i1-IQ1_M | 2.3 | mostly desperate |
| GGUF | i1-IQ2_XXS | 2.5 |  |
| GGUF | i1-IQ2_XS | 2.7 |  |
| GGUF | i1-IQ2_S | 2.9 |  |
| GGUF | i1-IQ2_M | 3.0 |  |
| GGUF | i1-Q2_K_S | 3.1 | very low quality |
| GGUF | i1-Q2_K | 3.3 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 3.4 | lower quality |
| GGUF | i1-IQ3_XS | 3.6 |  |
| GGUF | i1-Q3_K_S | 3.8 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 3.8 | beats Q3_K |
| GGUF | i1-IQ3_M | 3.9 |  |
| GGUF | i1-Q3_K_M | 4.1 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 4.4 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 4.5 |  |
| GGUF | i1-Q4_0 | 4.8 | fast, low quality |
| GGUF | i1-IQ4_NL | 4.8 | prefer IQ4_XS |
| GGUF | i1-Q4_K_S | 4.8 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 5.0 | fast, recommended |
| GGUF | i1-Q4_1 | 5.2 |  |
| GGUF | i1-Q5_K_S | 5.7 |  |
| GGUF | i1-Q5_K_M | 5.8 |  |
| GGUF | i1-Q6_K | 6.7 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
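The "imatrix" row in the table above is the importance-matrix file used to produce these i1 quants; it can also be fed to llama.cpp's quantize tool to build your own quant of the source model. A hedged sketch follows: the binary name ("llama-quantize" in recent llama.cpp builds, "quantize" in older ones) and the local file names are assumptions, not taken from the model card.

```python
# Hedged sketch: create a custom quant using the downloaded imatrix file and a
# full-precision GGUF conversion of the source model.
import subprocess

subprocess.run(
    [
        "llama-quantize",                                      # assumed binary name
        "--imatrix", "Llama-3.1-8B-Instruct-heretic.imatrix",  # assumed local imatrix filename
        "Llama-3.1-8B-Instruct-heretic.f16.gguf",              # assumed source GGUF
        "Llama-3.1-8B-Instruct-heretic.Q4_K_S.gguf",           # output file
        "Q4_K_S",                                              # target quant type
    ],
    check=True,
)
```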
Assistant_Pepe_70B-i1-GGUF
Qwen3.5-122B-Turkish-Reasoning-6shard-i1-GGUF
Llama3_3-Nemo-Super-Writer-49B-i1-GGUF
L3.3-MS-Nevoria-70b-heretic-i1-GGUF
Qwen3.5-27B-Writer-i1-GGUF
Pokemon-Red-Qwen3-80B-i1-GGUF
MN-Violet-Lotus-12B-GGUF
OpenAI-gpt-oss-20B-GPT5.1-5.2-DISTILL-Heretic-Uncensored-MXFP4-i1-GGUF
Apertus-70B-Instruct-2509-heretic-v3-i1-GGUF
Deepseeker-Kunou-Qwen2.5-14b-i1-GGUF
weighted/imatrix quants of https://huggingface.co/Statuo/Deepseeker-Kunou-Qwen2.5-14b

static quants are available at https://huggingface.co/mradermacher/Deepseeker-Kunou-Qwen2.5-14b-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | i1-IQ1_S | 3.7 | for the desperate |
| GGUF | i1-IQ1_M | 4.0 | mostly desperate |
| GGUF | i1-IQ2_XXS | 4.4 |  |
| GGUF | i1-IQ2_XS | 4.8 |  |
| GGUF | i1-IQ2_S | 5.1 |  |
| GGUF | i1-IQ2_M | 5.5 |  |
| GGUF | i1-Q2_K_S | 5.5 | very low quality |
| GGUF | i1-Q2_K | 5.9 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 6.0 | lower quality |
| GGUF | i1-IQ3_XS | 6.5 |  |
| GGUF | i1-Q3_K_S | 6.8 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 6.8 | beats Q3_K |
| GGUF | i1-IQ3_M | 7.0 |  |
| GGUF | i1-Q3_K_M | 7.4 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 8.0 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 8.2 |  |
| GGUF | i1-Q4_0 | 8.6 | fast, low quality |
| GGUF | i1-IQ4_NL | 8.6 | prefer IQ4_XS |
| GGUF | i1-Q4_K_S | 8.7 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 9.1 | fast, recommended |
| GGUF | i1-Q4_1 | 9.5 |  |
| GGUF | i1-Q5_K_S | 10.4 |  |
| GGUF | i1-Q5_K_M | 10.6 |  |
| GGUF | i1-Q6_K | 12.2 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
ClinAligh-30B-A3B-i1-GGUF
Trickster-Theta-4-70B-i1-GGUF
Magnum-Opus-35B-A3B-i1-GGUF
Crow-9B-Opus-4.6-Distill-Heretic_Qwen3.5-i1-GGUF
gemma-4-31b-it-heretic-ara-i1-GGUF
aum-1-70B-i1-GGUF
gpt-oss-120b-tainted-heresy-i1-GGUF
michaelwaves-Amoral-GPT-OSS-112E-i1-GGUF
deepsex-34b-GGUF
Huihui-Qwen3-Coder-Next-abliterated-i1-GGUF
Trinity-Large-TrueBase-i1-GGUF
Dawn-Max-i1-GGUF
MedQWEN-2.5-32B-i1-GGUF
Qwen3.5-27B-ultra-uncensored-heretic-v1-i1-GGUF
PE-Type-3-Nova-4B-i1-GGUF
ELM-gpt-oss-20b-NSFW-v0.1-i1-GGUF
Qwen3.5-27B-DS9-i1-GGUF
MiroThinker-1.7-i1-GGUF
lynx-instruct-30b-qwen3-i1-GGUF
OpenAI-gpt-oss-20B-INSTRUCT-Heretic-Uncensored-i1-GGUF
PsychAgent-Qwen3-32B-i1-GGUF
Maenad-70B-i1-GGUF
GLM-4.7-REAP-218B-A32B-i1-GGUF
MiniMax-M2.1-REAP-30-i1-GGUF
gemma-4-19b-a4b-it-REAP-i1-GGUF
Qwen3.5-122B-A10B-abliterated-v1-i1-GGUF
MiniMax-M2-REAP-139B-A10B-i1-GGUF
Void-Citrus-L3.3-70B-i1-GGUF
Monika-122B-i1-GGUF
MiniMax-M2.1-REAP-172B-A10B-i1-GGUF
Rio-3.0-Open-Search-i1-GGUF
Samantha-big-MoE-i1-GGUF
SafeWork-R1-DeepSeek-70B-i1-GGUF
zen4-coder-i1-GGUF
Qwen3-Next-448E-Abliterated-Instruct-i1-GGUF
AReaL-tau2-merge-sft-235B-i1-GGUF
Qwen3-Coder-30B-A3B-Instruct-Heretic-i1-GGUF
Step-3.5-Flash-i1-GGUF
schonsense_70B_thinkthonk-i1-GGUF
Chronos-Gold-12B-1.0-i1-GGUF
Qwen2.5-32B-Instruct-heretic-i1-GGUF
Qwen3.5-27B-ultra-uncensored-heretic-v2-i1-GGUF
70B_Imperious-i1-GGUF
magnum-v4-12b-GGUF
Llama-3.3-70B-Instruct-abliterated-v2-i1-GGUF
BlenderCartel-llama33-70B-Pt2-i1-GGUF
MiniMax-M2-THRIFT-55-i1-GGUF
GLM-4.6V-i1-GGUF
Apertus-70B-Instruct-2509-heretic-v1-i1-GGUF
PE-Type-1-Vera-4B-i1-GGUF
Qwen3.5-35B-A3B-heretic-v2-GGUF
zen4-thinking-i1-GGUF
Nanonets-OCR2-3B-GGUF
static quants of https://huggingface.co/nanonets/Nanonets-OCR2-3B

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Nanonets-OCR2-3B-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | mmproj-Q8_0 | 0.9 | multi-modal supplement |
| GGUF | Q2_K | 1.4 |  |
| GGUF | mmproj-f16 | 1.4 | multi-modal supplement |
| GGUF | Q3_K_S | 1.6 |  |
| GGUF | Q3_K_M | 1.7 | lower quality |
| GGUF | Q3_K_L | 1.8 |  |
| GGUF | IQ4_XS | 1.9 |  |
| GGUF | Q4_K_S | 1.9 | fast, recommended |
| GGUF | Q4_K_M | 2.0 | fast, recommended |
| GGUF | Q5_K_S | 2.3 |  |
| GGUF | Q5_K_M | 2.3 |  |
| GGUF | Q6_K | 2.6 | very good quality |
| GGUF | Q8_0 | 3.4 | fast, best quality |
| GGUF | f16 | 6.3 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
qwen-3.5-122B-uncensored-stxt-i1-GGUF
MiniMax-M2.5-CARVE-v1-BF16-i1-GGUF
Ina-v11.1-i1-GGUF
locai-l1-large-2011-i1-GGUF
Cogidonia-24B-i1-GGUF
Neuron-14B-i1-GGUF
Golem-70B-v1a-i1-GGUF
Gemini-3-Pro-Qwen3.5-35B-A3B-i1-GGUF
zen3-nano-i1-GGUF
MiniMax-M2.5-REAP-139B-A10B-i1-GGUF
gpt-oss-20b-uncensored-bf16-GGUF
static quants of https://huggingface.co/huizimao/gpt-oss-20b-uncensored-bf16

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/gpt-oss-20b-uncensored-bf16-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q3_K_S | 12.2 |  |
| GGUF | Q2_K | 12.2 |  |
| GGUF | IQ4_XS | 12.3 |  |
| GGUF | Q3_K_M | 13.0 | lower quality |
| GGUF | Q3_K_L | 13.4 |  |
| GGUF | Q4_K_S | 14.8 | fast, recommended |
| GGUF | Q4_K_M | 15.9 | fast, recommended |
| GGUF | Q5_K_S | 16.0 |  |
| GGUF | Q5_K_M | 17.0 |  |
| GGUF | Q6_K | 22.3 | very good quality |
| GGUF | Q8_0 | 22.4 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-i1-GGUF
Kimi-Linear-48B-A3B-Instruct-i1-GGUF
Qwen2.5-VL-7B-Instruct-abliterated-GGUF
static quants of https://huggingface.co/huihui-ai/Qwen2.5-VL-7B-Instruct-abliterated

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Qwen2.5-VL-7B-Instruct-abliterated-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | mmproj-Q8_0 | 1.0 | multi-modal supplement |
| GGUF | mmproj-f16 | 1.5 | multi-modal supplement |
| GGUF | Q2_K | 3.1 |  |
| GGUF | Q3_K_S | 3.6 |  |
| GGUF | Q3_K_M | 3.9 | lower quality |
| GGUF | Q3_K_L | 4.2 |  |
| GGUF | IQ4_XS | 4.4 |  |
| GGUF | Q4_K_S | 4.6 | fast, recommended |
| GGUF | Q4_K_M | 4.8 | fast, recommended |
| GGUF | Q5_K_S | 5.4 |  |
| GGUF | Q5_K_M | 5.5 |  |
| GGUF | Q6_K | 6.4 | very good quality |
| GGUF | Q8_0 | 8.2 | fast, best quality |
| GGUF | f16 | 15.3 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
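The "mmproj-*" rows above are the vision projector that accompanies the main model GGUF for multi-modal use. A hedged sketch of pairing the two with llama.cpp's multimodal CLI follows; the binary name (llama-mtmd-cli in recent builds, llava-cli in older ones) and the local file names are assumptions rather than instructions from the model card.

```python
# Hedged sketch: run a vision quant together with its mmproj supplement.
import subprocess

subprocess.run(
    [
        "llama-mtmd-cli",                                              # assumed binary name
        "-m", "Qwen2.5-VL-7B-Instruct-abliterated.Q4_K_M.gguf",        # assumed main quant file
        "--mmproj", "Qwen2.5-VL-7B-Instruct-abliterated.mmproj-f16.gguf",  # assumed mmproj file
        "--image", "page.png",
        "-p", "Describe this image.",
    ],
    check=True,
)
```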
Gradients-Covenant-V1-i1-GGUF
magnum-v4-22b-i1-GGUF
trohrbaugh-Qwen3.5-122B-A10B-heretic-i1-GGUF
Qwen3-235B-A22B-abliterated-i1-GGUF
jina-reranker-v1-tiny-en-GGUF
Rukun-Qwen-32B-i1-GGUF
GLM-4.7-Flash-ultimate-irrefusable-heretic-i1-GGUF
PE-Type-2-Alma-4B-i1-GGUF
Qwen-3.5-10.5B-Frankenmerge-Opus-4.6-Distill-i1-GGUF
MiniMax-M2.5-i1-GGUF
70B_llama33_stock_unslop-i1-GGUF
Hypnos-i1-8B-i1-GGUF
GUI-Owl-1.5-32B-Instruct-i1-GGUF
DeepSeek-R1-Distill-Qwen-14B-Uncensored-GGUF
static quants of https://huggingface.co/nicoboss/DeepSeek-R1-Distill-Qwen-14B-Uncensored

weighted/imatrix quants are available at https://huggingface.co/mradermacher/DeepSeek-R1-Distill-Qwen-14B-Uncensored-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 5.9 |  |
| GGUF | Q3_K_S | 6.8 |  |
| GGUF | Q3_K_M | 7.4 | lower quality |
| GGUF | Q3_K_L | 8.0 |  |
| GGUF | IQ4_XS | 8.3 |  |
| GGUF | Q4_K_S | 8.7 | fast, recommended |
| GGUF | Q4_K_M | 9.1 | fast, recommended |
| GGUF | Q5_K_S | 10.4 |  |
| GGUF | Q5_K_M | 10.6 |  |
| GGUF | Q6_K | 12.2 | very good quality |
| GGUF | Q8_0 | 15.8 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Mixtral-8x7B-Instruct-v0.1-GGUF
Qwen3.5-35B-A3B-heretic-v2-eq-v1-i1-GGUF
WeirdCompound-v1.7-24b-i1-GGUF
weighted/imatrix quants of https://huggingface.co/FlareRebellion/WeirdCompound-v1.7-24b

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/WeirdCompound-v1.7-24b-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 5.4 | for the desperate |
| GGUF | i1-IQ1_M | 5.9 | mostly desperate |
| GGUF | i1-IQ2_XXS | 6.6 |  |
| GGUF | i1-IQ2_XS | 7.3 |  |
| GGUF | i1-IQ2_S | 7.6 |  |
| GGUF | i1-IQ2_M | 8.2 |  |
| GGUF | i1-Q2_K_S | 8.4 | very low quality |
| GGUF | i1-Q2_K | 9.0 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 9.4 | lower quality |
| GGUF | i1-IQ3_XS | 10.0 |  |
| GGUF | i1-Q3_K_S | 10.5 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 10.5 | beats Q3_K |
| GGUF | i1-IQ3_M | 10.8 |  |
| GGUF | i1-Q3_K_M | 11.6 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 12.5 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 12.9 |  |
| GGUF | i1-Q4_0 | 13.6 | fast, low quality |
| GGUF | i1-Q4_K_S | 13.6 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 14.4 | fast, recommended |
| GGUF | i1-Q4_1 | 15.0 |  |
| GGUF | i1-Q5_K_S | 16.4 |  |
| GGUF | i1-Q5_K_M | 16.9 |  |
| GGUF | i1-Q6_K | 19.4 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
OmniDimen-2-20B-Emotion-i1-GGUF
Qwen2.5-VL-7B-Abliterated-Caption-it-GGUF
static quants of https://huggingface.co/prithivMLmods/Qwen2.5-VL-7B-Abliterated-Caption-it

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Qwen2.5-VL-7B-Abliterated-Caption-it-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | mmproj-Q8_0 | 1.0 | multi-modal supplement |
| GGUF | mmproj-f16 | 1.5 | multi-modal supplement |
| GGUF | Q2_K | 3.1 |  |
| GGUF | Q3_K_S | 3.6 |  |
| GGUF | Q3_K_M | 3.9 | lower quality |
| GGUF | Q3_K_L | 4.2 |  |
| GGUF | IQ4_XS | 4.4 |  |
| GGUF | Q4_K_S | 4.6 | fast, recommended |
| GGUF | Q4_K_M | 4.8 | fast, recommended |
| GGUF | Q5_K_S | 5.4 |  |
| GGUF | Q5_K_M | 5.5 |  |
| GGUF | Q6_K | 6.4 | very good quality |
| GGUF | Q8_0 | 8.2 | fast, best quality |
| GGUF | f16 | 15.3 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Frank-27B-i1-GGUF
Strawberrylemonade-L3-70B-v1.2-heretic2-i1-GGUF
Monika-70B-i1-GGUF
qwen35-122b-memorai-v10-sft-i1-GGUF
Qwen3.5-35B-A3B-Uncensored-Aggressive-safetensors-i1-GGUF
MARTHA-9B-i1-GGUF
zen4-i1-GGUF
Qwen3.5-27B-heretic-v3-i1-GGUF
gpt2-alpaca-gpt4-GGUF
Simsema_Small-4-119B-32226-i1-GGUF
Qwen3.5-9B-heretic-i1-GGUF
gpt-oss-20b-gemini-2.5-pro-distill-i1-GGUF
Qwen2.5-Coder-14B-Abliterated-i1-GGUF
gemma-4-26B-A4B-it-heretic-ara-GGUF
Llama-70B-God-Tier-i1-GGUF
BereavedCompound-v1.0-24b-i1-GGUF
Qwen3.5-9B-ultra-heretic-i1-GGUF
mox-tiny-1-i1-GGUF
gemma-4-31B-it-heretic-GGUF
turkish-llm-14b-instruct-i1-GGUF
Dirty-Muse-Writer-v01-Uncensored-Erotica-NSFW-GGUF
Mars_27B_V.1-i1-GGUF
MiniMax-M2-REAP-162B-A10B-i1-GGUF
Smilodon-9B-v1-i1-GGUF
Gemma3-27B-it-vl-GLM-4.7-Uncensored-Heretic-Deep-Reasoning-i1-GGUF
Poe-8B-GLM5-Opus4.6-Sonnet4.5-Kimi-Grok-Gemini-3-pro-preview-HERETIC-i1-GGUF
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF
MiniMax-M2.1-REAP-50-i1-GGUF
Delorme_1-OCR-7B-Post1.0-i1-GGUF
Goetia-24B-v1.1-i1-GGUF
Qwen-3.5-27B-Derestricted-i1-GGUF
Qwen3-VL-235B-A22B-Thinking-heretic-i1-GGUF
Qwen3-VL-8B-Abliterated-Caption-it-i1-GGUF
weighted/imatrix quants of https://huggingface.co/prithivMLmods/Qwen3-VL-8B-Abliterated-Caption-it

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Qwen3-VL-8B-Abliterated-Caption-it-GGUF

This is a vision model - mmproj files (if any) will be in the static repository.

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 2.2 | for the desperate |
| GGUF | i1-IQ1_M | 2.4 | mostly desperate |
| GGUF | i1-IQ2_XXS | 2.6 |  |
| GGUF | i1-IQ2_XS | 2.8 |  |
| GGUF | i1-IQ2_S | 3.0 |  |
| GGUF | i1-IQ2_M | 3.2 |  |
| GGUF | i1-Q2_K_S | 3.2 | very low quality |
| GGUF | i1-Q2_K | 3.4 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 3.5 | lower quality |
| GGUF | i1-IQ3_XS | 3.7 |  |
| GGUF | i1-Q3_K_S | 3.9 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 3.9 | beats Q3_K |
| GGUF | i1-IQ3_M | 4.0 |  |
| GGUF | i1-Q3_K_M | 4.2 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 4.5 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 4.7 |  |
| GGUF | i1-Q4_0 | 4.9 | fast, low quality |
| GGUF | i1-IQ4_NL | 4.9 | prefer IQ4_XS |
| GGUF | i1-Q4_K_S | 4.9 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 5.1 | fast, recommended |
| GGUF | i1-Q4_1 | 5.3 |  |
| GGUF | i1-Q5_K_S | 5.8 |  |
| GGUF | i1-Q5_K_M | 6.0 |  |
| GGUF | i1-Q6_K | 6.8 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
Hathor_Sofit-L3-8B-v1-GGUF
IoGPT-A1-i1-GGUF
TitanForge-8B-i1-GGUF
metatune-gpt20b-R1.2-i1-GGUF
Atlas-72B-SVT-merged-i1-GGUF
Nemo-Humanities-i1-GGUF
L3-8B-Stheno-v3.2-i1-GGUF
Huihui-Tongyi-DeepResearch-30B-A3B-abliterated-i1-GGUF
Josiefied-Qwen2.5-Coder-7B-Instruct-abliterated-v1-i1-GGUF
Mistral-Nemo-Batman-Venom-i1-GGUF
gemma-3-4b-it-heretic-uncensored-abliterated-Extreme-i1-GGUF
HER-RM-32B-i1-GGUF
Huihui-MiroThinker-v1.0-72B-abliterated-i1-GGUF
Huihui-Qwen3-Coder-30B-A3B-Instruct-abliterated-i1-GGUF
Llama-3.3-8B-Instruct-OmniWriter-i1-GGUF
Llama-3-70B-Instruct-abliterated-v3-i1-GGUF
Precog-24B-v1-i1-GGUF
mistralai_Ministral-3-8B-Instruct-2512-abliterated-i1-GGUF
Qwen3.5-27B_Homebrew-i1-GGUF
Lumimaid-v0.2-70B-heretic-i1-GGUF
Qwen3.5-9B-abliterated-i1-GGUF
MS3.2-PaintedFantasy-v3-24B-i1-GGUF
mox-tiny-1-GGUF
KorReason-35B-Darwin-i1-GGUF
gpt-oss-4B-i1-GGUF
zen4-mini-i1-GGUF
Forsaken-Void-12B-i1-GGUF
PyGenius1F-i1-GGUF
Qwen3-42B-A3B-2507-Thinking-TOTAL-RECALL-v2-Medium-MASTER-CODER-GGUF
static quants of https://huggingface.co/DavidAU/Qwen3-42B-A3B-2507-Thinking-TOTAL-RECALL-v2-Medium-MASTER-CODER

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Qwen3-42B-A3B-2507-Thinking-TOTAL-RECALL-v2-Medium-MASTER-CODER-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 15.7 |  |
| GGUF | Q3_K_S | 18.5 |  |
| GGUF | Q3_K_M | 20.5 | lower quality |
| GGUF | Q3_K_L | 22.1 |  |
| GGUF | IQ4_XS | 23.0 |  |
| GGUF | Q4_K_S | 24.3 | fast, recommended |
| GGUF | Q4_K_M | 25.8 | fast, recommended |
| GGUF | Q5_K_S | 29.3 |  |
| GGUF | Q5_K_M | 30.2 |  |
| GGUF | Q6_K | 34.9 | very good quality |
| GGUF | Q8_0 | 45.2 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
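Because the table above names quant types rather than on-disk filenames, it can help to list the repo's files before downloading. A short sketch, using only huggingface_hub calls that are known to exist; the repo id is taken from this entry.

```python
# Hedged sketch: list the GGUF files actually present in this quant repo.
from huggingface_hub import list_repo_files

files = list_repo_files(
    "mradermacher/Qwen3-42B-A3B-2507-Thinking-TOTAL-RECALL-v2-Medium-MASTER-CODER-GGUF"
)
for name in sorted(files):
    if name.endswith(".gguf"):
        print(name)  # pick the filename matching the quant type you want
```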
Qwen3-42B-A3B-2507-Thinking-Abliterated-uncensored-TOTAL-RECALL-v2-Medium-MASTER-CODER-i1-GGUF
weighted/imatrix quants of https://huggingface.co/DavidAU/Qwen3-42B-A3B-2507-Thinking-Abliterated-uncensored-TOTAL-RECALL-v2-Medium-MASTER-CODER

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Qwen3-42B-A3B-2507-Thinking-Abliterated-uncensored-TOTAL-RECALL-v2-Medium-MASTER-CODER-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.3 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 8.9 | for the desperate |
| GGUF | i1-IQ1_M | 9.8 | mostly desperate |
| GGUF | i1-IQ2_XXS | 11.4 |  |
| GGUF | i1-IQ2_XS | 12.6 |  |
| GGUF | i1-IQ2_S | 12.9 |  |
| GGUF | i1-IQ2_M | 14.1 |  |
| GGUF | i1-Q2_K_S | 14.6 | very low quality |
| GGUF | i1-Q2_K | 15.7 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 16.5 | lower quality |
| GGUF | i1-IQ3_XS | 17.5 |  |
| GGUF | i1-Q3_K_S | 18.5 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 18.5 | beats Q3_K |
| GGUF | i1-IQ3_M | 18.8 |  |
| GGUF | i1-Q3_K_M | 20.5 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 22.1 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 22.8 |  |
| GGUF | i1-Q4_0 | 24.2 | fast, low quality |
| GGUF | i1-Q4_K_S | 24.3 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 25.8 | fast, recommended |
| GGUF | i1-Q4_1 | 26.7 |  |
| GGUF | i1-Q5_K_S | 29.3 |  |
| GGUF | i1-Q5_K_M | 30.2 |  |
| GGUF | i1-Q6_K | 34.9 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
atom-27b-i1-GGUF
AutoGLM-Phone-9B-i1-GGUF
MiniMax-M2.1-REAP-40-i1-GGUF
Crow-4B-Opus-4.6-Distill-Heretic_Qwen3.5-i1-GGUF
GRM2-3b-i1-GGUF
Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-GGUF
LemonKunoichiWizardV3-GGUF
Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated-GGUF
static quants of https://huggingface.co/huihui-ai/Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 11.4 |  |
| GGUF | Q3_K_S | 13.4 |  |
| GGUF | Q3_K_M | 14.8 | lower quality |
| GGUF | Q3_K_L | 16.0 |  |
| GGUF | IQ4_XS | 16.7 |  |
| GGUF | Q4_K_S | 17.6 | fast, recommended |
| GGUF | Q4_K_M | 18.7 | fast, recommended |
| GGUF | Q5_K_S | 21.2 |  |
| GGUF | Q5_K_M | 21.8 |  |
| GGUF | Q6_K | 25.2 | very good quality |
| GGUF | Q8_0 | 32.6 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Melinoe-30B-A3B-Thinking-i1-GGUF
MarsRL-i1-GGUF
zen-vl-30b-instruct-i1-GGUF
tavern-sensei-qwen3.5-35B-A3B-i1-GGUF
gpt-oss-safeguard-20b-kor-enterprise-i1-GGUF
MediumAGI-V2-i1-GGUF
Melinoe-gpt-oss-21B-A3.6B-Diluted-i1-GGUF
Jade-20B-i1-GGUF
Suri-Qwen-3.5-4B-Uncensored-i1-GGUF
Yanfei-v2-SamCool-i1-GGUF
Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-GGUF
GPT-OSS-Swallow-20B-SFT-v0.1-i1-GGUF
Qwen3-VL-8B-Instruct-Heretic-i1-GGUF
gemma-3-uncensored-i1-GGUF
SynthAgent-SFT-UI-TARS-1.5-7B-i1-GGUF
Grok-3-reasoning-gemma3-12B-distilled-HF-GGUF
AuroEtherealKrix-12B-i1-GGUF
gpt-oss-20b-science_full_v1-i1-GGUF
Mira-v1.12.1-27B-i1-GGUF
Total04-DeepSeek-R1-Distill-Llama-70B-heretic-i1-GGUF
Gemma-3-27B-Derestricted-i1-GGUF
Llama-3.2-3B-Instruct-uncensored-GGUF
static quants of https://huggingface.co/chuanli11/Llama-3.2-3B-Instruct-uncensored

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Llama-3.2-3B-Instruct-uncensored-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 1.6 |  |
| GGUF | IQ3_XS | 1.7 |  |
| GGUF | IQ3_S | 1.8 | beats Q3_K |
| GGUF | Q3_K_S | 1.8 |  |
| GGUF | IQ3_M | 1.9 |  |
| GGUF | Q3_K_M | 2.0 | lower quality |
| GGUF | Q3_K_L | 2.1 |  |
| GGUF | IQ4_XS | 2.2 |  |
| GGUF | Q4_K_S | 2.2 | fast, recommended |
| GGUF | Q4_K_M | 2.3 | fast, recommended |
| GGUF | Q5_K_S | 2.6 |  |
| GGUF | Q5_K_M | 2.7 |  |
| GGUF | Q6_K | 3.1 | very good quality |
| GGUF | Q8_0 | 3.9 | fast, best quality |
| GGUF | f16 | 7.3 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Irix-12B-Model_Stock-i1-GGUF
gemma-3-1b-it-heretic-extreme-uncensored-abliterated-i1-GGUF
llama-joycaption-beta-one-hf-llava-GGUF
BlenderCartel-llama33-70B-Pt1-i1-GGUF
Ministral-8B-Instruct-2410-sft-i1-GGUF
gpt-oss-20b-Derestricted-i1-GGUF
Qwen3.5-4B_Homebrew-i1-GGUF
SEALION-it-Lafaek-8B-ococosda-i1-GGUF
GPT-OSS-Swallow-120B-RL-v0.1-GGUF
atom-80b-i1-GGUF
Qwen3.5-27B-heretic-GGUF
Huihui-GLM-4.5V-abliterated-i1-GGUF
Step-3.5-Flash-REAP-149B-A11B-i1-GGUF
Suri-Qwen-3.5-9B-Uncensored-i1-GGUF
Mistral-Nemo-Instruct-2407-absolute-heresy-i1-GGUF
abirdus-12b-instruct-s0-i1-GGUF
SVD-Qwen3-Coder-Next-Thinking-i1-GGUF
seed-oss-36b-chat-i1-GGUF
Ken1.0-67B-i1-GGUF
MARTHA-73B-Qwen2-VL-i1-GGUF
XortronCriminalComputingConfig-i1-GGUF
Austral-24b-GRPO-i1-GGUF
L3-SthenoMaidBlackroot-8B-V1-GGUF
SP-7B-i1-GGUF
llama4-dolphin-8B-GGUF
Qwen3-Next-80B-A3B-Instruct-i1-GGUF
MiMo-V2-Flash-i1-GGUF
Sapphira-L3.3-70b-0.2-GGUF
static quants of https://huggingface.co/BruhzWater/Sapphira-L3.3-70b-0.2

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Sapphira-L3.3-70b-0.2-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 26.5 |  |
| GGUF | Q3_K_S | 31.0 |  |
| GGUF | Q3_K_M | 34.4 | lower quality |
| GGUF | Q3_K_L | 37.2 |  |
| GGUF | IQ4_XS | 38.4 |  |
| GGUF | Q4_K_S | 40.4 | fast, recommended |
| GGUF | Q4_K_M | 42.6 | fast, recommended |
| GGUF | Q5_K_S | 48.8 |  |
| GGUF | Q5_K_M | 50.0 |  |
| PART 1 PART 2 | Q6_K | 58.0 | very good quality |
| PART 1 PART 2 | Q8_0 | 75.1 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
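The "PART 1 PART 2" rows above are quants split into plain byte-level parts that must be joined back into a single .gguf before use; this is what the note about concatenating multi-part files refers to. A hedged sketch follows: the ".part1of2"-style naming is an assumption, so match whatever part names the repo actually lists.

```python
# Hedged sketch: stream multi-part GGUF pieces back into one file without loading them in RAM.
import shutil

parts = [
    "Sapphira-L3.3-70b-0.2.Q6_K.gguf.part1of2",   # assumed part filenames -- check the repo
    "Sapphira-L3.3-70b-0.2.Q6_K.gguf.part2of2",
]

with open("Sapphira-L3.3-70b-0.2.Q6_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)   # append each part in order
```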
Luna-Qwen3.5-27B-v5-i1-GGUF
glm4.1v-9b-base-sft-i1-GGUF
weighted/imatrix quants of https://huggingface.co/bountyhunterxx/glm4.1v-9b-base-sft

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/glm4.1v-9b-base-sft-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 3.2 | for the desperate |
| GGUF | i1-IQ1_M | 3.3 | mostly desperate |
| GGUF | i1-IQ2_XXS | 3.5 |  |
| GGUF | i1-IQ2_XS | 3.7 |  |
| GGUF | i1-IQ2_S | 3.9 |  |
| GGUF | i1-IQ2_M | 4.0 |  |
| GGUF | i1-Q2_K_S | 4.1 | very low quality |
| GGUF | i1-Q2_K | 4.1 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 4.3 | lower quality |
| GGUF | i1-IQ3_XS | 4.5 |  |
| GGUF | i1-Q3_K_S | 4.7 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 4.7 | beats Q3_K |
| GGUF | i1-IQ3_M | 4.8 |  |
| GGUF | i1-Q3_K_M | 5.1 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 5.3 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 5.4 |  |
| GGUF | i1-IQ4_NL | 5.6 | prefer IQ4_XS |
| GGUF | i1-Q4_0 | 5.6 | fast, low quality |
| GGUF | i1-Q4_K_S | 5.9 | optimal size/speed/quality |
| GGUF | i1-Q4_1 | 6.1 |  |
| GGUF | i1-Q4_K_M | 6.3 | fast, recommended |
| GGUF | i1-Q5_K_S | 6.8 |  |
| GGUF | i1-Q5_K_M | 7.2 |  |
| GGUF | i1-Q6_K | 8.4 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
TranslateGemma-4B-i1-GGUF
Poe-8b-TOP10-Distill-Heretic-Full-i1-GGUF
INTELLECT-3V-i1-GGUF
Qwen3.5-9B-Casual-Thinker-i1-GGUF
Llama-3.3-70B-Instruct-heretic-i1-GGUF
Gemma-2-Ataraxy-v4d-9B-i1-GGUF
amoral-gemma3-12B-v1-i1-GGUF
NVIDIA-Nemotron-3-Super-120B-A12B-BF16-heretic-i1-GGUF
Fimbulvetr-11B-v2-GGUF
GLM-4.7-REAP-268B-A32B-i1-GGUF
XORTRON.CriminalComputing.Q35xC46-i1-GGUF
Huihui-Qwen3-4B-Instruct-2507-abliterated-GGUF
HER-32B-i1-GGUF
Covenant72B-ChatML-bf16-i1-GGUF
MiMo-V2-Flash-Base-i1-GGUF
Diver-GroupRank-7B-i1-GGUF
GlotMAX-101-14B-i1-GGUF
brayniac-Qwen3.5-27B-heretic-i1-GGUF
zen4-coder-pro-i1-GGUF
AgentDoG-FG-Llama3.1-8B-i1-GGUF
Qwen3-30B-A3B-YOYO-AutoThink-i1-GGUF
Step-3.5-Flash-REAP-121B-A11B-i1-GGUF
Hulu-Med-235A22-i1-GGUF
Qwen3.5-27B-heretic-v3-GGUF
Holo2-235B-A22B-i1-GGUF
Luna-Qwen3.5-4B-v5-i1-GGUF
Arjuna-8B-i1-GGUF
Olmo-3-32B-Think-i1-GGUF
nemotron-medical-tuned-70b-i1-GGUF
Huihui-MiroThinker-v1.0-30B-abliterated-i1-GGUF
Qwen3-4B-Claude-Sonnet-4-Reasoning-Distill-Safetensor-GGUF
static quants of https://huggingface.co/Liontix/Qwen3-4B-Claude-Sonnet-4-Reasoning-Distill-Safetensor

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants seem not to be available (by me) at this time. If they do not show up a week or so after the static ones, I have probably not planned for them. Feel free to request them by opening a Community Discussion.

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 1.8 |  |
| GGUF | Q3_K_S | 2.0 |  |
| GGUF | Q3_K_M | 2.2 | lower quality |
| GGUF | Q3_K_L | 2.3 |  |
| GGUF | IQ4_XS | 2.4 |  |
| GGUF | Q4_K_S | 2.5 | fast, recommended |
| GGUF | Q4_K_M | 2.6 | fast, recommended |
| GGUF | Q5_K_S | 2.9 |  |
| GGUF | Q5_K_M | 3.0 |  |
| GGUF | Q6_K | 3.4 | very good quality |
| GGUF | Q8_0 | 4.4 | fast, best quality |
| GGUF | f16 | 8.2 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Meta-Llama-3.1-70B-Instruct-Malaysian-i1-GGUF
Gliese-OCR-7B-Post2.0-final-i1-GGUF
weighted/imatrix quants of https://huggingface.co/prithivMLmods/Gliese-OCR-7B-Post2.0-final

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Gliese-OCR-7B-Post2.0-final-GGUF

This is a vision model - mmproj files (if any) will be in the static repository.

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 2.0 | for the desperate |
| GGUF | i1-IQ1_M | 2.1 | mostly desperate |
| GGUF | i1-IQ2_XXS | 2.4 |  |
| GGUF | i1-IQ2_XS | 2.6 |  |
| GGUF | i1-IQ2_S | 2.7 |  |
| GGUF | i1-IQ2_M | 2.9 |  |
| GGUF | i1-Q2_K_S | 2.9 | very low quality |
| GGUF | i1-Q2_K | 3.1 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 3.2 | lower quality |
| GGUF | i1-IQ3_XS | 3.4 |  |
| GGUF | i1-Q3_K_S | 3.6 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 3.6 | beats Q3_K |
| GGUF | i1-IQ3_M | 3.7 |  |
| GGUF | i1-Q3_K_M | 3.9 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 4.2 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 4.3 |  |
| GGUF | i1-IQ4_NL | 4.5 | prefer IQ4_XS |
| GGUF | i1-Q4_0 | 4.5 | fast, low quality |
| GGUF | i1-Q4_K_S | 4.6 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 4.8 | fast, recommended |
| GGUF | i1-Q4_1 | 5.0 |  |
| GGUF | i1-Q5_K_S | 5.4 |  |
| GGUF | i1-Q5_K_M | 5.5 |  |
| GGUF | i1-Q6_K | 6.4 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
Famino-12B-Model_Stock-i1-GGUF
HERETICSEEK-7B-Ditill-i1-GGUF
Skyfall-31B-v4.1-heretic2-i1-GGUF
Broken-Tutu-24B-Unslop-v2.0-GGUF
Qwen2.5-VL-7B-NSFW-Caption-V4-GGUF
static quants of https://huggingface.co/thesby/Qwen2.5-VL-7B-NSFW-Caption-V4

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Qwen2.5-VL-7B-NSFW-Caption-V4-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | mmproj-Q8_0 | 0.8 | multi-modal supplement |
| GGUF | mmproj-f16 | 1.5 | multi-modal supplement |
| GGUF | Q2_K | 3.2 |  |
| GGUF | Q3_K_S | 3.6 |  |
| GGUF | Q3_K_M | 4.0 | lower quality |
| GGUF | Q3_K_L | 4.2 |  |
| GGUF | IQ4_XS | 4.4 |  |
| GGUF | Q4_K_S | 4.6 | fast, recommended |
| GGUF | Q4_K_M | 4.8 | fast, recommended |
| GGUF | Q5_K_S | 5.4 |  |
| GGUF | Q5_K_M | 5.5 |  |
| GGUF | Q6_K | 6.4 | very good quality |
| GGUF | Q8_0 | 8.2 | fast, best quality |
| GGUF | f16 | 15.4 | 16 bpw, overkill |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Chaos-Unknown-12b-i1-GGUF
70B_neolithic_rabbit-i1-GGUF
atom-80b-GGUF
The_Creeping_Darkness-X2-16B-i1-GGUF
Llama-3.2-3B-Instruct-heretic-ablitered-uncensored-i1-GGUF
Llama-3-Swallow-8B-Instruct-v0.1-kokoroe-i1-GGUF
Qwen3-Next-80B-A3B-Thinking-i1-GGUF
70B_Triage-i1-GGUF
Qwen2.5-Coder-7B-Abliterated-i1-GGUF
Qwen3-VL-REAP-145B-A22B-i1-GGUF
Harmonic-27B-i1-GGUF
Qwen3.5-Antirep-27B-i1-GGUF
Qwen3-Next-416E-Abliterated-Instruct-i1-GGUF
MiniMax-M2.1-REAP-40-GGUF
SEX_ROLEPLAY-3.2-1B-i1-GGUF
Broken-Tutu-24B-Transgression-v2.0-i1-GGUF
Qwen3-VL-Reranker-8B-GGUF
MiniMax-M2.1-REAP-139B-A10B-i1-GGUF
Scarlet-Seraph-12B-i1-GGUF
Qwen3-Next-80B-A3B-Thinking-GRPO-Uncensored-i1-GGUF
Qwen3-0.6B-Qrazy-Qoder-i1-GGUF
Suri-Qwen-3.5-4B-Uncensored-Low-i1-GGUF
aidc-llm-laos-12b-i1-GGUF
NVIDIA-Nemotron-3-Super-120B-A12B-BF16-i1-GGUF
14B-Qwen2.5-Kunou-v1-GGUF
Solar-Open-100B-i1-GGUF
Olmo-3-7B-RLZero-Mix-i1-GGUF
Llama3.2-24B-A3B-II-Dark-Champion-INSTRUCT-Heretic-Abliterated-Uncensored-i1-GGUF
Qwen2.5-VL-7B-Instruct-GGUF
Emerald-Wyvern-12B-i1-GGUF
Olmo-3-7B-Think-i1-GGUF
gemma-3-27b-it-heretic-v2-i1-GGUF
Qwen3-VL-8B-Instruct-abliterated-v2.0-i1-GGUF
Qwen3.5-4B-heretic-GGUF
OpenAI-gpt-oss-20B-INSTRUCT-Heretic-Uncensored-MXFP4-GGUF
Qwen3.5-27B-heretic-v2-i1-GGUF
DECS_7B-i1-GGUF
Step-3.5-Flash-REAP-128B-A11B-i1-GGUF
Magistry-24B-v1.1-i1-GGUF
wraith-8b-i1-GGUF
weighted/imatrix quants of https://huggingface.co/vanta-research/wraith-8b

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/wraith-8b-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 2.1 | for the desperate |
| GGUF | i1-IQ1_M | 2.3 | mostly desperate |
| GGUF | i1-IQ2_XXS | 2.5 |  |
| GGUF | i1-IQ2_XS | 2.7 |  |
| GGUF | i1-IQ2_S | 2.9 |  |
| GGUF | i1-IQ2_M | 3.0 |  |
| GGUF | i1-Q2_K_S | 3.1 | very low quality |
| GGUF | i1-Q2_K | 3.3 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 3.4 | lower quality |
| GGUF | i1-IQ3_XS | 3.6 |  |
| GGUF | i1-Q3_K_S | 3.8 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 3.8 | beats Q3_K |
| GGUF | i1-IQ3_M | 3.9 |  |
| GGUF | i1-Q3_K_M | 4.1 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 4.4 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 4.5 |  |
| GGUF | i1-Q4_0 | 4.8 | fast, low quality |
| GGUF | i1-IQ4_NL | 4.8 | prefer IQ4_XS |
| GGUF | i1-Q4_K_S | 4.8 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 5.0 | fast, recommended |
| GGUF | i1-Q4_1 | 5.2 |  |
| GGUF | i1-Q5_K_S | 5.7 |  |
| GGUF | i1-Q5_K_M | 5.8 |  |
| GGUF | i1-Q6_K | 6.7 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
KQ_Omni-12B-v1-i1-GGUF
Deepseek-R1-Distill-NSFW-RPv1-GGUF
ASTRA-14B-Thinking-v1-i1-GGUF
Poe-8B-GLM5-Opus4.6-Sonnet4.5-Kimi-Grok-Gemini-3-pro-preview-HERETIC-GGUF
OpenELM-3B-Instruct-GGUF
Qwen_Uncensored-i1-GGUF
hito-1.7b-i1-GGUF
survey-bot-qwen3-vl-32b-i1-GGUF
ASID-Captioner-7B-i1-GGUF
Qwen3-Coder-Next-Base-i1-GGUF
MathSmith-DS-Qwen-7B-LongCoT-i1-GGUF
DeepSeek-V2-Lite-Chat-Uncensored-Unbiased-Reasoner-GGUF
Stellar-Umbra-12B-i1-GGUF
Hulu-Med-30A3-i1-GGUF
L3.2-3B-Herthea-i1-GGUF
GUI-Owl-1.5-8B-Think-i1-GGUF
Trinity-Mini-Base-i1-GGUF
GLM-4.7-Flash-REAP-23B-A3B-absolute-heresy-i1-GGUF
Llama-3.1-EstLLM-8B-0525-i1-GGUF
Tankie-DPE-12b-SFT-i1-GGUF
olmo-v2-stage3-lexifreak-heretic-v1-i1-GGUF
gemma3-27b-abliterated-dpo-i1-GGUF
weighted/imatrix quants of https://huggingface.co/summykai/gemma3-27b-abliterated-dpo

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/gemma3-27b-abliterated-dpo-GGUF

This is a vision model - mmproj files (if any) will be in the static repository.

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | i1-IQ1_S | 6.4 | for the desperate |
| GGUF | i1-IQ1_M | 6.9 | mostly desperate |
| GGUF | i1-IQ2_XXS | 7.8 |  |
| GGUF | i1-IQ2_XS | 8.5 |  |
| GGUF | i1-IQ2_S | 8.9 |  |
| GGUF | i1-IQ2_M | 9.6 |  |
| GGUF | i1-Q2_K_S | 9.9 | very low quality |
| GGUF | i1-Q2_K | 10.6 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 10.8 | lower quality |
| GGUF | i1-IQ3_XS | 11.7 |  |
| GGUF | i1-IQ3_S | 12.3 | beats Q3_K |
| GGUF | i1-Q3_K_S | 12.3 | IQ3_XS probably better |
| GGUF | i1-IQ3_M | 12.6 |  |
| GGUF | i1-Q3_K_M | 13.5 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 14.6 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 14.9 |  |
| GGUF | i1-Q4_0 | 15.7 | fast, low quality |
| GGUF | i1-Q4_K_S | 15.8 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 16.6 | fast, recommended |
| GGUF | i1-Q4_1 | 17.3 |  |
| GGUF | i1-Q5_K_S | 18.9 |  |
| GGUF | i1-Q5_K_M | 19.4 |  |
| GGUF | i1-Q6_K | 22.3 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
Mira-v1.16-Ties-27B-i1-GGUF
Chimera-DeepSeek-NSFW-8B-GGUF
Hermes-4-70B-heretic-i1-GGUF
Nemo-2407-Based-Instruct-DeLERP-0.7-12B-i1-GGUF
L3.1-Apluv3-8B-i1-GGUF
proxima-ocr-d.markdown-post3.0.l-i1-GGUF
apertus-12b-healed-s0-i1-GGUF
NemoMix-Unleashed-12B-i1-GGUF
Olmo-3-32B-Think-SFT-i1-GGUF
Llama3.2-30B-A3B-II-Dark-Champion-INSTRUCT-Heretic-Abliterated-Uncensored-i1-GGUF
DeepSeek-R1-Distill-Llama-8B-Abliterated-i1-GGUF
RP-king-12b-i1-GGUF
Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF
Qwopus-MoE-35B-A3B-i1-GGUF
DR-Tulu-8B-i1-GGUF
mox-small-1-i1-GGUF
maya1-i1-GGUF
PE-Type-1-Vera-4B-GGUF
mini-magnum-12b-v1.1-GGUF
Sunlit-Shadow-12B-i1-GGUF
Llama3.1-DeepDilemma-V1-8B-i1-GGUF
weighted/imatrix quants of https://huggingface.co/Yuma42/Llama3.1-DeepDilemma-V1-8B

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Llama3.1-DeepDilemma-V1-8B-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 2.1 | for the desperate |
| GGUF | i1-IQ1_M | 2.3 | mostly desperate |
| GGUF | i1-IQ2_XXS | 2.5 |  |
| GGUF | i1-IQ2_XS | 2.7 |  |
| GGUF | i1-IQ2_S | 2.9 |  |
| GGUF | i1-IQ2_M | 3.0 |  |
| GGUF | i1-Q2_K_S | 3.1 | very low quality |
| GGUF | i1-Q2_K | 3.3 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 3.4 | lower quality |
| GGUF | i1-IQ3_XS | 3.6 |  |
| GGUF | i1-Q3_K_S | 3.8 | IQ3_XS probably better |
| GGUF | i1-IQ3_S | 3.8 | beats Q3_K |
| GGUF | i1-IQ3_M | 3.9 |  |
| GGUF | i1-Q3_K_M | 4.1 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 4.4 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 4.5 |  |
| GGUF | i1-Q4_0 | 4.8 | fast, low quality |
| GGUF | i1-IQ4_NL | 4.8 | prefer IQ4_XS |
| GGUF | i1-Q4_K_S | 4.8 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 5.0 | fast, recommended |
| GGUF | i1-Q4_1 | 5.2 |  |
| GGUF | i1-Q5_K_S | 5.7 |  |
| GGUF | i1-Q5_K_M | 5.8 |  |
| GGUF | i1-Q6_K | 6.7 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
DeepSeek-R1-Distill-Qwen-14B-abliterated-i1-GGUF
ARC-Base-8B-i1-GGUF
Qwen3.5-4B-Claude-Opus-Reasoning-i1-GGUF
MN-CaptainErisNebula-12B-Chimera-v1.1-heretic-uncensored-abliterated-i1-GGUF
Lokis_Veil-8B-i1-GGUF
VITAL-7B-i1-GGUF
littlemonster-reasoning-12B-QKVO-heretic-HF-i1-GGUF
OctoThinker-8B-Long-Base-i1-GGUF
Monika-12B-i1-GGUF
EtherealKrix-12B-i1-GGUF
Llama3.1-70B-Chinese-Chat-GGUF
Qwen3-30B-A3B-abliterated-erotic-i1-GGUF
OLMo-2-1124-13B-Instruct-32k-Context-ChatML-i1-GGUF
Cydonia-v4.1-MS3.2-Magnum-Diamond-24B-i1-GGUF
The_Croupier-3.2-1B-i1-GGUF
ALIA-40b-instruct-2601-i1-GGUF
DynamicRAG-8B-i1-GGUF
Gemma-3-4B-THINKING-i1-GGUF
Huihui-Qwen3-Coder-30B-A3B-Instruct-abliterated-GGUF
static quants of https://huggingface.co/huihui-ai/Huihui-Qwen3-Coder-30B-A3B-Instruct-abliterated

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Huihui-Qwen3-Coder-30B-A3B-Instruct-abliterated-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q2_K | 11.4 |  |
| GGUF | Q3_K_S | 13.4 |  |
| GGUF | Q3_K_M | 14.8 | lower quality |
| GGUF | Q3_K_L | 16.0 |  |
| GGUF | IQ4_XS | 16.7 |  |
| GGUF | Q4_K_S | 17.6 | fast, recommended |
| GGUF | Q4_K_M | 18.7 | fast, recommended |
| GGUF | Q5_K_S | 21.2 |  |
| GGUF | Q5_K_M | 21.8 |  |
| GGUF | Q6_K | 25.2 | very good quality |
| GGUF | Q8_0 | 32.6 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
VibeThinker-1.5B-i1-GGUF
Ministral-3-3B-Base-2512-i1-GGUF
Monika-24B-i1-GGUF
GeneralChat-Llama3.2-3B-DPO-i1-GGUF
nova-jais-2-70b-v2-i1-GGUF
MiniMax-M2-THRIFT-i1-GGUF
weighted/imatrix quants of https://huggingface.co/VibeStudio/MiniMax-M2-THRIFT

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/MiniMax-M2-THRIFT-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.5 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1_S | 35.3 | for the desperate |
| GGUF | i1-IQ1_M | 39.1 | mostly desperate |
| GGUF | i1-IQ2_XXS | 45.5 |  |
| PART 1 PART 2 | i1-IQ2_XS | 50.7 |  |
| PART 1 PART 2 | i1-IQ2_S | 51.6 |  |
| PART 1 PART 2 | i1-IQ2_M | 56.7 |  |
| PART 1 PART 2 | i1-Q2_K_S | 58.7 | very low quality |
| PART 1 PART 2 | i1-Q2_K | 63.0 | IQ3_XXS probably better |
| PART 1 PART 2 | i1-IQ3_XXS | 66.5 | lower quality |
| PART 1 PART 2 | i1-IQ3_XS | 70.6 |  |
| PART 1 PART 2 | i1-Q3_K_S | 74.6 | IQ3_XS probably better |
| PART 1 PART 2 | i1-IQ3_S | 74.6 | beats Q3_K |
| PART 1 PART 2 | i1-IQ3_M | 75.6 |  |
| PART 1 PART 2 | i1-Q3_K_M | 82.6 | IQ3_S probably better |
| PART 1 PART 2 | i1-Q3_K_L | 89.4 | IQ3_M probably better |
| PART 1 PART 2 | i1-IQ4_XS | 92.1 |  |
| PART 1 PART 2 | i1-Q4_0 | 97.8 | fast, low quality |
| PART 1 PART 2 | i1-Q4_K_S | 98.2 | optimal size/speed/quality |
| PART 1 PART 2 PART 3 | i1-Q4_K_M | 104.5 | fast, recommended |
| PART 1 PART 2 PART 3 | i1-Q4_1 | 108.2 |  |
| PART 1 PART 2 PART 3 | i1-Q5_K_S | 118.9 |  |
| PART 1 PART 2 PART 3 | i1-Q5_K_M | 122.5 |  |
| PART 1 PART 2 PART 3 | i1-Q6_K | 141.7 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
Qwen3.5-9B-ultra-heretic-GGUF
Big-Tiger-Gemma-27B-v3-heretic-i1-GGUF
Qwen3.5-27B-ultimate-heretic-i1-GGUF
Ministral-3-8B-Reasoning-2512-i1-GGUF
HereticAggressive-CoT-i1-GGUF
aquif-Spatial-7B-i1-GGUF
abirdus-12b-instruct-i1-GGUF
Qwen3-32B-Uncensored-GGUF
magnum-v4-9b-abliterated-i1-GGUF
Esperpento-1B-i1-GGUF
Qwen3.5-9B-Claude-4.6-OS-INSTRUCT-i1-GGUF
GLM-4.5-Architect-106B-A12B-i1-GGUF
mox-small-1-GGUF
Nomi-1.0-3b-i1-GGUF
Precog-24B-v1-heretic-i1-GGUF
Llama3.1-DeluXeOne-8B-i1-GGUF
Seed-OSS-36B-Instruct-MPOA-v1-i1-GGUF
Olmo-3-7B-Instruct-DPO-i1-GGUF
Jackdaw-30B-A3B-i1-GGUF
WeirdDolphinPersonalityMechanism-Mistral-24B-i1-GGUF
CoPaw-Flash-9B-GGUF
Dans-PersonalityEngine-V1.3.0-24b-i1-GGUF
Qwen3-30B-A3B-Thinking-2507-Gemini-2.5-Flash-Distill-i1-GGUF
Special-Virus-3.2-1B-i1-GGUF
TildeOpen-30b-ENLV-ChatML-instruct-i1-GGUF
Fara-7B-i1-GGUF
Mistral-Small-3_2-24B-Instruct-2506-antislop.v2-i1-GGUF
JOSIE-4B-Thinking-i1-GGUF
GlotMAX-101-8B-i1-GGUF
Qwen3-VL-8B-Interleave-Thinking-i1-GGUF
GigaChat-20B-A3B-instruct-bf16-i1-GGUF
Qwen2.5-VL-7B-V1-i1-GGUF
Bakti-8B-Base-i1-GGUF
Qwen2.5-32B-Cyberpunk-Storyteller-v2-i1-GGUF
Olmo-3-7B-Instruct-SFT-i1-GGUF
Llama-3.3-70B-Instruct-abliterated-v2-GGUF
Huihui-Qwen3-Next-80B-A3B-Instruct-abliterated-GGUF
ALIA-40b-i1-GGUF
Ahma-2-4B-Instruct-i1-GGUF
DeepSeek-V3.1-Nex-N1.1-i1-GGUF
Qwen2.5-7B-Kids-SciFi-i1-GGUF
The_Darkside-16.6B-i1-GGUF
Broken-Tutu-24B-Unslop-v2.0-i1-GGUF
sundae-v716-generate-direct-4b-i1-GGUF
SAI-DeepCoder-14B-Preview-unsloth-v1.0-i1-GGUF
Cicikus_v2_3B-i1-GGUF
PG67A-W-Serum.Test-3.2-1B-i1-GGUF
Gemma-4-31B-Cognitive-Unshackled-GGUF
Qwen3-VL-32B-Instruct-abliterated-v1-i1-GGUF
Unbound-v1.12.0-27B-i1-GGUF
Cicikus-v3-1.4B-i1-GGUF
CodeV-R1-Qwen-7B-i1-GGUF
Qwen-SEA-LION-v4-4B-VL-Magic_decensored-i1-GGUF
Ministral-3-8B-Instruct-2512-tainted-heresy-i1-GGUF
reactor-ai-20b-i1-GGUF
Fyodor-Q3-8B-Instruct-i1-GGUF
sundae-v716-update-direct-4b-i1-GGUF
ClinAligh-4B-i1-GGUF
Hunyuan-MT-Chimera-7B-i1-GGUF
weighted/imatrix quants of https://huggingface.co/tencent/Hunyuan-MT-Chimera-7B

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Hunyuan-MT-Chimera-7B-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1S | 1.9 | for the desperate |
| GGUF | i1-IQ1M | 2.1 | mostly desperate |
| GGUF | i1-IQ2XXS | 2.3 | |
| GGUF | i1-IQ2XS | 2.5 | |
| GGUF | i1-IQ2S | 2.6 | |
| GGUF | i1-IQ2M | 2.8 | |
| GGUF | i1-Q2KS | 2.9 | very low quality |
| GGUF | i1-Q2K | 3.1 | IQ3XXS probably better |
| GGUF | i1-IQ3XXS | 3.1 | lower quality |
| GGUF | i1-IQ3XS | 3.4 | |
| GGUF | i1-Q3KS | 3.5 | IQ3XS probably better |
| GGUF | i1-IQ3S | 3.6 | beats Q3K |
| GGUF | i1-IQ3M | 3.7 | |
| GGUF | i1-Q3KM | 3.9 | IQ3S probably better |
| GGUF | i1-Q3KL | 4.2 | IQ3M probably better |
| GGUF | i1-IQ4XS | 4.3 | |
| GGUF | i1-Q40 | 4.5 | fast, low quality |
| GGUF | i1-IQ4NL | 4.5 | prefer IQ4XS |
| GGUF | i1-Q4KS | 4.5 | optimal size/speed/quality |
| GGUF | i1-Q4KM | 4.7 | fast, recommended |
| GGUF | i1-Q41 | 4.9 | |
| GGUF | i1-Q5KS | 5.3 | |
| GGUF | i1-Q5KM | 5.5 | |
| GGUF | i1-Q6K | 6.3 | practically like static Q6K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
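The imatrix row at the top of the table is not a model: it is the importance-matrix data used to weight the quantization, provided so you can create your own quants. Below is a minimal sketch of feeding it to llama.cpp's llama-quantize tool via subprocess, assuming the binary is on PATH and a full-precision GGUF of the base model is available locally; all filenames are hypothetical.

```python
# A minimal sketch of re-quantizing with the provided imatrix file, assuming
# llama.cpp's llama-quantize binary is on PATH and a full-precision GGUF of the
# base model exists locally. All filenames below are hypothetical.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "--imatrix", "Hunyuan-MT-Chimera-7B.imatrix",  # the imatrix file from this repo (hypothetical name)
        "Hunyuan-MT-Chimera-7B.f16.gguf",              # hypothetical full-precision source GGUF
        "Hunyuan-MT-Chimera-7B.IQ3_M.gguf",            # output quant to create
        "IQ3_M",                                       # target quant type
    ],
    check=True,  # raise if llama-quantize exits with an error
)
```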
llama3-daybreak-lumimaid0.1-8b-hf-GGUF
SADeepCoder-14B-Preview-unsloth-v1.0-i1-GGUF
Qwen3-4B-Thinking-2507-Claude-4.5-Opus-High-Reasoning-Distill-i1-GGUF
DR-Tulu-SFT-8B-i1-GGUF
DiStil-Qwen3-1.7B-uncensored-i1-GGUF
Broken-Tutu-24B-Transgression-v2.0-GGUF
Qwen3-VL-8B-Medical-Extraction-i1-GGUF
Olmo-3-7B-RL-Zero-IF-i1-GGUF
AutoGLM-Phone-9B-Multilingual-i1-GGUF
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-heretic-GGUF
Huihui-Qwen3-VL-32B-Instruct-abliterated-GGUF
static quants of https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Huihui-Qwen3-VL-32B-Instruct-abliterated-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | mmproj-Q80 | 0.9 | multi-modal supplement |
| GGUF | mmproj-f16 | 1.3 | multi-modal supplement |
| GGUF | Q2K | 12.4 | |
| GGUF | Q3KS | 14.5 | |
| GGUF | Q3KM | 16.1 | lower quality |
| GGUF | Q3KL | 17.4 | |
| GGUF | IQ4XS | 18.0 | |
| GGUF | Q4KS | 18.9 | fast, recommended |
| GGUF | Q4KM | 19.9 | fast, recommended |
| GGUF | Q5KS | 22.7 | |
| GGUF | Q5KM | 23.3 | |
| GGUF | Q6K | 27.0 | very good quality |
| GGUF | Q80 | 34.9 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
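The two mmproj rows are multi-modal supplements: the vision projector that is loaded alongside the main model GGUF. Below is a minimal sketch of wiring the pair together with llama-cpp-python; the Llava-style chat handler is used purely to illustrate where the mmproj file plugs in, and both the handler choice for this model family and the filenames are assumptions to verify locally.

```python
# A minimal sketch, assuming llama-cpp-python plus local copies of the main GGUF
# and the mmproj supplement. Llava15ChatHandler is only an illustration of how an
# mmproj/clip file is attached; the correct handler for this model family and the
# exact filenames are assumptions.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(
    clip_model_path="Huihui-Qwen3-VL-32B-Instruct-abliterated.mmproj-f16.gguf",  # hypothetical name
)
llm = Llama(
    model_path="Huihui-Qwen3-VL-32B-Instruct-abliterated.Q4_K_S.gguf",  # hypothetical name
    chat_handler=chat_handler,
    n_ctx=4096,
)

resp = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }]
)
print(resp["choices"][0]["message"]["content"])
```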
SynLogic-7B-i1-GGUF
Qwen3-15B-A2B-Base-i1-GGUF
Sunflower-32B-ultravox-merged-ft-salt-instruct-i1-GGUF
Qwen3.5-27B-heretic-v2-GGUF
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-heretic-v2-i1-GGUF
SynLogic-Mix-3-32B-i1-GGUF
Clado-BrowserOS-Action-i1-GGUF
Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-i1-GGUF
weighted/imatrix quants of https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF

This is a vision model - mmproj files (if any) will be in the static repository.

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.2 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1S | 6.5 | for the desperate |
| GGUF | i1-IQ1M | 7.2 | mostly desperate |
| GGUF | i1-IQ2XXS | 8.3 | |
| GGUF | i1-IQ2XS | 9.2 | |
| GGUF | i1-IQ2S | 9.4 | |
| GGUF | i1-IQ2M | 10.3 | |
| GGUF | i1-Q2KS | 10.6 | very low quality |
| GGUF | i1-Q2K | 11.4 | IQ3XXS probably better |
| GGUF | i1-IQ3XXS | 11.9 | lower quality |
| GGUF | i1-IQ3XS | 12.7 | |
| GGUF | i1-Q3KS | 13.4 | IQ3XS probably better |
| GGUF | i1-IQ3S | 13.4 | beats Q3K |
| GGUF | i1-IQ3M | 13.6 | |
| GGUF | i1-Q3KM | 14.8 | IQ3S probably better |
| GGUF | i1-Q3KL | 16.0 | IQ3M probably better |
| GGUF | i1-IQ4XS | 16.5 | |
| GGUF | i1-Q40 | 17.5 | fast, low quality |
| GGUF | i1-Q4KS | 17.6 | optimal size/speed/quality |
| GGUF | i1-Q4KM | 18.7 | fast, recommended |
| GGUF | i1-Q41 | 19.3 | |
| GGUF | i1-Q5KS | 21.2 | |
| GGUF | i1-Q5KM | 21.8 | |
| GGUF | i1-Q6K | 25.2 | practically like static Q6K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
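Since, as noted above, the mmproj files for this vision model live in the static repository while the imatrix quants live here, a download typically touches both repos. Below is a minimal sketch with huggingface_hub; both filenames are hypothetical and should be checked against each repository's file list.

```python
# A minimal sketch of pulling the i1 quant and its mmproj supplement from the two
# sibling repositories; both filenames are hypothetical.
from huggingface_hub import hf_hub_download

quant_path = hf_hub_download(
    repo_id="mradermacher/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-i1-GGUF",
    filename="Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated.i1-Q4_K_S.gguf",  # hypothetical
)
mmproj_path = hf_hub_download(
    repo_id="mradermacher/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF",
    filename="Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated.mmproj-f16.gguf",  # hypothetical
)
print(quant_path, mmproj_path)
```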
actio-ui-7b-sft-i1-GGUF
Lang2Act-7B-i1-GGUF
LFM2-24B-A2B-abliterated-i1-GGUF
meteor-v4-2048-i1-GGUF
gpt-oss-20b-gemini-2.5-pro-distill-GGUF
static quants of https://huggingface.co/armand0e/gpt-oss-20b-gemini-2.5-pro-distill

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/gpt-oss-20b-gemini-2.5-pro-distill-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q3KS | 12.2 | |
| GGUF | Q2K | 12.2 | |
| GGUF | IQ4XS | 12.3 | |
| GGUF | Q3KM | 13.0 | lower quality |
| GGUF | Q3KL | 13.4 | |
| GGUF | Q4KS | 14.8 | fast, recommended |
| GGUF | Q4KM | 15.9 | fast, recommended |
| GGUF | Q5KS | 16.0 | |
| GGUF | Q5KM | 17.0 | |
| GGUF | Q6K | 22.3 | very good quality |
| GGUF | Q80 | 22.4 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Pelican1.0-VL-235B-A22B-FC-i1-GGUF
Disctil-Qwen3-1.7B-i1-GGUF
MathSmith-hc-Qwen3-8B-i1-GGUF
STAR1-R1-Distill-8B-i1-GGUF
AutoL2S-Plus-7b-i1-GGUF
Gemma-3-27B-Heretic-i1-GGUF
Magidonia-24B-v4.3-creative-ORPO-v3-i1-GGUF
Qwen-3.5-27B-Derestricted-GGUF
DeepSeek-R1-Distill-Qwen-1.5B-uncensored-GGUF
gemma-4-31B-it-Grand-Horror-X-INTENSE-HERETIC-UNCENSORED-Thinking-i1-GGUF
mistralai-Mistral-Nemo-Instruct-2407-12B-MPOA-v1-i1-GGUF
Llama-4-Scout-17B-16E-Instruct-abliterated-i1-GGUF
Huihui-gpt-oss-20b-BF16-abliterated-v2-GGUF
static quants of https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2

For a convenient overview and download list, visit our model page for this model.

weighted/imatrix quants are available at https://huggingface.co/mradermacher/Huihui-gpt-oss-20b-BF16-abliterated-v2-i1-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | Q3KS | 12.2 | |
| GGUF | Q2K | 12.2 | |
| GGUF | IQ4XS | 12.3 | |
| GGUF | Q3KM | 13.0 | lower quality |
| GGUF | Q3KL | 13.4 | |
| GGUF | Q4KS | 14.8 | fast, recommended |
| GGUF | Q4KM | 15.9 | fast, recommended |
| GGUF | Q5KS | 16.0 | |
| GGUF | Q5KM | 17.0 | |
| GGUF | Q6K | 22.3 | very good quality |
| GGUF | Q80 | 22.4 | fast, best quality |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
Seed-OSS-36B-Instruct-heretic-i1-GGUF
Ministral-3-3B-Instruct-2512-BF16-i1-GGUF
Olmo-3-1025-7B-i1-GGUF
GRiP-i1-GGUF
RSI-AI-V1.1-GGUF
StrikeGPT-VL-8B-i1-GGUF
Kimi-VL-A3B-Thinking-2506-GGUF
Olmo-3-7B-Think-SFT-i1-GGUF
Llama3-8B-senator-i1-GGUF
Nexura-Gemma2B-i1-GGUF
L3.3-The-Omega-Directive-70B-Unslop-v2.0-GGUF
Mira-v1.17-Karcher-27B-i1-GGUF
CAI-20B-v2-i1-GGUF
Gemma3-Emotional-1B-i1-GGUF
Orion-Qwen3.5-2B-SFT-v2603-v1-i1-GGUF
SwarmMed-14B-v1.2-merged-i1-GGUF
ATLAS-Teach-8B-Instruct-i1-GGUF
weighted/imatrix quants of https://huggingface.co/Arc-Intelligence/ATLAS-8B-Instruct

For a convenient overview and download list, visit our model page for this model.

static quants are available at https://huggingface.co/mradermacher/ATLAS-Teach-8B-Instruct-GGUF

Usage

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| GGUF | imatrix | 0.1 | imatrix file (for creating your own quants) |
| GGUF | i1-IQ1S | 2.2 | for the desperate |
| GGUF | i1-IQ1M | 2.4 | mostly desperate |
| GGUF | i1-IQ2XXS | 2.6 | |
| GGUF | i1-IQ2XS | 2.8 | |
| GGUF | i1-IQ2S | 3.0 | |
| GGUF | i1-IQ2M | 3.2 | |
| GGUF | i1-Q2KS | 3.2 | very low quality |
| GGUF | i1-Q2K | 3.4 | IQ3XXS probably better |
| GGUF | i1-IQ3XXS | 3.5 | lower quality |
| GGUF | i1-IQ3XS | 3.7 | |
| GGUF | i1-Q3KS | 3.9 | IQ3XS probably better |
| GGUF | i1-IQ3S | 3.9 | beats Q3K |
| GGUF | i1-IQ3M | 4.0 | |
| GGUF | i1-Q3KM | 4.2 | IQ3S probably better |
| GGUF | i1-Q3KL | 4.5 | IQ3M probably better |
| GGUF | i1-IQ4XS | 4.7 | |
| GGUF | i1-Q40 | 4.9 | fast, low quality |
| GGUF | i1-IQ4NL | 4.9 | prefer IQ4XS |
| GGUF | i1-Q4KS | 4.9 | optimal size/speed/quality |
| GGUF | i1-Q4KM | 5.1 | fast, recommended |
| GGUF | i1-Q41 | 5.3 | |
| GGUF | i1-Q5KS | 5.8 | |
| GGUF | i1-Q5KM | 6.0 | |
| GGUF | i1-Q6K | 6.8 | practically like static Q6K |

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

See https://huggingface.co/mradermacher/modelrequests for some answers to questions you might have and/or if you want some other model quantized.

I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.