NikolayKozloff
Meta-Llama-3-8B-Instruct-bf16-correct-pre-tokenizer-and-EOS-token-Q8_0-Q6_k-Q4_K_M-GGUF
jais-13b-chat-Q4_K_M-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q4_K_M-GGUF
UserLM-8b-Q8_0-GGUF
OpenReasoning-Nemotron-14B-Q4_K_M-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KM-GGUF: this model was converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), or use this checkpoint directly through the usage steps listed in the llama.cpp repo. To build from source instead, clone llama.cpp from GitHub, move into the llama.cpp folder, and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example: `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
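For instance, the brew install plus a direct CLI invocation against this repo looks roughly like the sketch below; the exact `--hf-file` name is an assumption, so check the repo's file listing:

```bash
# Install llama.cpp (works on Mac and Linux)
brew install llama.cpp

# Run the quantized checkpoint straight from the Hugging Face repo;
# the .gguf file name below is assumed, not verified against the repo
llama-cli --hf-repo NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KM-GGUF \
  --hf-file openreasoning-nemotron-14b-q4_k_m.gguf \
  -p "Solve x^2 - 5x + 6 = 0 step by step."
```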
gpt-oss-20b-uncensored-bf16-Q4_K_M-GGUF
NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q4KM-GGUF: converted to GGUF format from `huizimao/gpt-oss-20b-uncensored-bf16` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
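Any of these checkpoints can also be served over HTTP with llama.cpp's server; a minimal sketch against this repo, again with an assumed `.gguf` file name:

```bash
# Start llama.cpp's OpenAI-compatible HTTP server (default port 8080);
# the .gguf file name below is assumed, not verified against the repo
llama-server --hf-repo NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q4KM-GGUF \
  --hf-file gpt-oss-20b-uncensored-bf16-q4_k_m.gguf \
  -c 2048
```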
gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts-Q8_0-GGUF
NikolayKozloff/gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts-Q80-GGUF: converted to GGUF format from `AmanPriyanshu/gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen2-7B-Instruct-Q4_K_M-GGUF
DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q8_0-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q6_K-GGUF
gpt-oss-20b-uncensored-bf16-Q2_K-GGUF
NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q2K-GGUF: converted to GGUF format from `huizimao/gpt-oss-20b-uncensored-bf16` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
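The build-from-source route mentioned in the notes above, sketched with the flags the cards name (`LLAMA_CURL=1`, plus `LLAMA_CUDA=1` for Nvidia GPUs on Linux):

```bash
# Step 1: clone llama.cpp from GitHub
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Step 2: build with curl support so checkpoints can be fetched by repo name;
# add hardware-specific flags as needed
LLAMA_CURL=1 make
# e.g. on Linux with an Nvidia GPU:
# LLAMA_CURL=1 LLAMA_CUDA=1 make
```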
granite-4.0-1b-Q8_0-GGUF
NikolayKozloff/granite-4.0-1b-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-1b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
granite-4.0-h-1b-Q8_0-GGUF
NikolayKozloff/granite-4.0-h-1b-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-h-1b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
falcon-7b-GGUF
SambaLingo-Russian-Chat-GGUF
YandexGPT-5-Lite-8B-instruct-Q8_0-GGUF
AI21-Jamba-Reasoning-3B-Q8_0-GGUF
NikolayKozloff/AI21-Jamba-Reasoning-3B-Q80-GGUF: converted to GGUF format from `ai21labs/AI21-Jamba-Reasoning-3B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
granite-4.0-350m-Q8_0-GGUF
NikolayKozloff/granite-4.0-350m-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-350m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dans-PersonalityEngine-V1.3.0-12b-Q5_K_S-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q5KS-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen3-8B-Q8_0-GGUF
Llama-3.1-8B-Instruct-abliterated_via_adapter-Q8_0-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q4_K_M-GGUF
Lexi-Llama-3-8B-Uncensored-Q6_K-GGUF
csmpt7b-Czech-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q5_K_M-GGUF
gemma-2-27b-Q3_K_S-GGUF
granite-4.0-h-350m-Q8_0-GGUF
NikolayKozloff/granite-4.0-h-350m-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-h-350m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
mGPT-1.3B-georgian-GGUF
Qwen2-7B-Q4_K_M-GGUF
aya-expanse-8b-Q8_0-GGUF
NikolayKozloff/aya-expanse-8b-Q80-GGUF: converted to GGUF format from `CohereForAI/aya-expanse-8b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
madlad400-10b-mt-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-1.5B-Q8_0-GGUF
LFM2-8B-A1B-Q8_0-GGUF
MiniCPM4.1-8B-Q8_0-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q80-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
aya-23-8B-q4_0-q5_0-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q5_K_M-GGUF
Mistral-Nemo-Instruct-2407-Q8_0-GGUF
YuLan-Mini-Q8_0-GGUF
gemma-3-1b-it-Q8_0-GGUF
Hermes-4-14B-Q4_K_M-GGUF
NikolayKozloff/Hermes-4-14B-Q4KM-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
gemma-3-270m-Q8_0-GGUF
NikolayKozloff/gemma-3-270m-Q80-GGUF: converted to GGUF format from `google/gemma-3-270m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
GigaChat-20B-A3B-instruct-Q4_0-GGUF
NikolayKozloff/GigaChat-20B-A3B-instruct-Q40-GGUF: converted to GGUF format from `ai-sage/GigaChat-20B-A3B-instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Phi-SoSerious-Mini-V1-Q8_0-Q6_K-Q5_K_M-Q4_0-GGUF
medgemma-4b-it-Q8_0-GGUF
NikolayKozloff/medgemma-4b-it-Q80-GGUF: converted to GGUF format from `google/medgemma-4b-it` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
gemma-portuguese-luana-2b-GGUF
saiga_nemo_12b-Q5_K_M-GGUF
JanusCoder-8B-Q8_0-GGUF
NikolayKozloff/JanusCoder-8B-Q80-GGUF: converted to GGUF format from `internlm/JanusCoder-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-21B-A3B-Thinking-Q3_K_M-GGUF
NikolayKozloff/ERNIE-4.5-21B-A3B-Thinking-Q3KM-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-21B-A3B-Thinking` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
tora-code-13b-v1.0
pip-code-bandit-Q8_0-GGUF
Vikhr-Llama-3.2-1B-Instruct-Q8_0-GGUF
gemma-2-2b-it-Q8_0-GGUF
JanusCoder-14B-Q5_K_M-GGUF
NikolayKozloff/JanusCoder-14B-Q5KM-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
JanusCoder-14B-Q4_K_S-GGUF
suzume-llama-3-8B-multilingual-Q6_K-GGUF
Hunyuan-MT-Chimera-7B-Q8_0-GGUF
NikolayKozloff/Hunyuan-MT-Chimera-7B-Q80-GGUF: converted to GGUF format from `tencent/Hunyuan-MT-Chimera-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen2-7B-Q6_K-GGUF
Hermes-4-14B-Q5_K_M-GGUF
NikolayKozloff/Hermes-4-14B-Q5KM-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
DeepSeek-Prover-V2-7B-Q8_0-GGUF
NikolayKozloff/DeepSeek-Prover-V2-7B-Q80-GGUF: converted to GGUF format from `deepseek-ai/DeepSeek-Prover-V2-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Czech-GPT-2-XL-133k-GGUF
madlad400-3b-mt-Q8_0-GGUF
amoral-gemma3-4B-Q8_0-GGUF
pip-library-etl-1.3b-Q8_0-GGUF
Replete-Coder-Llama3-8B-Q5_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q5_0-GGUF
GigaChat-20B-A3B-instruct-Q3_K_M-GGUF
gemma-3-12b-it-Q5_K_M-GGUF
JanusCoder-14B-Q5_K_S-GGUF
NikolayKozloff/JanusCoder-14B-Q5KS-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
tora-code-7b-v1.0
mGPT-1.3B-mari-GGUF
Qwen2-Math-7B-Instruct-Q8_0-GGUF
salt-asr_wav-uni_1_tts_wav-uni_1-12k-Q8_0-GGUF
gemma-3-12b-it-Q6_K-GGUF
Llama-3-8B-Instruct-Coder-Q8_0-GGUF
granite-3b-code-instruct-Q8_0-GGUF
Tesser-Llama-3-Ko-8B-Q4_0-GGUF
GemmaCoder3-12B-Q5_K_M-GGUF
Qwen2-7B-Q8_0-GGUF
PLLuM-12B-instruct-Q5_K_M-GGUF
Qwen2-7B-Instruct-Q4_0-GGUF
Qwen2-7B-Instruct-deccp-Q8_0-GGUF
Replete-Coder-Qwen2-1.5b-Q4_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q4_K_S-GGUF
gemma-2-2b-jpn-it-Q8_0-GGUF
DeepSeek-R1-Distill-Llama-8B-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-7B-Q8_0-GGUF
Vikhr-Gemma-2B-instruct-Q8_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-16000-11_6_2024-Q8_0-GGUF
EuroLLM-1.7B-Q8_0-GGUF
LFM2-2.6B-Q8_0-GGUF
NikolayKozloff/LFM2-2.6B-Q80-GGUF: converted to GGUF format from `LiquidAI/LFM2-2.6B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
SeaPhi3-mini-Q6_K-GGUF
Gromenauer-7B-Instruct-Q8_0-GGUF
Viking-7B-Q5_K_M-GGUF
deepthought-8b-llama-v0.01-alpha-Q8_0-GGUF
granite-20b-code-instruct-Q4_0-GGUF
Qwen2-7B-Q5_K_S-GGUF
Qwen2-7B-Q4_0-GGUF
Replete-Coder-Llama3-8B-Q4_0-GGUF
Phi-3-mini-4k-instruct-sq-LORA-F32-GGUF
Dolphin3.0-Qwen2.5-3b-Q8_0-GGUF
JanusCoder-14B-Q4_K_M-GGUF
NikolayKozloff/JanusCoder-14B-Q4KM-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
kappa-3-phi-abliterated-Q8_0-GGUF
Qwen2-7B-Q5_K_M-GGUF
Mistral-portuguese-luana-7b-Mathematics-GGUF
britllm-3b-v0.1-Q8_0-GGUF
RoLlama3-8b-Instruct-Q8_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q5_K_S-GGUF
madlad400-10b-mt-Q6_K-GGUF
WizardLM-2-7B-abliterated-Q4_0-GGUF
Mistral-Small-24B-Instruct-2501-Q2_K-GGUF
NikolayKozloff/Mistral-Small-24B-Instruct-2501-Q2K-GGUF: converted to GGUF format from `mistralai/Mistral-Small-24B-Instruct-2501` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-portuguese-13b-Luana-v0.2-GGUF
Falcon2-5.5B-Dutch-Q4_0-GGUF
Viking-7B-Q6_K-GGUF
Gemma-2-9B-It-SPPO-Iter3-IQ4_NL-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q4_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q5_K_S-GGUF
ghost-8b-beta-1608-Q8_0-GGUF
jais-13b-chat-Q2_K-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q5_K_S-GGUF
amoral-gemma3-12B-Q5_K_M-GGUF
Llama-2-7b-Ukrainian-Q8_0-GGUF
LLaMA-Mesh-Q8_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q8_0-GGUF
Falcon2-5.5B-Italian-Q8_0-GGUF
EuroLLM-9B-Q8_0-GGUF
cogito-v1-preview-llama-8B-Q8_0-GGUF
Qwen3-14B-Q4_K_M-GGUF
Hunyuan-0.5B-Instruct-Q8_0-GGUF
SauerkrautLM-7b-v1-mistral
Llama-2-13b-Romanian-GGUF
SauerkrautLM-Qwen-32b-Q3_K_S-GGUF
Yi-1.5-6B-Chat-Q4_K_M-GGUF
Qwen3-1.7B-abliterated-Q8_0-GGUF
mGPT-1.3B-tajik-GGUF
Qwen2-7B-Q4_K_S-GGUF
Replete-Coder-Qwen2-1.5b-Q5_0-GGUF
Llasa-3B-Q8_0-GGUF
granite-3.2-8b-instruct-preview-Q8_0-GGUF
reka-flash-3-Q3_K_S-GGUF
cogito-v1-preview-qwen-14B-Q4_K_M-GGUF
NextCoder-14B-Q4_K_S-GGUF
NikolayKozloff/NextCoder-14B-Q4KS-GGUF: converted to GGUF format from `microsoft/NextCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-20B-Q2_K-GGUF
NikolayKozloff/YanoljaNEXT-Rosetta-20B-Q2K-GGUF: converted to GGUF format from `yanolja/YanoljaNEXT-Rosetta-20B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3-8B-Swedish-Norwegian-Danish-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-7B-Multilingual-Q8_0-GGUF
mGPT-1.3B-armenian-GGUF
EVA-GPT-German-Q6_K-GGUF
magnum-12b-v2.5-kto-Q5_K_M-GGUF
Hebrew-Gemma-11B-V2-Q6_K-GGUF
Phi-3-mini-4k-instruct-dansk-Q8_0-GGUF
Phi-3-medium-4k-instruct-Q6_K-GGUF
Qwen2-7B-Q5_0-GGUF
tabula-8b-Q5_0-GGUF
RoGemma-7b-Instruct-Q5_0-GGUF
Mistral-Nemo-Instruct-2407-Q5_K_S-GGUF
granite-3.0-8b-instruct-Q8_0-GGUF
BgGPT-Gemma-2-2.6B-IT-v1.0-Q8_0-GGUF
lb-reranker-0.5B-v1.0-Q8_0-GGUF
Dans-PersonalityEngine-V1.3.0-12b-Q5_K_M-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q5KM-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dhanishtha-2.0-preview-Q4_K_S-GGUF
gemma-3-4b-it-shqip-v3-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-1.5B-Multilingual-Q8_0-GGUF
NikolayKozloff/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual-Q80-GGUF: converted to GGUF format from `lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
helium-1-2b-Q8_0-GGUF
NikolayKozloff/helium-1-2b-Q80-GGUF: converted to GGUF format from `kyutai/helium-1-2b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
L3-8B-Lunaris-v1-IQ4_NL-GGUF
Hermes-3-Llama-3.2-3B-Q8_0-GGUF
Hunyuan-1.8B-Instruct-Q8_0-GGUF
granite-20b-code-base-Q3_K_L-GGUF
strela-Q5_0-GGUF
SauerkrautLM-Gemma-2b-Q8_0-GGUF
GermanEduScorer-Qwen2-1.5b-Q8_0-GGUF
Viking-7B-Q5_K_S-GGUF
Llama-3-Instruct-Neurona-8b-v2-Q4_0-GGUF
magnum-12b-v2.5-kto-Q6_K-GGUF
saiga_nemo_12b-Q6_K-GGUF
Pensez-v0.1-e5-Q8_0-GGUF
gemma-3-12b-it-Q8_0-GGUF
Llasa-1B-Q8_0-GGUF
NikolayKozloff/Llasa-1B-Q80-GGUF: converted to GGUF format from `HKUST-Audio/Llasa-1B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
bleta-8B-v0.5-Albanian-shqip-GGUF
Phi-3-medium-4k-instruct-Q5_K_S-GGUF
GigaChat-20B-A3B-instruct-Q3_K_L-GGUF
q1-3B-PRIME-Q8_0-GGUF
mGPT-1.3B-uzbek-GGUF
polanka-qwen2-3b-v0.1-Q8_0-GGUF
Qwen2-1.5B-Ita-Q8_0-GGUF
Replete-Coder-Llama3-8B-IQ4_NL-GGUF
NuminaMath-7B-TIR-IQ4_NL-GGUF
it-5.4-fp16-orpo-v2-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q4_0-GGUF
WoonaV1.2-9b-Q8_0-GGUF
jais-13b-chat-Q5_K_M-GGUF
Phi-4-reasoning-Q5_K_S-GGUF
AceReason-Nemotron-14B-Q5_K_S-GGUF
NextCoder-7B-Q8_0-GGUF
CodeQwen1.5-7B-Q8_0-GGUF
OpenCoder-8B-Instruct-Q5_K_M-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-chekpoint-18833-1-epoch-15_6_2024-Q8_0-GGUF
gemma2-9B-sunfall-v0.5-Q8_0-GGUF
Qwen-portuguese-luana-7b-GGUF
Llama-3-8B-instruct-dansk-Q8_0-GGUF
Dorna-Llama3-8B-Instruct-IQ4_NL-GGUF
llama3-turbcat-instruct-8b-IQ4_NL-GGUF
Llama-3SOME-8B-v2-Q4_K_S-GGUF
Replete-Coder-Qwen2-1.5b-Q8_0-GGUF
gemma-2-9b-Q8_0-GGUF
Qwen2-1.5B-ITA-Instruct-Q8_0-GGUF
Tiger-Gemma-9B-v1-IQ4_NL-GGUF
Mistral-Nemo-Instruct-2407-Q5_K_M-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q5_K_M-GGUF
Viking-Magnum-v0.1-7B-Q8_0-GGUF
BgGPT-Gemma-2-27B-IT-v1.0-Q2_K-GGUF
EXAONE-3.5-2.4B-Instruct-Q8_0-GGUF
granite-3.1-8b-instruct-Q8_0-GGUF
EXAONE-Deep-2.4B-Q8_0-GGUF
Dhanishtha-2.0-preview-Q5_K_S-GGUF
NikolayKozloff/Dhanishtha-2.0-preview-Q5KS-GGUF: converted to GGUF format from `HelpingAI/Dhanishtha-2.0-preview` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MAmmoTH-Coder-7B-GGUF
granite-8b-code-base-Q8_0-GGUF
llama3_8b_chat_brainstorm-Q6_K-GGUF
granite-8b-code-instruct-Q6_K-GGUF
Mistral-Nemo-Kurdish-Q6_K-GGUF
phi-4-Q4_K_M-GGUF
ArmenianGPT-0.5-12B-Q8_0-GGUF
NikolayKozloff/ArmenianGPT-0.5-12B-Q80-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.5-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Mixtral_AI_CyberTron_Swahili_7b-GGUF
llama-3-typhoon-v1.5-8b-instruct-Q6_K-GGUF
internlm2-math-plus-20b-Q4_0-GGUF
llama3-tweety-8b-italian-Q4_0-GGUF
L3-8B-Lunaris-v1-Q4_0-GGUF
gemma-2-27b-Q2_K-GGUF
RoGemma-7b-Instruct-Q4_K_L-GGUF
Replete-Coder-Instruct-8b-Merged-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q8_0-GGUF
mistral-doryV2-12b-Q8_0-GGUF
OmniLing-V1-8b-experimental-Q8_0-GGUF
Phi-3-medium-4k-instruct-sq-LORA-F16-GGUF
Phi-3-medium-4k-instruct-sq-LORA-Q8_0-GGUF
OpenCoder-8B-Instruct-Q8_0-GGUF
GigaChat-20B-A3B-instruct-Q2_K-GGUF
cogito-v1-preview-qwen-14B-Q4_K_S-GGUF
Confucius3-Math-Q5_K_S-GGUF
Datarus-R1-14B-preview-Q4_K_M-GGUF
NikolayKozloff/Datarus-R1-14B-preview-Q4KM-GGUF: converted to GGUF format from `DatarusAI/Datarus-R1-14B-preview` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-Q8_0-GGUF
Llama3-DocChat-1.0-8B-Q8_0-GGUF
SambaLingo-Hungarian-Chat-GGUF
Ko-Qwen2-7B-Instruct-Q8_0-GGUF
Replete-LLM-Qwen2-7b_Beta-Preview-Q8_0-GGUF
Mistral-NeMo-Minitron-8B-Base-Q8_0-GGUF
Muyan-TTS-Q8_0-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-full-Q8_0-GGUF
Qwen2-7B-Instruct-Q5_K_S-GGUF
leniachat-qwen2-1.5B-v0-Q8_0-GGUF
LLAMA-3_8B_Unaligned_Alpha-Q8_0-GGUF
tabula-8b-IQ4_NL-GGUF
Gemma-2-9B-It-SPPO-Iter3-IQ4_XS-GGUF
RoGemma-7b-Instruct-Q4_0-GGUF
RoGemma-7b-Instruct-Q6_K_L-GGUF
RoGemma-7b-Instruct-Q5_K_L-GGUF
Viking-13B-Q4_0-GGUF
Llama-3-Instruct-Neurona-8b-v2-Q5_0-GGUF
RoLlama3-8b-Instruct-Q4_K_L-GGUF
MegaBeam-Mistral-7B-512k-Q8_0-GGUF
Llama-3.1-SauerkrautLM-8b-Instruct-Q8_0-GGUF
Bielik-11B-v2.3-Instruct-Q5_K_M-GGUF
Mistral-Nemo-Instruct-bellman-12b-Q5_K_M-GGUF
GigaChat-20B-A3B-instruct-Q4_K_S-GGUF
AceInstruct-7B-Q8_0-GGUF
amoral-gemma3-12B-Q8_0-GGUF
NVIDIA-Nemotron-Nano-12B-v2-Q5_K_M-GGUF
NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q5KM-GGUF: converted to GGUF format from `nvidia/NVIDIA-Nemotron-Nano-12B-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Hermes-4-14B-Q5_K_S-GGUF
NikolayKozloff/Hermes-4-14B-Q5KS-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3.1-Minitron-4B-Width-Base-Q8_0-GGUF
PULI-LlumiX-32K-GGUF
Meltemi-7B-v1-GGUF
Aura-Llama-Abliterated-Q8_0-GGUF
LongWriter-llama3.1-8b-Q8_0-GGUF
Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3-Q8_0-GGUF
NVIDIA-Nemotron-Nano-12B-v2-Q6_K-GGUF
NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q6K-GGUF: converted to GGUF format from `nvidia/NVIDIA-Nemotron-Nano-12B-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
llama-7b-finnish-v2-Q8_0-GGUF
Llama-3-KafkaLM-8B-v0.1-Q8_0-GGUF
Yi-1.5-9B-Chat-Q4_K_M-GGUF
tyr-Q8_0-GGUF
Llama-3-Instruct-8B-SimPO-Q5_0-GGUF
AlchemistCoder-DS-6.7B-Q5_0-GGUF
Llama-3-Oasis-v1-OAS-8B-Q4_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-Q4_0-GGUF
Llama-3-Instruct-Neurona-8b-v2-IQ4_NL-GGUF
SeaLLM3-7B-Chat-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-Q5_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-IQ4_NL-GGUF
NuminaMath-7B-TIR-Q8_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q8_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q6_K-GGUF
falcon-mamba-7b-Q8_0-GGUF
Viking-SlimSonnet-v1-7B-Q8_0-GGUF
OpenCoder-8B-Instruct-Q6_K-GGUF
BgGPT-Gemma-2-9B-IT-v1.0-Q8_0-GGUF
amoral-gemma3-12B-Q6_K-GGUF
Qwen3-14B-Q5_K_S-GGUF
NextCoder-14B-Q4_K_M-GGUF
NikolayKozloff/NextCoder-14B-Q4KM-GGUF: converted to GGUF format from `microsoft/NextCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-2510-Q5_K_S-GGUF
DeepSeek-R1-ReDistill-Qwen-7B-v1.1-Q8_0-GGUF
Seed-X-Instruct-7B-Q8_0-GGUF
Diver-Retriever-4B-Q8_0-GGUF
Llama-3-8b-ita-ties-Q8_0-GGUF
Phi-3-medium-128k-instruct-Q4_0-GGUF
Llama-3-Instruct-8B-SimPO-Q4_0-GGUF
shotor-Q8_0-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-IQ4_NL-GGUF
Arcee-Spark-FP32-Q8_0-GGUF
Viking-7B-Q4_0-GGUF
gemma-2-9b-it-Q8_0-GGUF
RoLlama3-8b-Instruct-Q8_0_L-GGUF
gemma2-9B-daybreak-v0.5-Q8_0-GGUF
Einstein-v7-Qwen2-7B-Q8_0-GGUF
Gemma-2-9b-indic-Q8_0-GGUF
Mistral-Nemo-Instruct-2407-Q6_K-GGUF
Replete-LLM-V2.5-Qwen-1.5b-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-32b-Q3_K_S-GGUF
Mistral-Nemo-Kurdish-Instruct-Q5_K_S-GGUF
qwen2.5-7b-ins-v3-Q8_0-GGUF
phi-4-Q5_K_S-GGUF
zeta-Q8_0-GGUF
Meta-Llama-3.1-8B-SurviveV3-Q8_0-GGUF
NikolayKozloff/Meta-Llama-3.1-8B-SurviveV3-Q80-GGUF: converted to GGUF format from `lolzinventor/Meta-Llama-3.1-8B-SurviveV3` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen3-0.6B-Q8_0-GGUF
AceReason-Nemotron-1.1-7B-Q8_0-GGUF
NikolayKozloff/AceReason-Nemotron-1.1-7B-Q80-GGUF: converted to GGUF format from `nvidia/AceReason-Nemotron-1.1-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenCodeReasoning-Nemotron-1.1-7B-Q8_0-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-7B-Q80-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenCodeReasoning-Nemotron-1.1-14B-Q4_K_M-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q4KM-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-7B-Q8_0-GGUF
NikolayKozloff/OpenReasoning-Nemotron-7B-Q80-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Piaget-1.7B-Q8_0-GGUF
NikolayKozloff/Piaget-1.7B-Q80-GGUF: converted to GGUF format from `gustavecortal/Piaget-1.7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-Q4_K_M-GGUF
tora-13b-v1.0
h2o-danube3-500m-base-Q8_0-GGUF
Sailor-7B-Q8_0-GGUF
Nxcode-CQ-7B-orpo-Q6_K-GGUF
Falcon2-5.5B-Swedish-Q8_0-GGUF
Alphacode-MALI-9B-Q8_0-GGUF
Phi-3-medium-4k-instruct-Q4_0-GGUF
Phi-3-medium-128k-instruct-Q5_K_S-GGUF
Awanllm-Llama-3-8B-Cumulus-v0.3.2-Q5_0-GGUF
AlchemistCoder-DS-6.7B-Q4_0-GGUF
Llama-3-Steerpike-v1-OAS-8B-Q5_0-GGUF
h2o-Llama-3-8B-Japanese-Instruct-Q8_0-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-Q4_0-GGUF
Turkish-Llama-8b-Instruct-v0.1-IQ4_NL-GGUF
Viking-7B-Q8_0-GGUF
RoGemma-7b-Instruct-Q8_0-GGUF
RoLlama3-8b-Instruct-Q5_K_L-GGUF
Viking-13B-Q4_K_M-GGUF
ParaLex-Llama-3-8B-SFT-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-Q4_0-GGUF
mathstral-7B-v0.1-Q8_0-GGUF
falcon-mamba-7b-instruct-Q8_0-GGUF
jais-13b-chat-Q3_K_L-GGUF
Mistral-Small-Instruct-2409-Q3_K_L-GGUF
Mistral-Small-Instruct-2409-Q2_K-GGUF
polanka-qwen2-1.5b-v0.1-ckpt_401000-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-14b-Q5_K_M-GGUF
OpenCoder-1.5B-Instruct-Q8_0-GGUF
FuseChat-Qwen-2.5-7B-Instruct-Q8_0-GGUF
NikolayKozloff/FuseChat-Qwen-2.5-7B-Instruct-Q80-GGUF: converted to GGUF format from `FuseAI/FuseChat-Qwen-2.5-7B-Instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Human-Like-Mistral-Nemo-Instruct-2407-Q6_K-GGUF
GLM-Z1-9B-0414-Q8_0-GGUF
Polaris-4B-Preview-Q8_0-GGUF
OpenCodeReasoning-Nemotron-1.1-14B-Q5_K_S-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q5KS-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MiniCPM4.1-8B-Q5_K_S-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q5KS-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ArmenianGPT-0.5-12B-Q4_K_M-GGUF
NikolayKozloff/ArmenianGPT-0.5-12B-Q4KM-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.5-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
SauerkrautLM-3b-v1
SauerkrautLM-13b-v1
SFR-SFT-LLaMA-3-8B-R-Q8_0-GGUF
SFR-Iterative-DPO-LLaMA-3-8B-R-Q8_0-GGUF
Llama-3.1-Hawkish-8B-Q8_0-GGUF
NikolayKozloff/Llama-3.1-Hawkish-8B-Q80-GGUF: converted to GGUF format from `mukaj/Llama-3.1-Hawkish-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3-8B-dutch-GGUF
llama-3-llamilitary-Q8_0-GGUF
Meta-Llama-3.1-8B-Instruct-Q8_0-GGUF
granite-3b-code-base-Q8_0-GGUF
AutoCoder_S_6.7B-Q8_0-GGUF
Irbis-7b-v0.1-Kazakh-Q8_0-GGUF
bella-1-8b-Q8_0-GGUF
Mistral-Nemo-12B-ArliAI-RPMax-v1.1-Q5_K_M-GGUF
tora-7b-v1.0
SambaLingo-Bulgarian-Chat-GGUF
Llama-3-portuguese-Tom-cat-8b-instruct-Q6_K-GGUF
openchat-3.6-8b-20240522-Q8_0-GGUF
L3-Aethora-15B-Q5_K_S-GGUF
L3-Aethora-15B-Q6_K-GGUF
L3-Aethora-15B-Q5_0-GGUF
Ko-Llama-3-8B-Instruct-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q5_0-GGUF
Meta-Llama-3.1-8B-Q8_0-GGUF
mistral-doryV2-12b-Q6_K-GGUF
Llama-3.1-Minitron-4B-Depth-Base-Q8_0-GGUF
Viking-SlimSonnet-v0.2-7B-Q8_0-GGUF
ChatFrame-Instruct-Persian-Small-Q8_0-GGUF
pansophic-1-preview-LLaMA3.1-8b-Q8_0-GGUF
MagpieLM-4B-Chat-v0.1-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-7b-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-0.5b-Q8_0-GGUF
Llama-eus-8B-Q8_0-GGUF
OpenCoder-1.5B-Base-Q8_0-GGUF
FuseChat-Gemma-2-9B-Instruct-Q8_0-GGUF
NikolayKozloff/FuseChat-Gemma-2-9B-Instruct-Q80-GGUF: converted to GGUF format from `FuseAI/FuseChat-Gemma-2-9B-Instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
DeepSeek-R1-Distill-Qwen-14B-Multilingual-Q5_K_S-GGUF
Phi-4-reasoning-plus-Q4_K_M-GGUF
NikolayKozloff/Phi-4-reasoning-plus-Q4KM-GGUF: converted to GGUF format from `microsoft/Phi-4-reasoning-plus` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-0.3B-PT-Q8_0-GGUF
NikolayKozloff/ERNIE-4.5-0.3B-PT-Q80-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-0.3B-PT` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dhanishtha-2.0-preview-Q4_K_M-GGUF
OpenCodeReasoning-Nemotron-1.1-14B-Q4_K_S-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q4KS-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ArmenianGPT-0.1-12B-Q5_K_M-GGUF
NikolayKozloff/ArmenianGPT-0.1-12B-Q5KM-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.1-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-21B-A3B-Thinking-Q3_K_S-GGUF
NikolayKozloff/ERNIE-4.5-21B-A3B-Thinking-Q3KS-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-21B-A3B-Thinking` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YandexGPT-5-Lite-8B-pretrain-Q8_0-GGUF
L3-Aethora-15B-V2-Q5_K_S-GGUF
Selene-1-Mini-Llama-3.1-8B-Q6_K-GGUF
Chocolatine-8B-Instruct-DPO-v1.0-Q8_0-GGUF
L3-Aethora-15B-V2-Q4_K_M-GGUF
L3-8B-Everything-COT-Q8_0-GGUF
L3-8B-Celeste-V1.2-Q8_0-GGUF
llama-3-Nephilim-v3-8B-Q8_0-GGUF
orcapaca_albanian-Q5_K_M-GGUF
NightyGurps-12b-v1-experimental-Q8_0-GGUF
Dans-PersonalityEngine-V1.3.0-12b-Q6_K-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q6K-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ALMA-7B-GGUF
Heidrun-Mistral-7B-chat-Q8_0-GGUF
MATH-BG-v1-7B-GGUF
dictalm2.0-instruct-Q6_K-GGUF
malaysian-llama-3-8b-instruct-16k-Q8_0-GGUF
EVA-GPT-German-v7-2-Beta-Q5_K_M-GGUF
shisa-v1-llama3-8b-Q8_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v0.3.2-Q4_0-GGUF
Llama3-German-8B-Q8_0-GGUF
Llama3-DiscoLeo-Instruct-8B-v0.1-Q8_0-GGUF
Llama3-DiscoLeo-Instruct-8B-32k-v0.1-Q8_0-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-half-Q5_K_M-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-top75-Q8_0-GGUF
Llama-3-11.5B-V2-Q4_0-GGUF
Llama-3-11.5B-Instruct-V2-Q5_0-GGUF
Llama-3-Ultron-Q8_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v1.0-Q4_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v1.0-Q5_0-GGUF
Llama-3-8B-instruct-Swedish-Norwegian-Danish-Q8_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-11525-03_6_2024-Q8_0-GGUF
L3-Aethora-15B-Q4_K_S-GGUF
L3-Aethora-15B-Q5_K_M-GGUF
L3-Aethora-15B-Q8_0-GGUF
L3-Aethora-15B-Q4_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-14375-08_06_2024-Q8_0-GGUF
Llama-3-Oasis-v1-OAS-8B-Q5_0-GGUF
Llama-3-Steerpike-v1-OAS-8B-Q4_0-GGUF
CataLlama-v0.1-Instruct-SFT-Q8_0-GGUF
CataLlama-v0.1-Instruct-DPO-Q8_0-GGUF
Tesser-Llama-3-Ko-8B-Q5_0-GGUF
Dorna-Llama3-8B-Instruct-IQ4_XS-GGUF
SauerkrautLM-1.5b-Q4_0-GGUF
tabula-8b-Q4_0-GGUF
Morfoz-LLM-8b-v1.0-IQ4_NL-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-Q5_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-Q5_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-IQ4_NL-GGUF
RoLlama3-8b-Instruct-Q5_0-GGUF
RoLlama3-8b-Instruct-Q6_K_L-GGUF
Viking-13B-Q5_K_M-GGUF
bella-2-8b-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-IQ4_XS-GGUF
mistral-doryV2-12b-Q5_K_M-GGUF
mistral-doryV2-12b-Q5_K_S-GGUF
L3.1-8B-Celeste-V1.5-Q8_0-GGUF
uzbek-llama-3.1-8B-instruct-v2-Q8_0-GGUF
Duet_Minitron8b_v0.5-Q8_0-GGUF
ChatFrame-Q8_0-GGUF
Mistral-Nemo-12B-ArliAI-RPMax-v1.1-Q6_K-GGUF
Llama-3.1-8B-ArliAI-RPMax-v1.1-Q8_0-GGUF
jais-13b-chat-Q3_K_S-GGUF
MagpieLM-8B-SFT-v0.1-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-14b-Q5_K_S-GGUF
Replete-LLM-V2.5-Qwen-32b-Q4_K_M-GGUF
Mistral-NeMo-Minitron-8B-Instruct-Q8_0-GGUF
FastApply-1.5B-v1.0-Q6_K-GGUF
OpenCoder-8B-Instruct-Q5_K_S-GGUF
Teuken-7B-instruct-research-v0.4-Q8_0-GGUF
Llama-3-ChocoLlama-8B-instruct-Q8_0-GGUF
SauerkrautLM-v2-14b-DPO-Q5_K_M-GGUF
NikolayKozloff/SauerkrautLM-v2-14b-DPO-Q5KM-GGUF: converted to GGUF format from `VAGOsolutions/SauerkrautLM-v2-14b-DPO` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OREAL-7B-Q8_0-GGUF
Qwen3-14B-Q4_K_S-GGUF
Phi-4-reasoning-Q4_K_S-GGUF
NikolayKozloff/Phi-4-reasoning-Q4KS-GGUF: converted to GGUF format from `microsoft/Phi-4-reasoning` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3.1-Nemotron-Nano-4B-v1.1-Q8_0-GGUF
NikolayKozloff/Llama-3.1-Nemotron-Nano-4B-v1.1-Q80-GGUF: converted to GGUF format from `nvidia/Llama-3.1-Nemotron-Nano-4B-v1.1` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-14B-Q5_K_S-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q5KS-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-14B-Q4_K_S-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KS-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Hunyuan-4B-Instruct-Q8_0-GGUF
Goedel-Prover-V2-8B-Q8_0-GGUF
NikolayKozloff/Goedel-Prover-V2-8B-Q80-GGUF: converted to GGUF format from `Goedel-LM/Goedel-Prover-V2-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
HyGPT-10b-it-Q8_0-GGUF
silly-v0.2-Q6_K-GGUF
NikolayKozloff/silly-v0.2-Q6K-GGUF: converted to GGUF format from `wave-on-discord/silly-v0.2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MiniCPM4.1-8B-Q5_K_M-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q5KM-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.