NikolayKozloff
Meta-Llama-3-8B-Instruct-bf16-correct-pre-tokenizer-and-EOS-token-Q8_0-Q6_k-Q4_K_M-GGUF
jais-13b-chat-Q4_K_M-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q4_K_M-GGUF
UserLM-8b-Q8_0-GGUF
OpenReasoning-Nemotron-14B-Q4_K_M-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KM-GGUF: this model was converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), or use this checkpoint directly through the usage steps listed in the llama.cpp repo. To build from source instead, clone llama.cpp from GitHub, move into the llama.cpp folder, and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example: `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
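For instance, the brew install plus a direct CLI invocation against this repo looks roughly like the sketch below; the exact `--hf-file` name is an assumption, so check the repo's file listing:

```bash
# Install llama.cpp (works on Mac and Linux)
brew install llama.cpp

# Run the quantized checkpoint straight from the Hugging Face repo;
# the .gguf file name below is assumed, not verified against the repo
llama-cli --hf-repo NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KM-GGUF \
  --hf-file openreasoning-nemotron-14b-q4_k_m.gguf \
  -p "Solve x^2 - 5x + 6 = 0 step by step."
```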
gpt-oss-20b-uncensored-bf16-Q4_K_M-GGUF
NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q4KM-GGUF: converted to GGUF format from `huizimao/gpt-oss-20b-uncensored-bf16` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
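Any of these checkpoints can also be served over HTTP with llama.cpp's server; a minimal sketch against this repo, again with an assumed `.gguf` file name:

```bash
# Start llama.cpp's OpenAI-compatible HTTP server (default port 8080);
# the .gguf file name below is assumed, not verified against the repo
llama-server --hf-repo NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q4KM-GGUF \
  --hf-file gpt-oss-20b-uncensored-bf16-q4_k_m.gguf \
  -c 2048
```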
gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts-Q8_0-GGUF
NikolayKozloff/gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts-Q80-GGUF: converted to GGUF format from `AmanPriyanshu/gpt-oss-6.0b-specialized-all-pruned-moe-only-7-experts` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen2-7B-Instruct-Q4_K_M-GGUF
DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q8_0-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q6_K-GGUF
gpt-oss-20b-uncensored-bf16-Q2_K-GGUF
NikolayKozloff/gpt-oss-20b-uncensored-bf16-Q2K-GGUF: converted to GGUF format from `huizimao/gpt-oss-20b-uncensored-bf16` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
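The build-from-source route mentioned in the notes above, sketched with the flags the cards name (`LLAMA_CURL=1`, plus `LLAMA_CUDA=1` for Nvidia GPUs on Linux):

```bash
# Step 1: clone llama.cpp from GitHub
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Step 2: build with curl support so checkpoints can be fetched by repo name;
# add hardware-specific flags as needed
LLAMA_CURL=1 make
# e.g. on Linux with an Nvidia GPU:
# LLAMA_CURL=1 LLAMA_CUDA=1 make
```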
granite-4.0-1b-Q8_0-GGUF
NikolayKozloff/granite-4.0-1b-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-1b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
granite-4.0-h-1b-Q8_0-GGUF
NikolayKozloff/granite-4.0-h-1b-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-h-1b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
falcon-7b-GGUF
SambaLingo-Russian-Chat-GGUF
YandexGPT-5-Lite-8B-instruct-Q8_0-GGUF
AI21-Jamba-Reasoning-3B-Q8_0-GGUF
NikolayKozloff/AI21-Jamba-Reasoning-3B-Q80-GGUF: converted to GGUF format from `ai21labs/AI21-Jamba-Reasoning-3B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
granite-4.0-350m-Q8_0-GGUF
NikolayKozloff/granite-4.0-350m-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-350m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dans-PersonalityEngine-V1.3.0-12b-Q5_K_S-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q5KS-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen3-8B-Q8_0-GGUF
Llama-3.1-8B-Instruct-abliterated_via_adapter-Q8_0-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q4_K_M-GGUF
Lexi-Llama-3-8B-Uncensored-Q6_K-GGUF
csmpt7b-Czech-GGUF
YanoljaNEXT-Rosetta-12B-2510-Q5_K_M-GGUF
gemma-2-27b-Q3_K_S-GGUF
granite-4.0-h-350m-Q8_0-GGUF
NikolayKozloff/granite-4.0-h-350m-Q80-GGUF: converted to GGUF format from `ibm-granite/granite-4.0-h-350m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
mGPT-1.3B-georgian-GGUF
Qwen2-7B-Q4_K_M-GGUF
aya-expanse-8b-Q8_0-GGUF
NikolayKozloff/aya-expanse-8b-Q80-GGUF: converted to GGUF format from `CohereForAI/aya-expanse-8b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
madlad400-10b-mt-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-1.5B-Q8_0-GGUF
LFM2-8B-A1B-Q8_0-GGUF
MiniCPM4.1-8B-Q8_0-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q80-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
aya-23-8B-q4_0-q5_0-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q5_K_M-GGUF
Mistral-Nemo-Instruct-2407-Q8_0-GGUF
YuLan-Mini-Q8_0-GGUF
gemma-3-1b-it-Q8_0-GGUF
Hermes-4-14B-Q4_K_M-GGUF
NikolayKozloff/Hermes-4-14B-Q4KM-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
gemma-3-270m-Q8_0-GGUF
NikolayKozloff/gemma-3-270m-Q80-GGUF: converted to GGUF format from `google/gemma-3-270m` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
GigaChat-20B-A3B-instruct-Q4_0-GGUF
NikolayKozloff/GigaChat-20B-A3B-instruct-Q40-GGUF: converted to GGUF format from `ai-sage/GigaChat-20B-A3B-instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Phi-SoSerious-Mini-V1-Q8_0-Q6_K-Q5_K_M-Q4_0-GGUF
medgemma-4b-it-Q8_0-GGUF
NikolayKozloff/medgemma-4b-it-Q80-GGUF: converted to GGUF format from `google/medgemma-4b-it` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
gemma-portuguese-luana-2b-GGUF
saiga_nemo_12b-Q5_K_M-GGUF
JanusCoder-8B-Q8_0-GGUF
NikolayKozloff/JanusCoder-8B-Q80-GGUF: converted to GGUF format from `internlm/JanusCoder-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-21B-A3B-Thinking-Q3_K_M-GGUF
NikolayKozloff/ERNIE-4.5-21B-A3B-Thinking-Q3KM-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-21B-A3B-Thinking` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
tora-code-13b-v1.0
pip-code-bandit-Q8_0-GGUF
Vikhr-Llama-3.2-1B-Instruct-Q8_0-GGUF
gemma-2-2b-it-Q8_0-GGUF
JanusCoder-14B-Q5_K_M-GGUF
NikolayKozloff/JanusCoder-14B-Q5KM-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
JanusCoder-14B-Q4_K_S-GGUF
suzume-llama-3-8B-multilingual-Q6_K-GGUF
Hunyuan-MT-Chimera-7B-Q8_0-GGUF
NikolayKozloff/Hunyuan-MT-Chimera-7B-Q80-GGUF: converted to GGUF format from `tencent/Hunyuan-MT-Chimera-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen2-7B-Q6_K-GGUF
Hermes-4-14B-Q5_K_M-GGUF
NikolayKozloff/Hermes-4-14B-Q5KM-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
DeepSeek-Prover-V2-7B-Q8_0-GGUF
NikolayKozloff/DeepSeek-Prover-V2-7B-Q80-GGUF: converted to GGUF format from `deepseek-ai/DeepSeek-Prover-V2-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Czech-GPT-2-XL-133k-GGUF
madlad400-3b-mt-Q8_0-GGUF
amoral-gemma3-4B-Q8_0-GGUF
pip-library-etl-1.3b-Q8_0-GGUF
Replete-Coder-Llama3-8B-Q5_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q5_0-GGUF
GigaChat-20B-A3B-instruct-Q3_K_M-GGUF
gemma-3-12b-it-Q5_K_M-GGUF
JanusCoder-14B-Q5_K_S-GGUF
NikolayKozloff/JanusCoder-14B-Q5KS-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
tora-code-7b-v1.0
mGPT-1.3B-mari-GGUF
Qwen2-Math-7B-Instruct-Q8_0-GGUF
salt-asr_wav-uni_1_tts_wav-uni_1-12k-Q8_0-GGUF
gemma-3-12b-it-Q6_K-GGUF
Llama-3-8B-Instruct-Coder-Q8_0-GGUF
granite-3b-code-instruct-Q8_0-GGUF
Tesser-Llama-3-Ko-8B-Q4_0-GGUF
GemmaCoder3-12B-Q5_K_M-GGUF
Qwen2-7B-Q8_0-GGUF
PLLuM-12B-instruct-Q5_K_M-GGUF
Qwen2-7B-Instruct-Q4_0-GGUF
Qwen2-7B-Instruct-deccp-Q8_0-GGUF
Replete-Coder-Qwen2-1.5b-Q4_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q4_K_S-GGUF
gemma-2-2b-jpn-it-Q8_0-GGUF
DeepSeek-R1-Distill-Llama-8B-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-7B-Q8_0-GGUF
Vikhr-Gemma-2B-instruct-Q8_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-16000-11_6_2024-Q8_0-GGUF
EuroLLM-1.7B-Q8_0-GGUF
LFM2-2.6B-Q8_0-GGUF
NikolayKozloff/LFM2-2.6B-Q80-GGUF: converted to GGUF format from `LiquidAI/LFM2-2.6B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
SeaPhi3-mini-Q6_K-GGUF
Gromenauer-7B-Instruct-Q8_0-GGUF
Viking-7B-Q5_K_M-GGUF
deepthought-8b-llama-v0.01-alpha-Q8_0-GGUF
granite-20b-code-instruct-Q4_0-GGUF
Qwen2-7B-Q5_K_S-GGUF
Qwen2-7B-Q4_0-GGUF
Replete-Coder-Llama3-8B-Q4_0-GGUF
Phi-3-mini-4k-instruct-sq-LORA-F32-GGUF
Dolphin3.0-Qwen2.5-3b-Q8_0-GGUF
JanusCoder-14B-Q4_K_M-GGUF
NikolayKozloff/JanusCoder-14B-Q4KM-GGUF: converted to GGUF format from `internlm/JanusCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
kappa-3-phi-abliterated-Q8_0-GGUF
Qwen2-7B-Q5_K_M-GGUF
Mistral-portuguese-luana-7b-Mathematics-GGUF
britllm-3b-v0.1-Q8_0-GGUF
RoLlama3-8b-Instruct-Q8_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q5_K_S-GGUF
madlad400-10b-mt-Q6_K-GGUF
WizardLM-2-7B-abliterated-Q4_0-GGUF
Mistral-Small-24B-Instruct-2501-Q2_K-GGUF
NikolayKozloff/Mistral-Small-24B-Instruct-2501-Q2K-GGUF: converted to GGUF format from `mistralai/Mistral-Small-24B-Instruct-2501` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-portuguese-13b-Luana-v0.2-GGUF
Falcon2-5.5B-Dutch-Q4_0-GGUF
Viking-7B-Q6_K-GGUF
Gemma-2-9B-It-SPPO-Iter3-IQ4_NL-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q4_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q5_K_S-GGUF
ghost-8b-beta-1608-Q8_0-GGUF
jais-13b-chat-Q2_K-GGUF
DeepSeek-R1-Distill-Qwen-14B-Q5_K_S-GGUF
amoral-gemma3-12B-Q5_K_M-GGUF
Llama-2-7b-Ukrainian-Q8_0-GGUF
LLaMA-Mesh-Q8_0-GGUF
Gemma-2-9B-It-SPPO-Iter3-Q8_0-GGUF
Falcon2-5.5B-Italian-Q8_0-GGUF
EuroLLM-9B-Q8_0-GGUF
cogito-v1-preview-llama-8B-Q8_0-GGUF
Qwen3-14B-Q4_K_M-GGUF
Hunyuan-0.5B-Instruct-Q8_0-GGUF
SauerkrautLM-7b-v1-mistral
Llama-2-13b-Romanian-GGUF
SauerkrautLM-Qwen-32b-Q3_K_S-GGUF
Yi-1.5-6B-Chat-Q4_K_M-GGUF
Qwen3-1.7B-abliterated-Q8_0-GGUF
mGPT-1.3B-tajik-GGUF
Qwen2-7B-Q4_K_S-GGUF
Replete-Coder-Qwen2-1.5b-Q5_0-GGUF
Llasa-3B-Q8_0-GGUF
granite-3.2-8b-instruct-preview-Q8_0-GGUF
reka-flash-3-Q3_K_S-GGUF
cogito-v1-preview-qwen-14B-Q4_K_M-GGUF
NextCoder-14B-Q4_K_S-GGUF
NikolayKozloff/NextCoder-14B-Q4KS-GGUF: converted to GGUF format from `microsoft/NextCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-20B-Q2_K-GGUF
NikolayKozloff/YanoljaNEXT-Rosetta-20B-Q2K-GGUF: converted to GGUF format from `yanolja/YanoljaNEXT-Rosetta-20B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3-8B-Swedish-Norwegian-Danish-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-7B-Multilingual-Q8_0-GGUF
mGPT-1.3B-armenian-GGUF
EVA-GPT-German-Q6_K-GGUF
magnum-12b-v2.5-kto-Q5_K_M-GGUF
Hebrew-Gemma-11B-V2-Q6_K-GGUF
Phi-3-mini-4k-instruct-dansk-Q8_0-GGUF
Phi-3-medium-4k-instruct-Q6_K-GGUF
Qwen2-7B-Q5_0-GGUF
tabula-8b-Q5_0-GGUF
RoGemma-7b-Instruct-Q5_0-GGUF
Mistral-Nemo-Instruct-2407-Q5_K_S-GGUF
granite-3.0-8b-instruct-Q8_0-GGUF
BgGPT-Gemma-2-2.6B-IT-v1.0-Q8_0-GGUF
lb-reranker-0.5B-v1.0-Q8_0-GGUF
Dans-PersonalityEngine-V1.3.0-12b-Q5_K_M-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q5KM-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dhanishtha-2.0-preview-Q4_K_S-GGUF
gemma-3-4b-it-shqip-v3-Q8_0-GGUF
DeepSeek-R1-Distill-Qwen-1.5B-Multilingual-Q8_0-GGUF
NikolayKozloff/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual-Q80-GGUF: converted to GGUF format from `lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
helium-1-2b-Q8_0-GGUF
NikolayKozloff/helium-1-2b-Q80-GGUF: converted to GGUF format from `kyutai/helium-1-2b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
L3-8B-Lunaris-v1-IQ4_NL-GGUF
Hermes-3-Llama-3.2-3B-Q8_0-GGUF
Hunyuan-1.8B-Instruct-Q8_0-GGUF
granite-20b-code-base-Q3_K_L-GGUF
strela-Q5_0-GGUF
SauerkrautLM-Gemma-2b-Q8_0-GGUF
GermanEduScorer-Qwen2-1.5b-Q8_0-GGUF
Viking-7B-Q5_K_S-GGUF
Llama-3-Instruct-Neurona-8b-v2-Q4_0-GGUF
magnum-12b-v2.5-kto-Q6_K-GGUF
saiga_nemo_12b-Q6_K-GGUF
Pensez-v0.1-e5-Q8_0-GGUF
gemma-3-12b-it-Q8_0-GGUF
Llasa-1B-Q8_0-GGUF
NikolayKozloff/Llasa-1B-Q80-GGUF: converted to GGUF format from `HKUST-Audio/Llasa-1B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
bleta-8B-v0.5-Albanian-shqip-GGUF
Phi-3-medium-4k-instruct-Q5_K_S-GGUF
GigaChat-20B-A3B-instruct-Q3_K_L-GGUF
q1-3B-PRIME-Q8_0-GGUF
mGPT-1.3B-uzbek-GGUF
polanka-qwen2-3b-v0.1-Q8_0-GGUF
Qwen2-1.5B-Ita-Q8_0-GGUF
Replete-Coder-Llama3-8B-IQ4_NL-GGUF
NuminaMath-7B-TIR-IQ4_NL-GGUF
it-5.4-fp16-orpo-v2-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q4_0-GGUF
WoonaV1.2-9b-Q8_0-GGUF
jais-13b-chat-Q5_K_M-GGUF
Phi-4-reasoning-Q5_K_S-GGUF
AceReason-Nemotron-14B-Q5_K_S-GGUF
NextCoder-7B-Q8_0-GGUF
CodeQwen1.5-7B-Q8_0-GGUF
OpenCoder-8B-Instruct-Q5_K_M-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-chekpoint-18833-1-epoch-15_6_2024-Q8_0-GGUF
gemma2-9B-sunfall-v0.5-Q8_0-GGUF
Qwen-portuguese-luana-7b-GGUF
Llama-3-8B-instruct-dansk-Q8_0-GGUF
Dorna-Llama3-8B-Instruct-IQ4_NL-GGUF
llama3-turbcat-instruct-8b-IQ4_NL-GGUF
Llama-3SOME-8B-v2-Q4_K_S-GGUF
Replete-Coder-Qwen2-1.5b-Q8_0-GGUF
gemma-2-9b-Q8_0-GGUF
Qwen2-1.5B-ITA-Instruct-Q8_0-GGUF
Tiger-Gemma-9B-v1-IQ4_NL-GGUF
Mistral-Nemo-Instruct-2407-Q5_K_M-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q5_K_M-GGUF
Viking-Magnum-v0.1-7B-Q8_0-GGUF
BgGPT-Gemma-2-27B-IT-v1.0-Q2_K-GGUF
EXAONE-3.5-2.4B-Instruct-Q8_0-GGUF
granite-3.1-8b-instruct-Q8_0-GGUF
EXAONE-Deep-2.4B-Q8_0-GGUF
Dhanishtha-2.0-preview-Q5_K_S-GGUF
NikolayKozloff/Dhanishtha-2.0-preview-Q5KS-GGUF: converted to GGUF format from `HelpingAI/Dhanishtha-2.0-preview` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MAmmoTH-Coder-7B-GGUF
granite-8b-code-base-Q8_0-GGUF
llama3_8b_chat_brainstorm-Q6_K-GGUF
granite-8b-code-instruct-Q6_K-GGUF
Mistral-Nemo-Kurdish-Q6_K-GGUF
phi-4-Q4_K_M-GGUF
ArmenianGPT-0.5-12B-Q8_0-GGUF
NikolayKozloff/ArmenianGPT-0.5-12B-Q80-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.5-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Mixtral_AI_CyberTron_Swahili_7b-GGUF
llama-3-typhoon-v1.5-8b-instruct-Q6_K-GGUF
internlm2-math-plus-20b-Q4_0-GGUF
llama3-tweety-8b-italian-Q4_0-GGUF
L3-8B-Lunaris-v1-Q4_0-GGUF
gemma-2-27b-Q2_K-GGUF
RoGemma-7b-Instruct-Q4_K_L-GGUF
Replete-Coder-Instruct-8b-Merged-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q8_0-GGUF
mistral-doryV2-12b-Q8_0-GGUF
OmniLing-V1-8b-experimental-Q8_0-GGUF
Phi-3-medium-4k-instruct-sq-LORA-F16-GGUF
Phi-3-medium-4k-instruct-sq-LORA-Q8_0-GGUF
OpenCoder-8B-Instruct-Q8_0-GGUF
GigaChat-20B-A3B-instruct-Q2_K-GGUF
cogito-v1-preview-qwen-14B-Q4_K_S-GGUF
Confucius3-Math-Q5_K_S-GGUF
Datarus-R1-14B-preview-Q4_K_M-GGUF
NikolayKozloff/Datarus-R1-14B-preview-Q4KM-GGUF: converted to GGUF format from `DatarusAI/Datarus-R1-14B-preview` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-Q8_0-GGUF
Llama3-DocChat-1.0-8B-Q8_0-GGUF
SambaLingo-Hungarian-Chat-GGUF
Ko-Qwen2-7B-Instruct-Q8_0-GGUF
Replete-LLM-Qwen2-7b_Beta-Preview-Q8_0-GGUF
Mistral-NeMo-Minitron-8B-Base-Q8_0-GGUF
Muyan-TTS-Q8_0-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-full-Q8_0-GGUF
Qwen2-7B-Instruct-Q5_K_S-GGUF
leniachat-qwen2-1.5B-v0-Q8_0-GGUF
LLAMA-3_8B_Unaligned_Alpha-Q8_0-GGUF
tabula-8b-IQ4_NL-GGUF
Gemma-2-9B-It-SPPO-Iter3-IQ4_XS-GGUF
RoGemma-7b-Instruct-Q4_0-GGUF
RoGemma-7b-Instruct-Q6_K_L-GGUF
RoGemma-7b-Instruct-Q5_K_L-GGUF
Viking-13B-Q4_0-GGUF
Llama-3-Instruct-Neurona-8b-v2-Q5_0-GGUF
RoLlama3-8b-Instruct-Q4_K_L-GGUF
MegaBeam-Mistral-7B-512k-Q8_0-GGUF
Llama-3.1-SauerkrautLM-8b-Instruct-Q8_0-GGUF
Bielik-11B-v2.3-Instruct-Q5_K_M-GGUF
Mistral-Nemo-Instruct-bellman-12b-Q5_K_M-GGUF
GigaChat-20B-A3B-instruct-Q4_K_S-GGUF
AceInstruct-7B-Q8_0-GGUF
amoral-gemma3-12B-Q8_0-GGUF
NVIDIA-Nemotron-Nano-12B-v2-Q5_K_M-GGUF
NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q5KM-GGUF: converted to GGUF format from `nvidia/NVIDIA-Nemotron-Nano-12B-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Hermes-4-14B-Q5_K_S-GGUF
NikolayKozloff/Hermes-4-14B-Q5KS-GGUF: converted to GGUF format from `NousResearch/Hermes-4-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3.1-Minitron-4B-Width-Base-Q8_0-GGUF
PULI-LlumiX-32K-GGUF
Meltemi-7B-v1-GGUF
Aura-Llama-Abliterated-Q8_0-GGUF
LongWriter-llama3.1-8b-Q8_0-GGUF
Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3-Q8_0-GGUF
NVIDIA-Nemotron-Nano-12B-v2-Q6_K-GGUF
NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q6K-GGUF: converted to GGUF format from `nvidia/NVIDIA-Nemotron-Nano-12B-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
llama-7b-finnish-v2-Q8_0-GGUF
Llama-3-KafkaLM-8B-v0.1-Q8_0-GGUF
Yi-1.5-9B-Chat-Q4_K_M-GGUF
tyr-Q8_0-GGUF
Llama-3-Instruct-8B-SimPO-Q5_0-GGUF
AlchemistCoder-DS-6.7B-Q5_0-GGUF
Llama-3-Oasis-v1-OAS-8B-Q4_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-Q4_0-GGUF
Llama-3-Instruct-Neurona-8b-v2-IQ4_NL-GGUF
SeaLLM3-7B-Chat-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-Q5_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-IQ4_NL-GGUF
NuminaMath-7B-TIR-Q8_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q8_0-GGUF
SauerkrautLM-Nemo-12b-Instruct-Q6_K-GGUF
falcon-mamba-7b-Q8_0-GGUF
Viking-SlimSonnet-v1-7B-Q8_0-GGUF
OpenCoder-8B-Instruct-Q6_K-GGUF
BgGPT-Gemma-2-9B-IT-v1.0-Q8_0-GGUF
amoral-gemma3-12B-Q6_K-GGUF
Qwen3-14B-Q5_K_S-GGUF
NextCoder-14B-Q4_K_M-GGUF
NikolayKozloff/NextCoder-14B-Q4KM-GGUF: converted to GGUF format from `microsoft/NextCoder-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-2510-Q5_K_S-GGUF
DeepSeek-R1-ReDistill-Qwen-7B-v1.1-Q8_0-GGUF
Seed-X-Instruct-7B-Q8_0-GGUF
Diver-Retriever-4B-Q8_0-GGUF
Llama-3-8b-ita-ties-Q8_0-GGUF
Phi-3-medium-128k-instruct-Q4_0-GGUF
Llama-3-Instruct-8B-SimPO-Q4_0-GGUF
shotor-Q8_0-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-IQ4_NL-GGUF
Arcee-Spark-FP32-Q8_0-GGUF
Viking-7B-Q4_0-GGUF
gemma-2-9b-it-Q8_0-GGUF
RoLlama3-8b-Instruct-Q8_0_L-GGUF
gemma2-9B-daybreak-v0.5-Q8_0-GGUF
Einstein-v7-Qwen2-7B-Q8_0-GGUF
Gemma-2-9b-indic-Q8_0-GGUF
Mistral-Nemo-Instruct-2407-Q6_K-GGUF
Replete-LLM-V2.5-Qwen-1.5b-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-32b-Q3_K_S-GGUF
Mistral-Nemo-Kurdish-Instruct-Q5_K_S-GGUF
qwen2.5-7b-ins-v3-Q8_0-GGUF
phi-4-Q5_K_S-GGUF
zeta-Q8_0-GGUF
Meta-Llama-3.1-8B-SurviveV3-Q8_0-GGUF
NikolayKozloff/Meta-Llama-3.1-8B-SurviveV3-Q80-GGUF: converted to GGUF format from `lolzinventor/Meta-Llama-3.1-8B-SurviveV3` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Qwen3-0.6B-Q8_0-GGUF
AceReason-Nemotron-1.1-7B-Q8_0-GGUF
NikolayKozloff/AceReason-Nemotron-1.1-7B-Q80-GGUF: converted to GGUF format from `nvidia/AceReason-Nemotron-1.1-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenCodeReasoning-Nemotron-1.1-7B-Q8_0-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-7B-Q80-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenCodeReasoning-Nemotron-1.1-14B-Q4_K_M-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q4KM-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-7B-Q8_0-GGUF
NikolayKozloff/OpenReasoning-Nemotron-7B-Q80-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Piaget-1.7B-Q8_0-GGUF
NikolayKozloff/Piaget-1.7B-Q80-GGUF: converted to GGUF format from `gustavecortal/Piaget-1.7B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YanoljaNEXT-Rosetta-12B-Q4_K_M-GGUF
tora-13b-v1.0
h2o-danube3-500m-base-Q8_0-GGUF
Sailor-7B-Q8_0-GGUF
Nxcode-CQ-7B-orpo-Q6_K-GGUF
Falcon2-5.5B-Swedish-Q8_0-GGUF
Alphacode-MALI-9B-Q8_0-GGUF
Phi-3-medium-4k-instruct-Q4_0-GGUF
Phi-3-medium-128k-instruct-Q5_K_S-GGUF
Awanllm-Llama-3-8B-Cumulus-v0.3.2-Q5_0-GGUF
AlchemistCoder-DS-6.7B-Q4_0-GGUF
Llama-3-Steerpike-v1-OAS-8B-Q5_0-GGUF
h2o-Llama-3-8B-Japanese-Instruct-Q8_0-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-Q4_0-GGUF
Turkish-Llama-8b-Instruct-v0.1-IQ4_NL-GGUF
Viking-7B-Q8_0-GGUF
RoGemma-7b-Instruct-Q8_0-GGUF
RoLlama3-8b-Instruct-Q5_K_L-GGUF
Viking-13B-Q4_K_M-GGUF
ParaLex-Llama-3-8B-SFT-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-Q4_0-GGUF
mathstral-7B-v0.1-Q8_0-GGUF
falcon-mamba-7b-instruct-Q8_0-GGUF
jais-13b-chat-Q3_K_L-GGUF
Mistral-Small-Instruct-2409-Q3_K_L-GGUF
Mistral-Small-Instruct-2409-Q2_K-GGUF
polanka-qwen2-1.5b-v0.1-ckpt_401000-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-14b-Q5_K_M-GGUF
OpenCoder-1.5B-Instruct-Q8_0-GGUF
FuseChat-Qwen-2.5-7B-Instruct-Q8_0-GGUF
NikolayKozloff/FuseChat-Qwen-2.5-7B-Instruct-Q80-GGUF: converted to GGUF format from `FuseAI/FuseChat-Qwen-2.5-7B-Instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Human-Like-Mistral-Nemo-Instruct-2407-Q6_K-GGUF
GLM-Z1-9B-0414-Q8_0-GGUF
Polaris-4B-Preview-Q8_0-GGUF
OpenCodeReasoning-Nemotron-1.1-14B-Q5_K_S-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q5KS-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MiniCPM4.1-8B-Q5_K_S-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q5KS-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ArmenianGPT-0.5-12B-Q4_K_M-GGUF
NikolayKozloff/ArmenianGPT-0.5-12B-Q4KM-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.5-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
SauerkrautLM-3b-v1
SauerkrautLM-13b-v1
SFR-SFT-LLaMA-3-8B-R-Q8_0-GGUF
SFR-Iterative-DPO-LLaMA-3-8B-R-Q8_0-GGUF
Llama-3.1-Hawkish-8B-Q8_0-GGUF
NikolayKozloff/Llama-3.1-Hawkish-8B-Q80-GGUF: converted to GGUF format from `mukaj/Llama-3.1-Hawkish-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3-8B-dutch-GGUF
llama-3-llamilitary-Q8_0-GGUF
Meta-Llama-3.1-8B-Instruct-Q8_0-GGUF
granite-3b-code-base-Q8_0-GGUF
AutoCoder_S_6.7B-Q8_0-GGUF
Irbis-7b-v0.1-Kazakh-Q8_0-GGUF
bella-1-8b-Q8_0-GGUF
Mistral-Nemo-12B-ArliAI-RPMax-v1.1-Q5_K_M-GGUF
tora-7b-v1.0
SambaLingo-Bulgarian-Chat-GGUF
Llama-3-portuguese-Tom-cat-8b-instruct-Q6_K-GGUF
openchat-3.6-8b-20240522-Q8_0-GGUF
L3-Aethora-15B-Q5_K_S-GGUF
L3-Aethora-15B-Q6_K-GGUF
L3-Aethora-15B-Q5_0-GGUF
Ko-Llama-3-8B-Instruct-Q8_0-GGUF
Tiger-Gemma-9B-v1-Q5_0-GGUF
Meta-Llama-3.1-8B-Q8_0-GGUF
mistral-doryV2-12b-Q6_K-GGUF
Llama-3.1-Minitron-4B-Depth-Base-Q8_0-GGUF
Viking-SlimSonnet-v0.2-7B-Q8_0-GGUF
ChatFrame-Instruct-Persian-Small-Q8_0-GGUF
pansophic-1-preview-LLaMA3.1-8b-Q8_0-GGUF
MagpieLM-4B-Chat-v0.1-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-7b-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-0.5b-Q8_0-GGUF
Llama-eus-8B-Q8_0-GGUF
OpenCoder-1.5B-Base-Q8_0-GGUF
FuseChat-Gemma-2-9B-Instruct-Q8_0-GGUF
NikolayKozloff/FuseChat-Gemma-2-9B-Instruct-Q80-GGUF: converted to GGUF format from `FuseAI/FuseChat-Gemma-2-9B-Instruct` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
DeepSeek-R1-Distill-Qwen-14B-Multilingual-Q5_K_S-GGUF
Phi-4-reasoning-plus-Q4_K_M-GGUF
NikolayKozloff/Phi-4-reasoning-plus-Q4KM-GGUF: converted to GGUF format from `microsoft/Phi-4-reasoning-plus` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-0.3B-PT-Q8_0-GGUF
NikolayKozloff/ERNIE-4.5-0.3B-PT-Q80-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-0.3B-PT` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Dhanishtha-2.0-preview-Q4_K_M-GGUF
OpenCodeReasoning-Nemotron-1.1-14B-Q4_K_S-GGUF
NikolayKozloff/OpenCodeReasoning-Nemotron-1.1-14B-Q4KS-GGUF: converted to GGUF format from `nvidia/OpenCodeReasoning-Nemotron-1.1-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ArmenianGPT-0.1-12B-Q5_K_M-GGUF
NikolayKozloff/ArmenianGPT-0.1-12B-Q5KM-GGUF: converted to GGUF format from `ArmGPT/ArmenianGPT-0.1-12B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ERNIE-4.5-21B-A3B-Thinking-Q3_K_S-GGUF
NikolayKozloff/ERNIE-4.5-21B-A3B-Thinking-Q3KS-GGUF: converted to GGUF format from `baidu/ERNIE-4.5-21B-A3B-Thinking` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
YandexGPT-5-Lite-8B-pretrain-Q8_0-GGUF
L3-Aethora-15B-V2-Q5_K_S-GGUF
Selene-1-Mini-Llama-3.1-8B-Q6_K-GGUF
Chocolatine-8B-Instruct-DPO-v1.0-Q8_0-GGUF
L3-Aethora-15B-V2-Q4_K_M-GGUF
L3-8B-Everything-COT-Q8_0-GGUF
L3-8B-Celeste-V1.2-Q8_0-GGUF
llama-3-Nephilim-v3-8B-Q8_0-GGUF
orcapaca_albanian-Q5_K_M-GGUF
NightyGurps-12b-v1-experimental-Q8_0-GGUF
Dans-PersonalityEngine-V1.3.0-12b-Q6_K-GGUF
NikolayKozloff/Dans-PersonalityEngine-V1.3.0-12b-Q6K-GGUF: converted to GGUF format from `PocketDoc/Dans-PersonalityEngine-V1.3.0-12b` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
ALMA-7B-GGUF
Heidrun-Mistral-7B-chat-Q8_0-GGUF
MATH-BG-v1-7B-GGUF
dictalm2.0-instruct-Q6_K-GGUF
malaysian-llama-3-8b-instruct-16k-Q8_0-GGUF
EVA-GPT-German-v7-2-Beta-Q5_K_M-GGUF
shisa-v1-llama3-8b-Q8_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v0.3.2-Q4_0-GGUF
Llama3-German-8B-Q8_0-GGUF
Llama3-DiscoLeo-Instruct-8B-v0.1-Q8_0-GGUF
Llama3-DiscoLeo-Instruct-8B-32k-v0.1-Q8_0-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-half-Q5_K_M-GGUF
suzume-llama-3-8B-multilingual-orpo-borda-top75-Q8_0-GGUF
Llama-3-11.5B-V2-Q4_0-GGUF
Llama-3-11.5B-Instruct-V2-Q5_0-GGUF
Llama-3-Ultron-Q8_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v1.0-Q4_0-GGUF
Awanllm-Llama-3-8B-Cumulus-v1.0-Q5_0-GGUF
Llama-3-8B-instruct-Swedish-Norwegian-Danish-Q8_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-11525-03_6_2024-Q8_0-GGUF
L3-Aethora-15B-Q4_K_S-GGUF
L3-Aethora-15B-Q5_K_M-GGUF
L3-Aethora-15B-Q8_0-GGUF
L3-Aethora-15B-Q4_0-GGUF
Llama-3-8B-Swedish-Norwegian-Danish-checkpoint-14375-08_06_2024-Q8_0-GGUF
Llama-3-Oasis-v1-OAS-8B-Q5_0-GGUF
Llama-3-Steerpike-v1-OAS-8B-Q4_0-GGUF
CataLlama-v0.1-Instruct-SFT-Q8_0-GGUF
CataLlama-v0.1-Instruct-DPO-Q8_0-GGUF
Tesser-Llama-3-Ko-8B-Q5_0-GGUF
Dorna-Llama3-8B-Instruct-IQ4_XS-GGUF
SauerkrautLM-1.5b-Q4_0-GGUF
tabula-8b-Q4_0-GGUF
Morfoz-LLM-8b-v1.0-IQ4_NL-GGUF
Llama-3-Instruct-8B-SPPO-Iter3-Q5_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-Q5_0-GGUF
Llama-3-neoAI-8B-Chat-v0.1-IQ4_NL-GGUF
RoLlama3-8b-Instruct-Q5_0-GGUF
RoLlama3-8b-Instruct-Q6_K_L-GGUF
Viking-13B-Q5_K_M-GGUF
bella-2-8b-Q8_0-GGUF
ArliAI-Llama-3-8B-Formax-v1.0-IQ4_XS-GGUF
mistral-doryV2-12b-Q5_K_M-GGUF
mistral-doryV2-12b-Q5_K_S-GGUF
L3.1-8B-Celeste-V1.5-Q8_0-GGUF
uzbek-llama-3.1-8B-instruct-v2-Q8_0-GGUF
Duet_Minitron8b_v0.5-Q8_0-GGUF
ChatFrame-Q8_0-GGUF
Mistral-Nemo-12B-ArliAI-RPMax-v1.1-Q6_K-GGUF
Llama-3.1-8B-ArliAI-RPMax-v1.1-Q8_0-GGUF
jais-13b-chat-Q3_K_S-GGUF
MagpieLM-8B-SFT-v0.1-Q8_0-GGUF
Replete-LLM-V2.5-Qwen-14b-Q5_K_S-GGUF
Replete-LLM-V2.5-Qwen-32b-Q4_K_M-GGUF
Mistral-NeMo-Minitron-8B-Instruct-Q8_0-GGUF
FastApply-1.5B-v1.0-Q6_K-GGUF
OpenCoder-8B-Instruct-Q5_K_S-GGUF
Teuken-7B-instruct-research-v0.4-Q8_0-GGUF
Llama-3-ChocoLlama-8B-instruct-Q8_0-GGUF
SauerkrautLM-v2-14b-DPO-Q5_K_M-GGUF
NikolayKozloff/SauerkrautLM-v2-14b-DPO-Q5KM-GGUF: converted to GGUF format from `VAGOsolutions/SauerkrautLM-v2-14b-DPO` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OREAL-7B-Q8_0-GGUF
Qwen3-14B-Q4_K_S-GGUF
Phi-4-reasoning-Q4_K_S-GGUF
NikolayKozloff/Phi-4-reasoning-Q4KS-GGUF: converted to GGUF format from `microsoft/Phi-4-reasoning` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Llama-3.1-Nemotron-Nano-4B-v1.1-Q8_0-GGUF
NikolayKozloff/Llama-3.1-Nemotron-Nano-4B-v1.1-Q80-GGUF: converted to GGUF format from `nvidia/Llama-3.1-Nemotron-Nano-4B-v1.1` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-14B-Q5_K_S-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q5KS-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
OpenReasoning-Nemotron-14B-Q4_K_S-GGUF
NikolayKozloff/OpenReasoning-Nemotron-14B-Q4KS-GGUF: converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
Hunyuan-4B-Instruct-Q8_0-GGUF
Goedel-Prover-V2-8B-Q8_0-GGUF
NikolayKozloff/Goedel-Prover-V2-8B-Q80-GGUF: converted to GGUF format from `Goedel-LM/Goedel-Prover-V2-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
HyGPT-10b-it-Q8_0-GGUF
silly-v0.2-Q6_K-GGUF
NikolayKozloff/silly-v0.2-Q6K-GGUF: converted to GGUF format from `wave-on-discord/silly-v0.2` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.
MiniCPM4.1-8B-Q5_K_M-GGUF
NikolayKozloff/MiniCPM4.1-8B-Q5KM-GGUF: converted to GGUF format from `openbmb/MiniCPM4.1-8B` using llama.cpp via ggml.ai's GGUF-my-repo space. See the original model card for details; use with llama.cpp as described above.