sm54
GLM 4.6 MXFP4 MOE
GLM-4.6-REAP-268B-A32B-128GB-GGUF
GLM 4.6 REAP 218B A32B MXFP4 MOE
GLM-4.6-REAP-268B-A32B-MXFP4_MOE
Qwen3-235B-A22B-Thinking-2507-MXFP4_MOE
Qwen3-235B-A22B-Instruct-2507-MXFP4_MOE
Qwen3-235B-A22B-Thinking-2507-OPT-GGUF
Qwen3-30B-A3B-Thinking-2507-OPT-GGUF
Qwen3-Nemotron-32B-RLBFF-Q4_K_M
GLM-4.5-MXFP4_MOE
Qwen3-30B-A3B-Thinking-2507-Q5_K_M-GGUF
sm54/Qwen3-30B-A3B-Thinking-2507-Q5KM-GGUF
This model was converted to GGUF format from `Qwen/Qwen3-30B-A3B-Thinking-2507` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), then invoke the CLI or the server. Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo. Step 1: Clone llama.cpp from GitHub. Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
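A minimal usage sketch for the brew route, assuming the brew-installed `llama-cli`/`llama-server` binaries; the `--hf-file` name below is an assumption about how GGUF-my-repo named the file, so check the repo's file listing before running:

```bash
# Install llama.cpp (Mac and Linux)
brew install llama.cpp

# One-shot CLI inference, pulling the GGUF straight from the Hub
# (the --hf-file value is assumed; verify it against the repo)
llama-cli --hf-repo sm54/Qwen3-30B-A3B-Thinking-2507-Q5KM-GGUF \
  --hf-file qwen3-30b-a3b-thinking-2507-q5_k_m.gguf \
  -p "Explain the difference between a mutex and a semaphore."

# Or run an HTTP server with the same checkpoint
llama-server --hf-repo sm54/Qwen3-30B-A3B-Thinking-2507-Q5KM-GGUF \
  --hf-file qwen3-30b-a3b-thinking-2507-q5_k_m.gguf \
  -c 4096
```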
QwQ-DeepSeek-R1-SkyT1-Flash-Lightest-32B-Q4_K_M-GGUF
gemma-3-27b-it-Q4_K_M-GGUF
OpenReasoning-Nemotron-14B-Q6_K-GGUF
sm54/OpenReasoning-Nemotron-14B-Q6K-GGUF
This model was converted to GGUF format from `nvidia/OpenReasoning-Nemotron-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo. Step 1: Clone llama.cpp from GitHub. Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
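A sketch of the build-from-source route described in the steps above; the `--hf-file` name is an assumption about the GGUF filename in the repo:

```bash
# Step 1: clone llama.cpp from GitHub
git clone https://github.com/ggerganov/llama.cpp

# Step 2: build with CURL support (add LLAMA_CUDA=1 for Nvidia GPUs on Linux)
cd llama.cpp && LLAMA_CURL=1 make

# Run inference through the built binary
# (the --hf-file value is assumed; verify it against the repo)
./llama-cli --hf-repo sm54/OpenReasoning-Nemotron-14B-Q6K-GGUF \
  --hf-file openreasoning-nemotron-14b-q6_k.gguf \
  -p "Prove that the sum of two even integers is even." -n 256
```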
OpenReasoning-Nemotron-32B-Q4_K_M-GGUF
sm54/OpenReasoning-Nemotron-32B-Q4KM-GGUF
This model was converted to GGUF format from `nvidia/OpenReasoning-Nemotron-32B` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo. Step 1: Clone llama.cpp from GitHub. Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
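A minimal server-mode sketch for this checkpoint, assuming the brew-installed `llama-server` and its default port (8080); the `--hf-file` name is an assumption, so check the repo's file listing:

```bash
# Serve the model over an OpenAI-compatible HTTP API
# (the --hf-file value is assumed; verify it against the repo)
llama-server --hf-repo sm54/OpenReasoning-Nemotron-32B-Q4KM-GGUF \
  --hf-file openreasoning-nemotron-32b-q4_k_m.gguf \
  -c 4096

# Query it from another shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Summarize the CAP theorem."}]}'
```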