AesSedai
Qwen3.5-35B-A3B-GGUF
Qwen3 Next 80B A3B Instruct GGUF
Qwen3.5-122B-A10B-GGUF
Qwen3.5-397B-A17B-GGUF
Step-3.5-Flash-GGUF
GLM-4.6-Derestricted-GGUF
Kimi-K2.5-GGUF
MiniMax-M2.7-GGUF
GLM-4.6-GGUF
GLM-4.7-GGUF
GLM-4.5-GGUF
Kimi-K2.5
GLM 4.6 REAP 266B A32B
Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
GLM-4.6-REAP-178B-A32B
Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
DeepSeek-V3-0324-GGUF
This is a custom quant of DeepSeek's V3 0324 model with the following recipe:

- Q8_0 for the default quantization type (attention, shared experts, etc.)
- Q4_K for the ffn_up and ffn_gate tensors
- Q5_K for the ffn_down tensors

The idea is that, since the FFN tensors dwarf the rest of the tensors in the model, quantizing them more aggressively while keeping everything else at high precision should achieve better quality at a smaller overall size than a comparable naive (uniform) quantization.
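The size argument above can be sketched with some bits-per-weight (bpw) arithmetic. The bpw figures for llama.cpp's Q8_0 (8.5), Q4_K (4.5), Q5_K (5.5), and Q6_K (6.5625) block formats are standard; the tensor-size fractions below are hypothetical round numbers for illustration, not measured from DeepSeek V3.

```python
# Illustrative sketch: average bpw of the mixed recipe vs a uniform quant.
# Assumption: FFN tensors dominate the parameter count in a large MoE model;
# the 25/50/25 split below is a made-up round-number example.

BPW = {"Q8_0": 8.5, "Q4_K": 4.5, "Q5_K": 5.5, "Q6_K": 6.5625}

fractions = {
    "attn_and_shared": 0.25,  # default type -> Q8_0
    "ffn_up_gate":     0.50,  # ffn_up + ffn_gate -> Q4_K
    "ffn_down":        0.25,  # ffn_down -> Q5_K
}

mixed_bpw = (fractions["attn_and_shared"] * BPW["Q8_0"]
             + fractions["ffn_up_gate"] * BPW["Q4_K"]
             + fractions["ffn_down"] * BPW["Q5_K"])

naive_bpw = BPW["Q6_K"]  # a uniform quant one might compare against

print(f"mixed recipe: {mixed_bpw:.2f} bpw vs uniform Q6_K: {naive_bpw:.4f} bpw")
```

Under these assumed fractions the mix lands at 5.75 bpw, below a uniform Q6_K, while the quality-sensitive attention and shared-expert tensors stay at Q8_0.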