AesSedai
Qwen3.5-35B-A3B-GGUF
Qwen3 Next 80B A3B Instruct GGUF
Qwen3.5-122B-A10B-GGUF
Qwen3.5-397B-A17B-GGUF
Step-3.5-Flash-GGUF
GLM-4.6-Derestricted-GGUF
Kimi-K2.5-GGUF
MiniMax-M2.7-GGUF
GLM-4.6-GGUF
GLM-4.7-GGUF
GLM-4.5-GGUF
Kimi-K2.5
GLM 4.6 REAP 266B A32B
Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
GLM-4.6-REAP-178B-A32B
Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
DeepSeek-V3-0324-GGUF
This is a custom quant of DeepSeek's V3 0324 model with the following recipe:

- Q8_0 for the default quantization type (attention, shared experts, etc.)
- Q4_K for the ffn_up and ffn_gate tensors
- Q5_K for the ffn_down tensors

The idea is that, since the FFN tensors dwarf the rest of the tensors in the model, quantizing them more aggressively while keeping everything else at high precision should achieve better quality at a smaller overall size than a comparable naive (uniform) quantization.
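The size argument above can be sketched with some bits-per-weight (bpw) arithmetic. The bpw figures for llama.cpp's Q8_0 (8.5), Q4_K (4.5), Q5_K (5.5), and Q6_K (6.5625) block formats are standard; the tensor-size fractions below are hypothetical round numbers for illustration, not measured from DeepSeek V3.

```python
# Illustrative sketch: average bpw of the mixed recipe vs a uniform quant.
# Assumption: FFN tensors dominate the parameter count in a large MoE model;
# the 25/50/25 split below is a made-up round-number example.

BPW = {"Q8_0": 8.5, "Q4_K": 4.5, "Q5_K": 5.5, "Q6_K": 6.5625}

fractions = {
    "attn_and_shared": 0.25,  # default type -> Q8_0
    "ffn_up_gate":     0.50,  # ffn_up + ffn_gate -> Q4_K
    "ffn_down":        0.25,  # ffn_down -> Q5_K
}

mixed_bpw = (fractions["attn_and_shared"] * BPW["Q8_0"]
             + fractions["ffn_up_gate"] * BPW["Q4_K"]
             + fractions["ffn_down"] * BPW["Q5_K"])

naive_bpw = BPW["Q6_K"]  # a uniform quant one might compare against

print(f"mixed recipe: {mixed_bpw:.2f} bpw vs uniform Q6_K: {naive_bpw:.4f} bpw")
```

Under these assumed fractions the mix lands at 5.75 bpw, below a uniform Q6_K, while the quality-sensitive attention and shared-expert tensors stay at Q8_0.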