AesSedai

20 models

Qwen3.5-35B-A3B-GGUF
31,247 downloads · 65 likes

Qwen3 Next 80B A3B Instruct GGUF
16,812 downloads · 5 likes

Qwen3.5-122B-A10B-GGUF
16,766 downloads · 35 likes

Qwen3.5-397B-A17B-GGUF
10,623 downloads · 22 likes

Step-3.5-Flash-GGUF
2,060 downloads · 8 likes

GLM-4.6-Derestricted-GGUF
1,107 downloads · 14 likes

Kimi-K2.5-GGUF
1,003 downloads · 26 likes

MiniMax-M2.7-GGUF
621 downloads · 14 likes

GLM-4.6-GGUF
601 downloads · 4 likes

GLM-4.7-GGUF
575 downloads · 5 likes

GLM-4.5-GGUF
MIT license · 305 downloads · 6 likes

Kimi-K2.5
214 downloads · 9 likes

GLM 4.6 REAP 266B A32B

Note: currently non-functional because of a missing `mtp.safetensors` file and its entry in `model.safetensors.index.json`. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.

169 downloads · 5 likes

GLM-4.6-REAP-178B-A32B

Note: currently non-functional because of a missing `mtp.safetensors` file and its entry in `model.safetensors.index.json` (see the index-audit sketch below). Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.

77 downloads · 2 likes
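
For context on the failure mode: `model.safetensors.index.json` holds a `weight_map` from tensor names to shard filenames, and loaders consult it to locate every tensor, so a shard that is neither on disk nor indexed breaks loading. Below is a minimal sketch of how to audit the index, assuming a local checkout; the repo path and the `mtp` name pattern are illustrative assumptions, not confirmed tensor names.

```python
import json
from pathlib import Path

repo = Path("GLM-4.6-REAP-178B-A32B")  # hypothetical local checkout
index = json.loads((repo / "model.safetensors.index.json").read_text())
weight_map = index["weight_map"]  # tensor name -> shard filename

# Flag shard files the index references but that are absent on disk.
for shard in sorted(set(weight_map.values())):
    if not (repo / shard).exists():
        print(f"missing shard: {shard}")

# Check whether any MTP tensors are indexed at all. For these repos both the
# `mtp.safetensors` file and its weight_map entries are missing, so a match
# on an MTP-like name pattern (illustrative) comes back empty and loads fail.
mtp = [name for name in weight_map if "mtp" in name.lower()]
print(f"indexed MTP-like tensors: {len(mtp)}")
```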

DeepSeek-V3-0324-GGUF

This is a custom quant of DeepSeek's V3 0324 model with the following mix:

- Q8_0 as the default quantization type (attention, shared experts, etc.)
- Q4_K for the ffn_up and ffn_gate tensors
- Q5_K for the ffn_down tensors

The idea is that, given the huge size of the FFN tensors compared to the rest of the model, this mix should achieve better quality while keeping the overall model smaller than a comparable naive quantization (a rough size estimate follows this entry).

67 downloads · 0 likes
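
A back-of-the-envelope sketch of that size tradeoff: the bits-per-weight values are the nominal llama.cpp figures, while the total parameter count and the FFN share are assumptions for illustration, not measurements of these files.

```python
# Rough size estimate for the mixed quant described above.
# Assumptions (illustrative, not measured): DeepSeek V3 has ~671B total
# parameters, and ~95% of them sit in the FFN (expert) tensors.
TOTAL_PARAMS = 671e9
FFN_FRACTION = 0.95

# Nominal llama.cpp bits-per-weight for each quantization type.
BPW = {"Q8_0": 8.5, "Q5_K": 5.5, "Q4_K": 4.5}

ffn_params = TOTAL_PARAMS * FFN_FRACTION
other_params = TOTAL_PARAMS - ffn_params

# In a gated FFN, up/gate/down are same-sized matrices, so ffn_up + ffn_gate
# hold roughly 2/3 of the FFN weights and ffn_down the remaining 1/3.
up_gate_params = ffn_params * 2 / 3
down_params = ffn_params / 3

def gib(bits: float) -> float:
    return bits / 8 / 2**30

mixed = gib(
    up_gate_params * BPW["Q4_K"]
    + down_params * BPW["Q5_K"]
    + other_params * BPW["Q8_0"]
)
naive = gib(TOTAL_PARAMS * BPW["Q5_K"])

print(f"mixed quant (Q8_0/Q5_K/Q4_K): ~{mixed:.0f} GiB")
print(f"naive Q5_K everywhere:        ~{naive:.0f} GiB")
```

Under these assumptions the mix lands around 390 GiB versus roughly 430 GiB for a uniform Q5_K, while keeping attention and shared experts at Q8_0.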

NVIDIA-Nemotron-3-Super-120B-A12B-GGUF
40 downloads · 4 likes

MiniMax-M2.5-GGUF
26 downloads · 3 likes

DeepSeek-R1-0528-GGUF
26 downloads · 0 likes

Step-3.5-Flash-Base-Midtrain-GGUF
0 downloads · 2 likes

GLM-5-GGUF
0 downloads · 2 likes