noctrex
Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF
gemma-4-26B-A4B-it-MXFP4_MOE-GGUF
Chandra-OCR-GGUF
Original model: https://huggingface.co/datalab-to/chandra
Try to use the best quality you can run. For the mmproj, use the F32 version, as it will produce the best results (F32 > BF16 > F16).
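As a concrete example, a vision/OCR quant from a repo like this can be run with llama.cpp's `llama-mtmd-cli`, pointing it at both the model and the mmproj file. A minimal sketch, with hypothetical file names standing in for whichever quant and mmproj you downloaded:

```
# Hypothetical file names; use the quant you downloaded and the F32 mmproj.
llama-mtmd-cli \
  -m Chandra-OCR-Q8_0.gguf \
  --mmproj mmproj-Chandra-OCR-F32.gguf \
  --image page.png \
  -p "Transcribe all text in this image."
```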
GLM-4.7-Flash-MXFP4_MOE-GGUF
Qwen3-Coder-Next-MXFP4_MOE-GGUF
LightOnOCR-1B-1025-i1-GGUF
These are the imatrix quantizations of the model LightOnOCR-1B-1025.
Original model: https://huggingface.co/lightonai/LightOnOCR-1B-1025
Try to use the best quality you can run. For the mmproj, use the F32 version, as it will produce the best results (F32 > BF16 > F16).
Huihui-Qwen3.5-35B-A3B-abliterated-MXFP4_MOE-GGUF
Qwen3.5-35B-A3B-MXFP4_MOE-GGUF
Huihui-Qwen3-VL-4B-Instruct-abliterated-GGUF
GLM-4.5-Air-REAP-82B-A12B-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model GLM-4.5-Air-REAP-82B-A12B.
Original model: https://huggingface.co/cerebras/GLM-4.5-Air-REAP-82B-A12B
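For reference, quants like this one can in principle be reproduced with llama.cpp's `llama-quantize` tool. A minimal sketch, assuming a recent llama.cpp build that lists MXFP4_MOE among its quantization types, with hypothetical file names:

```
# Hypothetical file names; requires a llama.cpp build that supports the
# MXFP4_MOE quantization type.
llama-quantize \
  GLM-4.5-Air-REAP-82B-A12B-BF16.gguf \
  GLM-4.5-Air-REAP-82B-A12B-MXFP4_MOE.gguf \
  MXFP4_MOE
```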
Qwen3 VL 32B Thinking GGUF
Qwen3.5-122B-A10B-MXFP4_MOE-GGUF
Huihui-Qwen3-VL-8B-Instruct-abliterated-GGUF
Pixtral-12B-Captioner-Relaxed-GGUF
Huihui Qwen3 VL 30B A3B Instruct Abliterated MXFP4 MOE GGUF
Huihui-Mistral-Small-3.2-24B-Instruct-2506-abliterated-v2-GGUF
These are quantizations of the model Huihui-Mistral-Small-3.2-24B-Instruct-2506-abliterated-v2.
Original model: https://huggingface.co/huihui-ai/Huihui-Mistral-Small-3.2-24B-Instruct-2506-abliterated-v2
Qwen3-Coder-REAP-25B-A3B-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Qwen3-Coder-REAP-25B-A3B. Added an imatrix version, based on the imatrix from bartowski. I also created my own experimental imatrix versions, marked as codetiny-exp and codemedium-exp. For these I took calibration data that is ONLY for coding and not for general knowledge: the codetiny and codemedium datasets from eaddario/imatrix-calibration. I thought this would be better suited for a coding-specific model, but further tests must be done. Please provide feedback.
Original model: https://huggingface.co/cerebras/Qwen3-Coder-REAP-25B-A3B
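For reference, importance matrices like the experimental ones above are computed with llama.cpp's `llama-imatrix` tool over a calibration text. A minimal sketch, with hypothetical file names for the full-precision model, the coding-only calibration file, and the output:

```
# Compute an importance matrix from a coding-only calibration set
# (hypothetical file names).
llama-imatrix \
  -m Qwen3-Coder-REAP-25B-A3B-BF16.gguf \
  -f codetiny.txt \
  -o codetiny-exp.imatrix
```

The resulting `.imatrix` file is then passed to `llama-quantize` via its `--imatrix` option when producing the imatrix quants.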
MiniMax-M2-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model MiniMax-M2.
Original model: https://huggingface.co/unsloth/MiniMax-M2
It seems the original model I quantized had chat template problems, so I re-quantized the unsloth version, which has template fixes. Please delete the old one and download the new quant.
Gelato-30B-A3B-i1-GGUF
Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE-GGUF
Gelato-30B-A3B-GGUF
These are quantizations of the model Gelato-30B-A3B. The imatrix used is from mradermacher. As most of the quants are already available from the great mradermacher team, I include here only the quants that are missing.
Usage Notes:
- Download the latest llama.cpp to use these quantizations.
- Try to use the best quality you can run.
- For the `mmproj` file, the F32 version is recommended for best results (F32 > BF16 > F16).
Huihui-gpt-oss-120b-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-gpt-oss-120b-BF16-abliterated-v2.
Original model: https://huggingface.co/huihui-ai/Huihui-gpt-oss-120b-BF16-abliterated-v2
Qwen3-VL-235B-A22B-Instruct-1M-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Qwen3-VL-235B-A22B-Instruct.
Original model: https://huggingface.co/unsloth/Qwen3-VL-235B-A22B-Instruct
Qwen3-VL-235B-A22B-Thinking-1M-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Qwen3-VL-235B-A22B-Thinking.
Original model: https://huggingface.co/unsloth/Qwen3-VL-235B-A22B-Thinking-1M
This is the version from unsloth that expands the context size from 256k to 1M.
Qwen3-VL-30B-A3B-Thinking-abliterated-GGUF
LightOnOCR-1B-1025-GGUF
These are the quantizations of the model LightOnOCR-1B-1025.
Original model: https://huggingface.co/lightonai/LightOnOCR-1B-1025
Try to use the best quality you can run. For the mmproj, use the F32 version, as it will produce the best results (F32 > BF16 > F16).
Huihui-Qwen3-VL-8B-Thinking-abliterated-GGUF
Huihui-gpt-oss-20b-abliterated-v2-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-gpt-oss-20b-BF16-abliterated-v2.
Original model: https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2
InternVL3_5-30B-A3B-MXFP4_MOE-GGUF
Huihui-Qwen3-VL-2B-Instruct-abliterated-GGUF
These are quantizations of the model Huihui-Qwen3-VL-2B-Instruct-abliterated. These quantizations were created using an imatrix computed from the combined_all_large calibration set merged with harmful.txt (see the sketch after the notes below), to leverage the abliterated nature of the model.
Usage Notes:
- Download the latest llama.cpp to use these quantizations.
- Try to use the best quality you can run.
- For the `mmproj` file, the F32 version is recommended for best results (F32 > BF16 > F16).
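A merged calibration set like the one above can be produced by simply concatenating the source texts before computing the importance matrix. A minimal sketch with llama.cpp's `llama-imatrix`, assuming the calibration files are plain text with these hypothetical names:

```
# Merge the general and "harmful" calibration texts, then compute the
# importance matrix over the combined file (hypothetical file names).
cat combined_all_large.txt harmful.txt > merged-calibration.txt
llama-imatrix \
  -m Huihui-Qwen3-VL-2B-Instruct-abliterated-F16.gguf \
  -f merged-calibration.txt \
  -o merged.imatrix
```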
Nemotron-3-Nano-30B-A3B-MXFP4_MOE-GGUF
Qwen3-Next-80B-A3B-Thinking-1M-MXFP4_MOE-GGUF
Tongyi DeepResearch 30B A3B MXFP4 MOE GGUF
This is a MXFP4MOE quantization of the model Tongyi-DeepResearch-30B-A3B.
Original model: https://huggingface.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
Huihui-Qwen3-VL-4B-Thinking-abliterated-GGUF
cogito-v2-preview-llama-109B-MoE-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model cogito-v2-preview-llama-109B-MoE.
Model quantized with BF16 GGUFs from: https://huggingface.co/unsloth/cogito-v2-preview-llama-109B-MoE-GGUF
Original model: https://huggingface.co/deepcogito/cogito-v2-preview-llama-109B-MoE
Qwen3-Next-80B-A3B-Instruct-1M-MXFP4_MOE-GGUF
DeepSeek-MoE-16B-Chat-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model DeepSeek-MoE-16B-Chat.
Original model: https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat
Qwen3 30B A3B CoderThinking YOYO Linear MXFP4 MOE GGUF
This is a MXFP4MOE quantization of the model Qwen3-30B-A3B-CoderThinking-YOYO-linear.
Original model: https://huggingface.co/YOYO-AI/Qwen3-30B-A3B-CoderThinking-YOYO-linear
Ling-flash-2.0-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Ling-flash-2.0.
Original model: https://huggingface.co/inclusionAI/Ling-flash-2.0
Qwen3-30B-A3B-Deepseek-Distill-Instruct-2507-MXFP4_MOE-GGUF
GLM-4.7-Flash-REAP-23B-A3B-MXFP4_MOE-GGUF
Kimi-VL-A3B-Thinking-2506-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Kimi-VL-A3B-Thinking-2506.
Original model: https://huggingface.co/moonshotai/Kimi-VL-A3B-Thinking-2506
Qwen3.5-397B-A17B-MXFP4_MOE-GGUF
Qwen3-Next-80B-A3B-Instruct-MXFP4_MOE-GGUF
This is a MXFP4 quant of Qwen3-Next-80B-A3B-Instruct. The context has been extended from 256k to 1M with YaRN, as seen on the original repo. To enable it, run llama.cpp with options like: `--ctx-size 0 --rope-scaling yarn --rope-scale 4`. Setting `--ctx-size 0` uses the full 1M context; otherwise set a smaller number, like 524288 for 512k. You can also use the model as normal if you don't want the extended context.
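For example, a full llama-server invocation with the extended context might look like this (the model file name is hypothetical; use whichever quant you downloaded):

```
# Serve with the full 1M context (--ctx-size 0 takes the value from the GGUF metadata).
llama-server -m Qwen3-Next-80B-A3B-Instruct-MXFP4_MOE.gguf \
  --ctx-size 0 --rope-scaling yarn --rope-scale 4

# Or cap the context at 512k to save memory:
llama-server -m Qwen3-Next-80B-A3B-Instruct-MXFP4_MOE.gguf \
  --ctx-size 524288 --rope-scaling yarn --rope-scale 4
```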
Huihui-Ring-mini-2.0-abliterated-MXFP4_MOE-GGUF
Huihui-Tongyi-DeepResearch-30B-A3B-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-Tongyi-DeepResearch-30B-A3B-abliterated.
Original model: https://huggingface.co/huihui-ai/Huihui-Tongyi-DeepResearch-30B-A3B-abliterated
Qwen3-Next-80B-A3B-Thinking-MXFP4_MOE-GGUF
This is a MXFP4 quant of Qwen3-Next-80B-A3B-Thinking. The context has been extended from 256k to 1M with YaRN, as seen on the original repo. To enable it, run llama.cpp with options like: `--ctx-size 0 --rope-scaling yarn --rope-scale 4`. Setting `--ctx-size 0` uses the full 1M context; otherwise set a smaller number, like 524288 for 512k. You can also use the model as normal if you don't want the extended context.
SmallThinker-21B-A3B-Instruct-MXFP4_MOE-GGUF
Llama-4-Scout-17B-16E-Instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Llama-4-Scout-17B-16E-Instruct.
Model quantized with BF16 GGUFs from: https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
Original model: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct
Qwen3 30B A3B YOYO V4 MXFP4 MOE GGUF
This is a MXFP4MOE quantization of the model Qwen3-30B-A3B-YOYO-V4.
Original model: https://huggingface.co/YOYO-AI/Qwen3-30B-A3B-YOYO-V4
Qwen3-Coder-30B-A3B-Instruct-1M-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Qwen3-Coder-30B-A3B-Instruct-1M.
Original model: https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF
Also added an imatrix version, based on the imatrix from unsloth. This is the version from unsloth that expands the context size from 256k to 1M.
Qwen3-Coder-30B-A3B-Instruct-MXFP4_MOE-GGUF
MiniMax-M2-REAP-139B-A10B-MXFP4_MOE-GGUF
DavidAU-Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B.
Original model: https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B
INTELLECT-3-MXFP4_MOE-GGUF
Huihui-Qwen3-VL-2B-Thinking-abliterated-GGUF
The-Philosopher-Zephyr-7B-GGUF
These are quantizations of the model The-Philosopher-Zephyr-7B.
Original model: https://huggingface.co/Hypersniper/ThePhilosopherZephyr7B
It's an older model from back in 2023, based on the older Mistral 7B. Why quantize it now in 2025? It's so old! Well, I'm experimenting with importance matrices, and here I used the text_en_large set stitched together with various philosophical stuff. Turns out it's quite fun!
Huihui-Granite-4.0-H-Tiny-abliterated-MXFP4_MOE-GGUF
TheDrummer-GLM-Steam-106B-A12B-v1-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model GLM-Steam-106B-A12B-v1.
Original model: https://huggingface.co/TheDrummer/GLM-Steam-106B-A12B-v1
Qwen3-Coder-REAP-246B-A35B-MXFP4_MOE-GGUF
Huihui-Ling-mini-2.0-abliterated-MXFP4_MOE-GGUF
ERNIE-4.5-21B-A3B-PT-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model ERNIE-4.5-21B-A3B-PT.
Model quantized with BF16 GGUFs from: https://huggingface.co/unsloth/ERNIE-4.5-21B-A3B-PT-GGUF
Original model: https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-PT
P1-30B-A3B-MXFP4_MOE-GGUF
LFM2-8B-A1B-MXFP4_MOE-GGUF
Qwen3-Coder-Next-REAM-MXFP4_MOE-GGUF
LLaDA-MoE-7B-A1B-Instruct-TD-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model LLaDA-MoE-7B-A1B-Instruct-TD: a specialized instruction-tuned model, further optimized for accelerated inference using Trajectory Distillation. Also created a quant with an imatrix from mradermacher.
Original model: https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD
Ring-flash-2.0-MXFP4_MOE-GGUF
Qwen3-30B-A3B-Mixture-2507-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Qwen3-30B-A3B-Mixture-2507.
Original model: https://huggingface.co/YOYO-AI/Qwen3-30B-A3B-Mixture-2507
Huihui-Ling-mini-2.0-abliterated-i1-GGUF
Huihui-MoE-60B-A3B-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-MoE-60B-A3B-abliterated.
Model quantized with F16 GGUFs from: https://huggingface.co/DevQuasar/huihui-ai.Huihui-MoE-60B-A3B-abliterated-GGUF
Original model: https://huggingface.co/huihui-ai/Huihui-MoE-60B-A3B-abliterated
Granite-4.0-H-Small-MXFP4_MOE-GGUF
Qwen3-VL-30B-A3B-Instruct-1M-MXFP4_MOE-GGUF
These are quantizations of the model Qwen3-VL-30B-A3B-Instruct.
Original model: https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Instruct
This is the 1M context length variant from unsloth, with their imatrix applied to it.
Mixtral-8x7B-Instruct-v0.1-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Mixtral-8x7B-Instruct-v0.1.
Original model: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
Huihui-Granite-4.0-H-Micro-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-Granite-4.0-H-Micro-abliterated.
Original model: https://huggingface.co/huihui-ai/Huihui-granite-4.0-h-Micro-abliterated
Nemotron-Cascade-14B-Thinking-MXFP4-GGUF
AI21-Jamba-Mini-1.7-MXFP4_MOE-GGUF
Qwen3 Yoyo V4 42B A3B Thinking TOTAL RECAL MXFP4 MOE GGUF
This is a MXFP4MOE quantization of the model Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL.
Original model: https://huggingface.co/DavidAU/Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL
Huihui-MoE-23B-A4B-abliterated-MXFP4_MOE-GGUF
PromptCoT-2.0-SelfPlay-30B-A3B-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model PromptCoT-2.0-SelfPlay-30B-A3B.
Original model: https://huggingface.co/xl-zhao/PromptCoT-2.0-SelfPlay-30B-A3B
Ring-mini-2.0-MXFP4_MOE-GGUF
Ling-Coder-lite-MXFP4_MOE-GGUF
Qwen3-VL-30B-A3B-Thinking-1M-MXFP4_MOE-GGUF
These are quantizations of the model Qwen3-VL-30B-A3B-Thinking.
Original model: https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Thinking
This is the 1M context length variant from unsloth, with their imatrix applied to it.
SimpleChat-30BA3B-V3-MXFP4_MOE-GGUF
Ling-mini-2.0-MXFP4_MOE-GGUF
dolphin-2.7-mixtral-8x7b-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model dolphin-2.7-mixtral-8x7b.
Original model: https://huggingface.co/dphn/dolphin-2.7-mixtral-8x7b
GroveMoE-Inst-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model GroveMoE-Inst.
Original model: https://huggingface.co/inclusionAI/GroveMoE-Inst
MiniMax-M2-REAP-172B-A10B-MXFP4_MOE-GGUF
Granite-4.0-H-Tiny-MXFP4_MOE-GGUF
Ling-Mini-2.0-Identity-GGUF
This is a MXFP4MOE quantization of the model Ling-Mini-2.0-Identity.
Original model: https://huggingface.co/qingy2024/Ling-Mini-2.0-Identity
aquif-3.5-A4B-Think-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model aquif-3.5-A4B-Think.
Original model: https://huggingface.co/aquif-ai/aquif-3.5-A4B-Think
Ling-lite-1.5-2507-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Ling-lite-1.5-2507.
Original model: https://huggingface.co/inclusionAI/Ling-lite-1.5-2507
SERA-32B-GGUF
Huihui-MiroThinker-v1.0-30B-abliterated-MXFP4_MOE-GGUF
MiniMax-M2-REAP-162B-A10B-MXFP4_MOE-GGUF
SmallThinker-4B-A0.6B-Instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model SmallThinker-4BA0.6B-Instruct. A quantization with imatrix is also included.
Original model: https://huggingface.co/PowerInfer/SmallThinker-4BA0.6B-Instruct
Phi-mini-MoE-instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Phi-mini-MoE-instruct.
Model quantized with F16 GGUFs from: https://huggingface.co/gabriellarson/Phi-mini-MoE-instruct-GGUF
Original model: https://huggingface.co/microsoft/Phi-mini-MoE-instruct
aquif-3-moe-17B-A2.8B-Think-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model aquif-3-moe-17B-A2.8B-Think.
Original model: https://huggingface.co/aquif-ai/aquif-3-moe-17B-A2.8B-Think
LLaDA-MoE-7B-A1B-Instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model LLaDA-MoE-7B-A1B-Instruct.
Original model: https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct
ERNIE-4.5-21B-A3B-Thinking-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model ERNIE-4.5-21B-A3B-Thinking.
Model quantized with BF16 GGUFs from: https://huggingface.co/unsloth/ERNIE-4.5-21B-A3B-Thinking-GGUF
Original model: https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking
Moonlight-16B-A3B-Instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Moonlight-16B-A3B-Instruct.
Original model: https://huggingface.co/moonshotai/Moonlight-16B-A3B-Instruct
Huihui-MoE-4.8B-A1.7B-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-MoE-4.8B-A1.7B-abliterated.
Model quantized with F16 GGUFs from: https://huggingface.co/DevQuasar/huihui-ai.Huihui-MoE-4.8B-A1.7B-abliterated-GGUF
Original model: https://huggingface.co/huihui-ai/Huihui-MoE-4.8B-A1.7B-abliterated
Pristine-8B-A1B-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Pristine-8B-A1B. An imatrix quantization is also included.
Original model: https://huggingface.co/qingy2024/Pristine-8B-A1B
Huihui-MoE-12B-A4B-abliterated-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Huihui-MoE-12B-A4B-abliterated.
Original model: https://huggingface.co/huihui-ai/Huihui-MoE-12B-A4B-abliterated
grok-2-MXFP4_MOE-GGUF
OLMoE-1B-7B-0125-Instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model OLMoE-1B-7B-0125-Instruct.
Model quantized with F16 GGUFs from: https://huggingface.co/DevQuasar/allenai.OLMoE-1B-7B-0125-Instruct-GGUF
Original model: https://huggingface.co/allenai/OLMoE-1B-7B-0125-Instruct
ERNIE-4.5-300B-A47B-PT-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model ERNIE-4.5-300B-A47B-PT.
Original model: https://huggingface.co/baidu/ERNIE-4.5-300B-A47B-PT
This model's GGUFs have been removed in order to conserve space in my repos. If you want them, just message me and I will make them available on demand.
Phi-3.5-MoE-instruct-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Phi-3.5-MoE-instruct.
Original model: https://huggingface.co/microsoft/Phi-3.5-MoE-instruct
DeepSeek-V3.1-Terminus-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model DeepSeek-V3.1-Terminus.
Model quantized with BF16 GGUFs from: https://huggingface.co/unsloth/DeepSeek-V3.1-Terminus-GGUF
Original model: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus
Llama3.2-ColdBrew-4x3B-Argon-test1-MXFP4_MOE-GGUF
Intern-S1-MXFP4_MOE-GGUF
This is a MXFP4MOE quantization of the model Intern-S1.
Original model: https://huggingface.co/internlm/Intern-S1