YOYO-AI
Qwen3-30B-A3B-CoderThinking-YOYO-linear-Q4_K_M-GGUF
YOYO-AI/Qwen3-30B-A3B-CoderThinking-YOYO-linear-Q4_K_M-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-CoderThinking-YOYO-linear` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo. Step 1: Clone llama.cpp from GitHub. Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
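The usage notes above condense into a short shell session. The `--hf-file` name below is an assumption based on GGUF-my-repo's lowercase naming convention; check the repo's file list before running:

```shell
# Install llama.cpp via Homebrew (macOS/Linux); building from source with
# LLAMA_CURL=1 (plus e.g. LLAMA_CUDA=1) is the alternative path described above.
brew install llama.cpp

# Run the quantized checkpoint straight from the Hugging Face Hub.
# The --hf-file value is a guess at GGUF-my-repo's file-naming scheme.
llama-cli \
  --hf-repo YOYO-AI/Qwen3-30B-A3B-CoderThinking-YOYO-linear-Q4_K_M-GGUF \
  --hf-file qwen3-30b-a3b-coderthinking-yoyo-linear-q4_k_m.gguf \
  -p "Write a binary search in Python."
```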
Qwen3-30B-A3B-Mixture-2507-Q4_K_M-GGUF
YOYO-AI/Qwen3-30B-A3B-Mixture-2507-Q4_K_M-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-Mixture-2507` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2-Fp32-mlx-fp16
YOYO-AI/Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2-Fp32-mlx-fp16 The model YOYO-AI/Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2-Fp32-mlx-fp16 was converted to MLX format from BasedBase/Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2-Fp32 using mlx-lm version 0.26.4.
Qwen3 30B A3B YOYO V4
> Leveraging our novel merging approach, we can seamlessly integrate instruction, reasoning, and code models into a single, high-performing unified model in just one step.

Model Highlights:

Parameter Settings:
> [!TIP]
> `Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`.

Problem Setting

Objective: merge $K$ fine-tuned models with identical tensor names and shapes into a single model whose parameters $\theta^{\star}$ lie at the robust center of the $K$ parameter sets.

Per-Tensor Formulation

For a given tensor name, each model provides a point $x_i \in \mathbb{R}^n$ (flattened). We seek a robust center $\theta^{\star} \in \mathbb{R}^n$.

Arithmetic Mean:

$$a = \frac{1}{K} \sum_{i=1}^{K} x_i$$

Coordinate-wise Median (computed elementwise across coordinates; robust, but it ignores the coupling between coordinates that determines vector magnitude):

$$m = \operatorname{median}(x_1, \ldots, x_K)$$

Centered Linear Average:

$$\theta^{(0)} = \frac{a + m}{2}$$

This blends efficiency and robustness without tuning, offering a strong seed for iterative robust estimators.

Objective Function:

$$\theta^{\star} = \arg\min_{\theta \in \mathbb{R}^n} \sum_{i=1}^{K} \|\theta - x_i\|_2$$

This is the multivariate analogue of the median (the geometric median), robust to outliers in the Euclidean geometry of parameters.

Weights:

$$w_i^{(t)} = \frac{1}{\max(\|\theta^{(t)} - x_i\|_2, \varepsilon)}$$

Iteration Step:

$$\theta^{(t+1)} = \frac{\sum_{i=1}^{K} w_i^{(t)} x_i}{\sum_{i=1}^{K} w_i^{(t)}}$$

Convergence Criterion: stop when the relative change is below $\varepsilon$:

$$\frac{\|\theta^{(t+1)} - \theta^{(t)}\|_2}{\max(\|\theta^{(t)}\|_2, 1)} \leq \varepsilon$$
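The per-tensor procedure above (centered-average seed, then Weiszfeld-style iterations toward the geometric median) can be sketched in NumPy. This is an illustrative implementation of the formulas, not YOYO-AI's released code:

```python
import numpy as np

def robust_center(xs, eps=1e-8, max_iters=100):
    """Geometric median of K flattened tensors via Weiszfeld iterations,
    seeded with the centered linear average (mean + median) / 2.
    Illustrative sketch of the formulation above, not YOYO-AI's code."""
    xs = np.stack(xs)                   # shape (K, n)
    a = xs.mean(axis=0)                 # arithmetic mean
    m = np.median(xs, axis=0)           # coordinate-wise median
    theta = (a + m) / 2.0               # centered linear average seed
    for _ in range(max_iters):
        d = np.linalg.norm(xs - theta, axis=1)   # ||theta - x_i||_2 per model
        w = 1.0 / np.maximum(d, eps)             # inverse-distance weights
        new_theta = (w[:, None] * xs).sum(axis=0) / w.sum()
        rel = np.linalg.norm(new_theta - theta) / max(np.linalg.norm(theta), 1.0)
        theta = new_theta
        if rel <= eps:                  # relative-change stopping rule
            break
    return theta
```

With an outlier among the inputs, the result stays near the majority cluster, unlike the arithmetic mean.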
Qwen3-30B-A3B-YOYO-V4-Q4_K_M-GGUF
YOYO-AI/Qwen3-30B-A3B-YOYO-V4-Q4_K_M-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-YOYO-V4` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32-mlx-fp16
YOYO-AI/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32-mlx-fp16 The model YOYO-AI/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32-mlx-fp16 was converted to MLX format from BasedBase/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32 using mlx-lm version 0.26.4.
Qwen3-30B-A3B-YOYO-V4-Q8_0-GGUF
YOYO-AI/Qwen3-30B-A3B-YOYO-V4-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-YOYO-V4` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
ZYH-LLM-Qwen2.5-14B-V5
Qwen3 30B A3B Deepseek Distill Instruct 2507
> Using the Arcee Fusion merging method, we transferred the knowledge of Deepseek-V3.1 from the distilled reasoning model to the instruction model. Model Highlights: - Highest precision: `dtype: float32` + `out_dtype: bfloat16` Parameter Settings: > [!TIP] > `Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`. Configuration: The following YAML configuration was used to produce this model:
Qwen3-8B-YOYO-V2-Hybrid
> Enhance the performance of Qwen3-8B by merging powerful reasoning models without compromising the effectiveness of the `/no_think` tag! Model Highlights: Parameter Settings: Thinking Mode: > [!NOTE] > `Temperature=0.6`, `TopP=0.95`, `TopK=20`, `MinP=0`. Non-Thinking Mode: > [!TIP] > `Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`. Step 1: Merge Two Hybrid Models - Leverage the advantages of the two hybrid models. Step 2: Merge High-Performance Reasoning Models with Hybrid Models - Maximize the proportion of reasoning models on the premise that the `/no_think` tag remains effective. Step 3: Unify the Enhanced Hybrid Modes - Merge the two models into the base model using the della merging method to make the model more versatile and stable. - We use the chat template of Qwen3-8B.
Qwen3 30B A3B CoderThinking YOYO Linear
- Highest precision: `dtype: float32` + `out_dtype: bfloat16` Parameter Settings: > [!NOTE] > `Temperature=0.6`, `TopP=0.95`, `TopK=20`, `MinP=0`. Configuration: The following YAML configuration was used to produce this model:
Qwen3-EZO-8B-YOYO-karcher-128K
Qwen3-30B-A3B-YOYO
Qwen3-30B-A3B-Mixture-2507
- Highest precision: `dtype: float32` + `out_dtype: bfloat16` Parameter Settings: > [!NOTE] > `Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`. Configuration: The following YAML configuration was used to produce this model:
Qwen2.5-14B-YOYO-Average
> We have used the Karcher merging method to average five of the most representative Qwen2.5 14B derivative models, in commemoration of the efforts made by the open-source community for the Qwen2.5 14B model. Merge Method: This model was merged using the Karcher Mean merge method. Models Merged: The following models were included in the merge: Qwen/Qwen2.5-14B-Instruct Qwen/Qwen2.5-14B-Instruct-1M arcee-ai/Virtuoso-Small-v2 deepcogito/cogito-v1-preview-qwen-14B deepseek-ai/DeepSeek-R1-Distill-Qwen-14B Configuration: The following YAML configuration was used to produce this model:
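For intuition, the Karcher (Riemannian) mean of unit-normalized vectors can be sketched as a log-map/exp-map fixed-point iteration on the sphere. This is a schematic of the underlying mathematics under the assumption of spherical geometry, not mergekit's actual `karcher` implementation:

```python
import numpy as np

def karcher_mean_sphere(xs, iters=50, tol=1e-10):
    """Karcher mean of unit vectors on the sphere: repeatedly average the
    points in the tangent space at the current estimate (log map), then
    step back onto the sphere (exp map). Schematic sketch only."""
    xs = np.stack([x / np.linalg.norm(x) for x in xs])
    mu = xs.mean(axis=0)                 # warm start (assumes a nonzero mean)
    mu /= np.linalg.norm(mu)
    for _ in range(iters):
        tangents = []
        for x in xs:
            c = np.clip(np.dot(mu, x), -1.0, 1.0)
            ang = np.arccos(c)           # geodesic distance from mu to x
            v = x - c * mu               # component of x orthogonal to mu
            nv = np.linalg.norm(v)
            tangents.append((ang / nv) * v if nv > 1e-12 else np.zeros_like(x))
        g = np.mean(tangents, axis=0)    # mean tangent vector (update step)
        ng = np.linalg.norm(g)
        if ng < tol:
            break
        mu = np.cos(ng) * mu + np.sin(ng) * (g / ng)  # exp map along geodesic
        mu /= np.linalg.norm(mu)
    return mu
```

For two orthogonal unit vectors the result is the midpoint of the connecting geodesic, i.e. their normalized average.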
Qwen2.5-14B-YOYO-V4-p2
License: Apache 2.0. Language: English.
Qwen3-EZO-8B-YOYO-slerp
Qwen2.5-14B-YOYO-V4-p1
License: Apache 2.0. Language: English.
Qwen3-EZO-8B-YOYO-slerp-128K
Qwen3-8B-YOYO-slerp
ZYH-LLM-Qwen2.5-14B-V4
This model is licensed under Apache 2.0 and supports the English language.
Qwen3-8B-YOYO-nuslerp-plus
Qwen3-8B-YOYO-nuslerp-plus-128K
Qwen3-EZO-8B-YOYO-nuslerp
Qwen3-EZO-8B-YOYO-nuslerp-plus
Qwen3-8B-YOYO-slerp-128K
Qwen3-EZO-8B-YOYO-nuslerp-128K
Qwen3-8B-YOYO-karcher
Qwen3-EZO-8B-YOYO-karcher
Qwen2.5-14B-YOYO-V4-p3
Qwen3-8B-YOYO
This is a merge of pre-trained language models created using mergekit. This model was merged using the DELLA merge method using Qwen/Qwen3-8B-Base as a base. The following models were included in the merge: Qwen/Qwen3-8B The following YAML configuration was used to produce this model:
Qwen3-14B-YOYO
Qwen3-8B-YOYO-nuslerp
QwQ-Coder-instruct
Qwen3-EZO-8B-YOYO-nuslerp-128K-Q8_0-GGUF
YOYO-O1-14B-V2
Combines the top open-source 14B reasoning and code models. This model was merged using the SCE merge method using arcee-ai/Virtuoso-Small-v2 as a base. The following models were included in the merge: deepcogito/cogito-v1-preview-qwen-14B Zhihu-ai/Zhi-Create-DSR1-14B agentica-org/DeepCoder-14B-Preview FractalAIResearch/Fathom-R1-14B The following YAML configuration was used to produce this model:
Qwen3-4B-YOYO
This is a merge of pre-trained language models created using mergekit. This model was merged using the DELLA merge method using Qwen/Qwen3-4B-Base as a base. The following models were included in the merge: Qwen/Qwen3-4B The following YAML configuration was used to produce this model:
Qwen3-8B-YOYO-nuslerp-128K
Qwen2.5-7B-it-restore
License: Apache 2.0. Language: English.
Qwen3-EZO-8B-YOYO-nuslerp-plus-128K
Qwen3-8B-YOYO-karcher-128K
ZYH-LLM-Qwen2.5-14B-V3-GGUF
Upgraded version: the fourth-generation model of ZYH-LLM-Qwen2.5 has been released! This is the third-generation model of the ZYH-LLM series. It employs many model-merging techniques, aiming to provide a powerful and unified 14-billion-parameter model that lays a solid foundation for further model merging and fine-tuning. imatrix quants: https://huggingface.co/mradermacher/ZYH-LLM-Qwen2.5-14B-V3-i1-GGUF
Qwen3-EZO-8B-YOYO-karcher-128K-Q8_0-GGUF
YOYO-AI/Qwen3-EZO-8B-YOYO-karcher-128K-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-EZO-8B-YOYO-karcher-128K` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-30B-A3B-YOYO-V3
Qwen3-EZO-8B-YOYO-nuslerp-plus-Q8_0-GGUF
YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp-plus-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp-plus` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen2.5-14B-YOYO-Average-Q8_0-GGUF
Qwen3-30B-A3B-YOYO-V3-Q4_K_M-GGUF
YOYO-AI/Qwen3-30B-A3B-YOYO-V3-Q4_K_M-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-YOYO-V3` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-8B-YOYO-nuslerp-plus-128K-Q8_0-GGUF
Qwen3-8B-YOYO-karcher-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-karcher-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-karcher` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
EZO-QwQ-32B-Q4_K_M-GGUF
Qwen3-EZO-8B-YOYO-nuslerp-plus-128K-Q8_0-GGUF
YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp-plus-128K-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-EZO-8B-YOYO-nuslerp-plus-128K` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
ZYH-LLM-Qwen2.5-14B-V2
This model is licensed under the Apache 2.0 license and supports the English language.
Qwen2.5-32B-YOYO-V2-Q4_K_M-GGUF
YOYO-AI/Qwen2.5-32B-YOYO-V2-Q4_K_M-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen2.5-32B-YOYO-V2` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-8B-YOYO-nuslerp-plus-Q8_0-GGUF
ZYH-LLM-Qwen2.5-14B-V3
This model is licensed under the Apache 2.0 license and supports the English language.
Qwen2.5-32B-YOYO-MIX-Q4_K_M-GGUF
Qwen3-8B-YOYO-nuslerp-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-nuslerp-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-nuslerp` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-8B-YOYO-slerp-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-slerp-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-slerp` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen2.5-14B-YOYO-V3
QwQ-instruct-32B-Q4_K_M-GGUF
Qwen3-8B-YOYO-nuslerp-128K-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-nuslerp-128K-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-nuslerp-128K` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-EZO-8B-YOYO-nuslerp-Q8_0-GGUF
Qwen3-EZO-8B-YOYO-slerp-Q8_0-GGUF
Qwen3-EZO-8B-YOYO-slerp-128K-Q8_0-GGUF
Qwen3-8B-YOYO-slerp-128K-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-slerp-128K-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-slerp-128K` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen2.5-14B-YOYO-V4
License: Apache 2.0. Language: English.
Qwen2.5-14B-YOYO
This is a merge of pre-trained language models created using mergekit. This model was merged using the DELLA merge method using Qwen/Qwen2.5-14B as a base. The following models were included in the merge: Qwen/Qwen2.5-14B-Instruct The following YAML configuration was used to produce this model:
Qwen2.5-14B-YOYO-V5
Qwen3-8B-YOYO-karcher-128K-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-karcher-128K-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-karcher-128K` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen2.5-Coder-3B-YOYO
Qwen2.5-14B-YOYO-super
Qwen3-30B-A3B-YOYO-V2
> This is the initial unified version of the Qwen3-30B-A3B series models. As more fine-tuned models emerge and merging methods are applied, we will further improve it. Stay tuned! Model Highlights: Parameter Settings: > [!TIP] > `Temperature=0.7`, `TopP=0.8`, `TopK=20`, `MinP=0`. Step 1: Merge Code Model with Instruction & Thinking Models Separately - Adopt the nuslerp method to improve model absorption rate. - Set a merging ratio of 9:1 to prevent capability degradation caused by an excessively high proportion of the code model. Step 2: Merge Code Instruction & Code Thinking Models into Base Model Together - Merge the two models into the base model using the della merging method to make the model more versatile and stable. - Since the merged model is more similar to the instruction model, we use the chat template of Qwen3-30B-A3B-Instruct-2507. Step 3: Further Extend Context Length - By referring to the `config_1m.json` of Qwen3-30B-A3B-Instruct-2507, we modified the `config.json` of the merged model and extended the maximum context length to 1M.
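A mergekit configuration for a Step2-style DELLA merge could look like the sketch below; every model path, weight, and density here is a hypothetical placeholder, not YOYO-AI's actual configuration:

```yaml
# Hypothetical sketch only: merge two Step1 outputs into the base model.
merge_method: della
base_model: Qwen/Qwen3-30B-A3B-Base
models:
  - model: ./code-instruct-nuslerp   # placeholder: Step1 instruction+code merge
    parameters:
      weight: 0.5
      density: 0.7
  - model: ./code-thinking-nuslerp   # placeholder: Step1 thinking+code merge
    parameters:
      weight: 0.5
      density: 0.7
dtype: float32
out_dtype: bfloat16
```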
Qwen2.5-7B-YOYO-super
Qwen2.5-14B-it-restore
License: Apache 2.0. Language: English.
YOYO-O1-32B
Qwen2.5-7B-YOYO
Qwen2.5-Coder-7B-YOYO
Qwen2.5-32B-YOYO-V2
QwQ-openhands-coder-32B
Qwen2.5-14B-YOYO-V6-test2-Q8_0-GGUF
YOYO-AI/Qwen2.5-14B-YOYO-V6-test2-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen2.5-14B-YOYO-V6-test2` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-EZO-8B-YOYO-karcher-Q8_0-GGUF
YOYO-AI/Qwen3-EZO-8B-YOYO-karcher-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-EZO-8B-YOYO-karcher` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-8B-YOYO-V2-Hybrid-Q8_0-GGUF
YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
Qwen3-30B-A3B-YOYO-V3-Q8_0-GGUF
YOYO-AI/Qwen3-30B-A3B-YOYO-V3-Q8_0-GGUF This model was converted to GGUF format from `YOYO-AI/Qwen3-30B-A3B-YOYO-V3` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
YOYO-O1-32B-V2
Qwen2.5-Coder-1.5B-YOYO
Qwen2.5-0.5B-YOYO
YOYO-O1-32B-V3
Qwen2.5-14B-Fusion-Q8_0-GGUF
Qwen2.5-14B-YOYO-V6-test2
This is a merge of pre-trained language models created using mergekit. This model was merged using the Karcher Mean merge method using mergekit-community/Qwen2.5-14B-della-1M-dpo as a base. The following models were included in the merge: mergekit-community/Qwen2.5-14B-della-V6-dpo mergekit-community/Qwen2.5-14B-della-Nova-dpo agentica-org/DeepCoder-14B-Preview mergekit-community/Qwen2.5-14B-della-base-dpo Zhihu-ai/Zhi-writing-dsr1-14b mergekit-community/Qwen2.5-14B-della-v2-dpo mergekit-community/Qwen2.5-14B-della-code The following YAML configuration was used to produce this model:
Qwen2.5-32B-YOYO-reasoning-v3-Q4_K_M-GGUF
Qwen3-30B-A3B-YOYO-V2-Q8_0-GGUF
Qwen2.5-Coder-32B-YOYO
Qwen2.5-14B-YOYO-V2
Qwen2.5-3B-YOYO
Qwen2.5-1.5B-YOYO
Qwen3-30B-A3B-YOYO-V6
Qwen2.5-32B-YOYO
EZO-QwQ-32B
Qwen2.5-14B-YOYO-GGUF
ZYH-LLM-Qwen2.5-14B-V2-GGUF
Qwen2.5-32B-YOYO-reasoning
This is a merge of pre-trained language models created using mergekit. This model was merged using the Model Stock merge method using Qwen/Qwen2.5-32B-Instruct as a base. The following models were included in the merge: maldv/Awqward2.5-32B-Instruct qihoo360/Light-R1-32B Rombo-Org/Rombo-LLM-V3.0-Qwen-32b maldv/Qwentile2.5-32B-Instruct Qwen/QwQ-32B OpenPipe/Deductive-Reasoning-Qwen-32B The following YAML configuration was used to produce this model:
QwQ-Sky-T1-Med-32B
QwQ-coder-32B-plus
Qwen2.5-32B-YOYO-stock
QwQ-coder-32B
This is a merge of pre-trained language models created using mergekit. This model was merged using the SCE merge method using Qwen/Qwen2.5-Coder-32B as a base. The following models were included in the merge: Qwen/QwQ-32B Qwen/Qwen2.5-Coder-32B-Instruct The following YAML configuration was used to produce this model:
ZYH-LLM-Qwen2.5-14B
This model is licensed under the Apache 2.0 license and supports the English language.
Qwen2.5-Coder-14B-YOYO
This is a merge of pre-trained language models created using mergekit. This model was merged using the DELLA merge method using Qwen/Qwen2.5-Coder-14B as a base. The following models were included in the merge: Qwen/Qwen2.5-Coder-14B-Instruct The following YAML configuration was used to produce this model:
Qwen2.5-32B-YOYO-MIX
YOYO-O1-14B
Qwen2.5-14B-YOYO-stock
Qwen3-30B-A3B-YOYO-Thinking-Chimera
QwQ-instruct-32B
Qwen2.5-72B-YOYO
Qwen2.5-Coder-0.5B-YOYO
QwQ-32B-YOYO
Qwen2.5-14B-Fusion
Qwen2.5-14B-Fusion-1M
Qwen2.5-14B-YOYO-V1.5
QwQ-Olympic-coder-32B
Qwen2.5-14B-YOYO-karcher-test
From the preliminary test results, the effect is excellent! This is a very promising method for model merging. The current optimal formula ratio is instruction : reasoning : code = 6 : 2 : 1. This is a merge of pre-trained language models created using mergekit. This model was merged using the Karcher Mean merge method using Qwen/Qwen2.5-14B-Instruct as a base. Instruction (6): Qwen/Qwen2.5-14B-Instruct Qwen/Qwen2.5-14B-Instruct-1M huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2 SicariusSicariiStuff/Impish_QWEN_14B-1M tanliboy/lambda-qwen2.5-14b-dpo-test Reasoning (2): Zhihu-ai/Zhi-writing-dsr1-14b deepseek-ai/DeepSeek-R1-Distill-Qwen-14B The following YAML configuration was used to produce this model: