theprint

126 models

ReWiz-Nemo-12B-Instruct-GGUF

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 1,757 downloads • 0 likes

GeneralChat-Llama3.2-3B-DPO-GGUF

base_model:theprint/GeneralChat-Llama3.2-3B-DPO • 1,664 downloads • 0 likes

theprint-moe-8x3-0126-GGUF

llama • 591 downloads • 0 likes

Llama-3-8B-Lexi-Smaug-Uncensored

Orenguteng/Llama-3-8B-Lexi-Uncensored • 459 downloads • 4 likes

ReWiz-Llama-3.2-3B

llama • 455 downloads • 3 likes

MathTutor-7B-GGUF

Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning.

license:mit • 352 downloads • 0 likes

Genuine-Gemma3-12B-GGUF

- Developed by: theprint
- Finetuned from model: unsloth/gemma-3-12b-it-unsloth-bnb-4bit

This Gemma 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:mit • 299 downloads • 0 likes

CreativeWriter-Llama3.2-3B-GGUF

base_model:theprint/CreativeWriter-Llama3.2-3B • 290 downloads • 0 likes

PositiveDetox-Qwen2.5-14B-GGUF

This model was fine-tuned on the Positive Detox dataset, designed specifically to reduce or eliminate toxic positivity in AI responses.

license:mit • 279 downloads • 0 likes

ReWiz-Llama-3.1-8B-v2

llama • 253 downloads • 1 like

ReWiz-Qwen-2.5-14B

license:apache-2.0 • 235 downloads • 5 likes

TextSynth-8B-GGUF

llama • 232 downloads • 1 like

Code-Llama-Bagel-8B

llama • 218 downloads • 1 like

CleverBoi-Llama-3.1-8B-Instruct

llama • 206 downloads • 1 like

PositiveDetox-Qwen3-4B-GGUF

license:mit • 206 downloads • 0 likes

phi-3-mini-4k-python

license:apache-2.0 • 180 downloads • 1 like

CleverBoi-7B-v3

license:apache-2.0 • 180 downloads • 0 likes

Coma-7B-GGUF

license:apache-2.0 • 178 downloads • 0 likes

ReWiz-7B

license:apache-2.0 • 164 downloads • 0 likes

Coma-3B-GGUF

license:apache-2.0 • 160 downloads • 0 likes

CleverBoi-Nemo-12B-v2

license:apache-2.0 • 136 downloads • 4 likes

ReWiz-Qwen2.5-7B

Prompt format:

- Before both User and Assistant: `\n\n`
- Before User: `### Instruction:\n`
- Before Assistant: `### Response:\n`

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
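To make the template concrete, here is a minimal sketch (not from the model card; the example instruction is illustrative) of assembling a prompt in this format:

```python
def build_prompt(instruction: str) -> str:
    # Alpaca-style layout described above: a blank line before each turn,
    # "### Instruction:" before the user text, "### Response:" before the model's.
    return (
        "\n\n### Instruction:\n"
        + instruction
        + "\n\n### Response:\n"
    )

print(build_prompt("List three uses for a brick."))
```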

license:apache-2.0 • 129 downloads • 0 likes

Conversely-Mistral-7B

license:apache-2.0 • 125 downloads • 0 likes

ReWiz-Llama-3.1-8B

llama • 121 downloads • 1 like

CleverBoi-Gemma-2-9B-v2

license:apache-2.0 • 120 downloads • 0 likes

Boptruth-NeuralMonarch-7B

license:apache-2.0 • 117 downloads • 2 likes

CleverBoi-Llama-3.1-8B-v2

llama • 116 downloads • 0 likes

CleverBoi-Llama-3.2-3B-Instruct

llama • 113 downloads • 0 likes

TextSynth-Phi4-14B-GGUF

llama • 110 downloads • 0 likes

CleverBoi-Llama-3.1-8B-Python

llama • 109 downloads • 0 likes

ThinkMix-Qwen2.5-1.5B-GGUF

llama.cpp • 107 downloads • 0 likes

MathTutor-7B

Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning. Various GGUF quants of this model are available at theprint/MathTutor-7B-GGUF.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
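For downstream use, a hedged parsing sketch: the XML-style delimiters (`<think>`, `<answer>`) are an assumption, since the card only names the tags.

```python
import re

def split_reasoning(text: str):
    # Assumes XML-style tags; the model card names `think` and `answer`
    # but does not spell out the exact delimiters.
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else text.strip(),  # fall back to raw text
    )

reasoning, final = split_reasoning(
    "<think>4 * 7 = 28, then 28 + 2 = 30.</think><answer>30</answer>"
)
print(final)  # -> 30
```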

license:apache-2.0 • 102 downloads • 1 like

Llama-3.2-3B-VanRossum

llama • 94 downloads • 0 likes

Empathetic-Llama-3.2-3B-Instruct-GGUF

Quantized GGUF versions of Empathetic-Llama-3.2-3B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuned model: theprint/Empathetic-Llama-3.2-3B-Instruct
- Quantized by: theprint

- `Empathetic-Llama-3.2-3B-Instruct-f16.gguf` (6135.6 MB) - 16-bit float (original precision, largest file)
- `Empathetic-Llama-3.2-3B-Instruct-q3km.gguf` (1609.0 MB) - 3-bit quantization (medium quality)
- `Empathetic-Llama-3.2-3B-Instruct-q4km.gguf` (1925.8 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Empathetic-Llama-3.2-3B-Instruct-q5km.gguf` (2214.6 MB) - 5-bit quantization (medium, good quality)
- `Empathetic-Llama-3.2-3B-Instruct-q6k.gguf` (2521.4 MB) - 6-bit quantization (high quality)
- `Empathetic-Llama-3.2-3B-Instruct-q80.gguf` (3263.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`
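As an illustration (not part of the card), the recommended `q4km` file can be loaded with the llama-cpp-python bindings roughly like this; the local path and generation settings are assumptions:

```python
from llama_cpp import Llama

# Point at the locally downloaded q4km quant listed above.
llm = Llama(
    model_path="Empathetic-Llama-3.2-3B-Instruct-q4km.gguf",
    n_ctx=4096,  # context window; tune to your hardware
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "I bombed a job interview today."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```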

llama.cpp • 91 downloads • 0 likes

ReWiz-Gemma2-9B

license:apache-2.0 • 87 downloads • 1 like

Hemispheres-7B-Final

license:apache-2.0 • 86 downloads • 0 likes

CleverBoi-7B-v2

license:apache-2.0 • 83 downloads • 0 likes

mistral-7b-cthulhu

license:apache-2.0 • 80 downloads • 2 likes

Hemispheres-Qwen2.5-3B-Right

license:apache-2.0 • 78 downloads • 0 likes

Llama-CP1-8B-GGUF

llama • 75 downloads • 0 likes

CleverBoi-Nemo-12B

license:apache-2.0 • 73 downloads • 0 likes

Hemispheres-Llama3.2-3B-Right

llama • 71 downloads • 0 likes

tinyllama_alpaca_cthulhu_small

llama • 69 downloads • 1 like

CleverBoi-Gemma-2-9B

license:apache-2.0 • 69 downloads • 1 like

VanRossum-Qwen2.5-Coder-3B

This model has been trained for 1 epoch on the VanRossum dataset. The VanRossum dataset is all Python! I used DataMix to combine a handful of highly rated Python-centric datasets, to get a sampling of each and create something new. The dataset has 80,000 entries and is named after Guido van Rossum, who created Python back in 1991. See the VanRossum Collection on HF for all things related to this dataset. There are 2 versions of the dataset available on Hugging Face.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Qwen2.5-Coder-3B-Instruct-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
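A hedged sketch of pulling the dataset with the `datasets` library; the repo id `theprint/VanRossum` is an assumption, since the card only points at the VanRossum Collection (which holds two versions):

```python
from datasets import load_dataset

# Hypothetical repo id -- check the VanRossum Collection on Hugging Face
# for the exact names of the two published versions.
ds = load_dataset("theprint/VanRossum", split="train")

print(len(ds))  # the card quotes 80,000 entries
print(ds[0])    # inspect one Python-centric example
```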

license:apache-2.0 • 69 downloads • 0 likes

CleverBoi-Mistral-0.3-7B

license:apache-2.0 • 64 downloads • 1 like

Genuine-Gemma3-12B

license:apache-2.0 • 62 downloads • 1 like

ReWiz-Llama-3.2-1B

Half the data was geared toward better reasoning (EvolKit-20k and reasoning-base-20k); the other half helps de-censor the model (the WizardLM dataset).

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-1b-instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

llama • 61 downloads • 0 likes

Boptruth-Agatha-7B

60 downloads • 0 likes

GeneralChat-Llama3.2-3B-DPO

llama • 59 downloads • 0 likes

PositiveDetox-Qwen2.5-14B

license:apache-2.0 • 58 downloads • 1 like

PyRe-Llama8.1-8B-GGUF

llama • 55 downloads • 1 like

ReWiz-Worldbuilder-7B-GGUF

license:apache-2.0 • 55 downloads • 0 likes

Hemispheres-Llama3.2-3B-Combiner

llama • 54 downloads • 0 likes

Gemma3-Python-22k-1B-GGUF

Quantized GGUF versions of Gemma3-Python-22k-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Gemma3-Python-22k-1B
- Quantized by: theprint

- `Gemma3-Python-22k-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Gemma3-Python-22k-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Gemma3-Python-22k-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Gemma3-Python-22k-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Gemma3-Python-22k-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Gemma3-Python-22k-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 54 downloads • 0 likes

DevilsAdvocate-1B-GGUF

Quantized GGUF versions of DevilsAdvocate-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/DevilsAdvocate-1B
- Quantized by: theprint

- `DevilsAdvocate-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 54 downloads • 0 likes

MLF-Llama3.2-3B

llama • 52 downloads • 0 likes

TiTan-Llama-3.2-1B-GGUF

Quantized GGUF versions of TiTan-Llama-3.2-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: unsloth/Llama-3.2-1B
- Fine-tuned model: theprint/TiTan-Llama-3.2-1B
- Quantized by: theprint

- `TiTan-Llama-3.2-1B-f16.gguf` (2364.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Llama-3.2-1B-q3km.gguf` (658.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Llama-3.2-1B-q4km.gguf` (770.3 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Llama-3.2-1B-q5km.gguf` (869.3 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Llama-3.2-1B-q6k.gguf` (974.5 MB) - 6-bit quantization (high quality)
- `TiTan-Llama-3.2-1B-q80.gguf` (1259.9 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 52 downloads • 0 likes

Hemispheres-Llama3.2-3B-Left

llama • 51 downloads • 0 likes

Tom-Qwen-7B-Instruct

license:apache-2.0 • 50 downloads • 1 like

phi-3-mini-4k-gamedev

license:apache-2.0 • 50 downloads • 0 likes

Hemispheres-7B-Combiner

license:apache-2.0 • 50 downloads • 0 likes

Hemispheres-Qwen2.5-1.5B-Combo

license:apache-2.0 • 50 downloads • 0 likes

CodeThink-8B-GRPO-GGUF

llama • 50 downloads • 0 likes

ReWiz-Phi-4-14B-GGUF

llama • 49 downloads • 0 likes

Pythonified-Llama-3.2-3B-Instruct-GGUF

llama.cpp • 49 downloads • 0 likes

TiTan-Qwen2.5-0.5B-GGUF

llama.cpp • 48 downloads • 3 likes

WorldBuilder-12B-GGUF

license:apache-2.0 • 47 downloads • 0 likes

Hemispheres-7B-Left

license:apache-2.0 • 47 downloads • 0 likes

Hemispheres-v0.2-Llama3.2-3B-Final

llama • 47 downloads • 0 likes

TiTan-Gemma3-0.27B-GGUF

Quantized GGUF versions of TiTan-Gemma3-0.27B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-270m-it
- Fine-tuned model: theprint/TiTan-Gemma3-0.27B
- Quantized by: theprint

- `TiTan-Gemma3-0.27B-f16.gguf` (837.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-0.27B-q3km.gguf` (320.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-0.27B-q4km.gguf` (351.4 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-0.27B-q5km.gguf` (368.0 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-0.27B-q6k.gguf` (439.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-0.27B-q80.gguf` (448.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 46 downloads • 1 like

TextSynth-3B-GGUF

llama • 45 downloads • 0 likes

Conversely-Llama3.1-8B

llama • 44 downloads • 1 like

TiTan-Gemma3-4B-GGUF

Quantized GGUF versions of TiTan-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/TiTan-Gemma3-4B
- Quantized by: theprint

- `TiTan-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 43 downloads • 0 likes

Nerdish-Llama-3.1-8B

llama • 42 downloads • 0 likes

Hemispheres-Llama3.2-3B-Final

llama • 40 downloads • 0 likes

Hemispheres-v0.2-Llama3.2-3B-Combo

llama • 40 downloads • 0 likes

Zeth-Gemma3-4B-GGUF

Quantized GGUF versions of Zeth-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/Zeth-Gemma3-4B
- Quantized by: theprint

- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 40 downloads • 0 likes

CreativeWriter-Llama3.2-3B

llama • 37 downloads • 0 likes

TiTan-Gemma3-1B-GGUF

Quantized GGUF versions of TiTan-Gemma3-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/TiTan-Gemma3-1B
- Quantized by: theprint

- `TiTan-Gemma3-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 35 downloads • 0 likes

tinyllama-Evol-Instruct

llama • 34 downloads • 0 likes

PositiveDetox-Qwen3-4B

license:apache-2.0 • 31 downloads • 1 like

WorldBuilder-7B-GGUF

license:apache-2.0 • 30 downloads • 0 likes

Rewiz-Tom-7B

A fine-tuned 7B parameter model specialized in reasoning (Rewiz), based on a model that was already fine-tuned for step-by-step instruction and conversation (Tom). This model is a fine-tuned version of theprint/Tom-Qwen-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: theprint/Tom-Qwen-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following. The ReWiz dataset is a curated mix of 20,000 reasoning-based entries.

Training details (sketched in code below):

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:

- `Rewiz-Tom-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Rewiz-Tom-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Rewiz-Tom-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Rewiz-Tom-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Rewiz-Tom-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Rewiz-Tom-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

- Base model: theprint/Tom-Qwen-7B-Instruct
- Training dataset: theprint/ReWiz
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp
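A rough sketch of the recipe above, using Unsloth with TRL's `SFTTrainer`; the LoRA alpha, target modules, and dataset text field are assumptions (and the `SFTTrainer` keyword set varies across trl versions), so treat this as an outline rather than the author's actual script:

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit for memory-efficient LoRA training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA rank 128 per the card; alpha and target modules are assumed.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("theprint/ReWiz", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumed field name
    args=TrainingArguments(
        per_device_train_batch_size=4,  # card: batch size 4
        learning_rate=2e-4,             # card: 0.0002
        num_train_epochs=2,             # card: 2 epochs
        output_dir="rewiz-tom-7b-lora",
    ),
)
trainer.train()
```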

license:apache-2.0 • 28 downloads • 1 like

Hemispheres-v0.2-Gemma2-9B-Combo

license:apache-2.0 • 28 downloads • 0 likes

Math-Coma-7B

The theprint/MathTutor-7B model, further fine-tuned on natural reasoning using GRPO. This is an experimental model and likely to hallucinate.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: theprint/MathTutor-7B

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 27 downloads • 1 like

Hemispheres-7B-Right

license:apache-2.0 • 27 downloads • 0 likes

RRT1-3B

license:apache-2.0 • 26 downloads • 0 likes

ThinkMix-Gemma3-4B-GRPO-GGUF

license:apache-2.0 • 25 downloads • 0 likes

Llama3.1-8B-CodeThink-GGUF

llama • 24 downloads • 0 likes

DevilsAdvocate-7B-GGUF

Quantized GGUF versions of DevilsAdvocate-7B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/DevilsAdvocate-7B
- Quantized by: theprint

- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 23 downloads • 0 likes

CleverBoi-Phi-3.5-mini-instruct

llama • 22 downloads • 0 likes

Mistral-7b-Instruct-v0.2-python-18k

license:apache-2.0 • 21 downloads • 0 likes

Genuine-7B-Instruct-GGUF

Quantized GGUF versions of Genuine-7B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/Genuine-7B-Instruct
- Quantized by: theprint

- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 21 downloads • 0 likes

Rewiz-Gemma3-1B-GGUF

llama.cpp • 20 downloads • 0 likes

Genuine-Zeth-4B-GGUF

llama.cpp • 19 downloads • 0 likes

Genuine-1B-GGUF

Quantized GGUF versions of Genuine-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Genuine-1B
- Quantized by: theprint

- `Genuine-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Genuine-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Genuine-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Genuine-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Genuine-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 18 downloads • 0 likes

Qwen2.5-14B-GameDev-GGUF

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-14b-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 17 downloads • 0 likes

ReasonableMath-Llama-3.2-3B-Instruct-GGUF

llama.cpp • 17 downloads • 0 likes

Hemispheres-Llama3.1-8B-Combo

llama • 16 downloads • 0 likes

RuDolph-Hermes-7B-Q6_K-GGUF

llama-cpp • 14 downloads • 0 likes

DevilsAdvocate-8B-GGUF

Quantized GGUF versions of DevilsAdvocate-8B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen3-8B
- Fine-tuned model: theprint/DevilsAdvocate-8B
- Quantized by: theprint

- `DevilsAdvocate-8B-f16.gguf` (15628.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-8B-q3km.gguf` (3933.1 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-8B-q4km.gguf` (4794.9 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-8B-q5km.gguf` (5580.1 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-8B-q6k.gguf` (6414.3 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-8B-q80.gguf` (8306.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 13 downloads • 0 likes

PyRe-3B-v1-GGUF

license:apache-2.0 • 11 downloads • 0 likes

PyRe-3B-v2-GGUF

Please note that this model is a WIP experiment in GRPO fine-tuning on Python code problems for reasoning. The performance of this model varies greatly depending on task, prompt and parameters. I recommend a very low temperature, like 0.1. You may also see more consistent results by encouraging the use of `think` and `answer` tags in the system prompt.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
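To illustrate those recommendations (the temperature comes from the card; the file name, tag delimiters, and system-prompt wording are assumptions):

```python
from llama_cpp import Llama

llm = Llama(model_path="PyRe-3B-v2-q4km.gguf", n_ctx=4096)  # file name assumed

out = llm.create_chat_completion(
    messages=[
        # Encourage the tagged reasoning format, as the card suggests.
        {"role": "system",
         "content": "Reason step by step inside <think> tags, then give the "
                    "final code inside <answer> tags."},
        {"role": "user",
         "content": "Write a Python function that checks whether a string is a palindrome."},
    ],
    temperature=0.1,  # the card recommends a very low temperature
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```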

llama • 10 downloads • 0 likes

Coma-7B

license:apache-2.0 • 10 downloads • 0 likes

TextSynth-Gemma3-12B

license:apache-2.0 • 7 downloads • 0 likes

RuDolph-Hermes-7B-Q4_K_S-GGUF

llama-cpp • 6 downloads • 0 likes

Hemispheres-Qwen2.5-3B-Left

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 6 downloads • 0 likes

Coma-3B

Coma is based on Qwen 2.5 3B, GRPO fine-tuned on the natural reasoning dataset from Meta. Quantized versions are available at theprint/Coma-3B-GGUF in GGUF format. Testing was done with a dedicated system prompt at `temperature=1.0`, `topk=45` and `topp=0.95`.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 5 downloads • 0 likes

ReWiz-Worldbuilder-7B

3 downloads • 0 likes

Genuine-Phi4

This is a Phi-4 (14B) model, fine-tuned for more engaging conversation: it limits sycophancy and encourages the model to (gently) push back and call out bad ideas.

Intended use: brainstorming, idea development, general conversation.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/phi-4-unsloth-bnb-4bit

This llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

llama • 3 downloads • 0 likes

Genuine-Phi4-Q4_K_M-GGUF

llama • 3 downloads • 0 likes

ThinkMix-Gemma3-4B

license:apache-2.0 • 2 downloads • 1 like

WorldBuilder-7B

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/mistral-7b-v0.3-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 2 downloads • 1 like

RuDolph-Hermes-7B

2 downloads • 0 likes

ReWiz-Nemo-12B-Instruct

license:apache-2.0 • 1 download • 2 likes

WorldBuilder-12B

license:apache-2.0 • 1 download • 0 likes

PyRe-3B-v1

license:apache-2.0 • 1 download • 0 likes

TextSynth-8B

llama • 1 download • 0 likes

Gemma3-CP1-12B

license:apache-2.0 • 1 download • 0 likes

TiTan-Qwen2.5-0.5B

license:apache-2.0 • 0 downloads • 4 likes

PyRe-Llama8.1-8B

llama • 0 downloads • 1 like

Llama3.1-8B-CodeThink-16bit

llama • 0 downloads • 1 like

Zeth-Gemma3-4B

A fine-tuned Gemma3 4B model, specialized in pragmatic empathy, or perhaps it is empathic pragmatism? This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: google/gemma-3-4b-it
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following.

Quantized GGUF versions are available at theprint/Zeth-Gemma3-4B-GGUF:

- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

The Zeth dataset was specifically created for fine-tuning models on empathic explanation. This was done by taking premade datasets and rewording the replies to be in line with the style for Zeth.

- Training epochs: 3
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: google/gemma-3-4b-it
- Training dataset: theprint/Zeth
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:apache-2.0 • 0 downloads • 1 like

Genuine-7B-Instruct

A fine-tuned Qwen 2.5 7B Instruct model, tuned for more engaging conversation with fewer sycophantic responses. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation.

Quantized GGUF versions are available in the theprint/Genuine-7B-Instruct-GGUF repo:

- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy in language models, encouraging the models to (gently) push back and call out bad ideas.

- Dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Format: alpaca
- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 6
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:apache-2.0 • 0 downloads • 1 like

DevilsAdvocate-7B

A fine-tuned Qwen 2.5 7B model, tuned for more engaging conversation that encourages the user to consider different aspects of an idea. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: mit
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation.

Quantized GGUF versions are available in the theprint/DevilsAdvocate-7B-GGUF repo:

- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy and encourage gentle pushback in language models.

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 1
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Advocate-9.4k
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:mit • 0 downloads • 1 like