theprint

126 models

ReWiz-Nemo-12B-Instruct-GGUF

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 1,757 downloads • 0 likes

GeneralChat-Llama3.2-3B-DPO-GGUF

base_model:theprint/GeneralChat-Llama3.2-3B-DPO • 1,664 downloads • 0 likes

theprint-moe-8x3-0126-GGUF

llama • 591 downloads • 0 likes

Llama-3-8B-Lexi-Smaug-Uncensored

Orenguteng/Llama-3-8B-Lexi-Uncensored • 459 downloads • 4 likes

ReWiz-Llama-3.2-3B

llama • 455 downloads • 3 likes

MathTutor-7B-GGUF

Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning.

license:mit • 352 downloads • 0 likes

Genuine-Gemma3-12B-GGUF

- Developed by: theprint
- Finetuned from model: unsloth/gemma-3-12b-it-unsloth-bnb-4bit

This Gemma 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:mit • 299 downloads • 0 likes

CreativeWriter-Llama3.2-3B-GGUF

base_model:theprint/CreativeWriter-Llama3.2-3B • 290 downloads • 0 likes

PositiveDetox-Qwen2.5-14B-GGUF

This model was fine-tuned on the Positive Detox dataset, designed specifically to reduce or eliminate toxic positivity in AI responses.

license:mit • 279 downloads • 0 likes

ReWiz-Llama-3.1-8B-v2

llama • 253 downloads • 1 like

ReWiz-Qwen-2.5-14B

license:apache-2.0 • 235 downloads • 5 likes

TextSynth-8B-GGUF

llama • 232 downloads • 1 like

Code-Llama-Bagel-8B

llama • 218 downloads • 1 like

CleverBoi-Llama-3.1-8B-Instruct

llama • 206 downloads • 1 like

PositiveDetox-Qwen3-4B-GGUF

license:mit • 206 downloads • 0 likes

phi-3-mini-4k-python

license:apache-2.0 • 180 downloads • 1 like

CleverBoi-7B-v3

license:apache-2.0 • 180 downloads • 0 likes

Coma-7B-GGUF

license:apache-2.0 • 178 downloads • 0 likes

ReWiz-7B

license:apache-2.0 • 164 downloads • 0 likes

Coma-3B-GGUF

license:apache-2.0 • 160 downloads • 0 likes

CleverBoi-Nemo-12B-v2

license:apache-2.0 • 136 downloads • 4 likes

ReWiz-Qwen2.5-7B

Prompt format:

- Before both User and Assistant: `\n\n`
- Before User: `### Instruction:\n`
- Before Assistant: `### Response:\n`

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
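To make the template concrete, here is a minimal sketch (not from the model card; the example instruction is illustrative) of assembling a prompt in this format:

```python
def build_prompt(instruction: str) -> str:
    # Alpaca-style layout described above: a blank line before each turn,
    # "### Instruction:" before the user text, "### Response:" before the model's.
    return (
        "\n\n### Instruction:\n"
        + instruction
        + "\n\n### Response:\n"
    )

print(build_prompt("List three uses for a brick."))
```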

license:apache-2.0 • 129 downloads • 0 likes

Conversely-Mistral-7B

license:apache-2.0 • 125 downloads • 0 likes

ReWiz-Llama-3.1-8B

llama • 121 downloads • 1 like

CleverBoi-Gemma-2-9B-v2

license:apache-2.0 • 120 downloads • 0 likes

Boptruth-NeuralMonarch-7B

license:apache-2.0 • 117 downloads • 2 likes

CleverBoi-Llama-3.1-8B-v2

llama • 116 downloads • 0 likes

CleverBoi-Llama-3.2-3B-Instruct

llama • 113 downloads • 0 likes

TextSynth-Phi4-14B-GGUF

llama • 110 downloads • 0 likes

CleverBoi-Llama-3.1-8B-Python

llama • 109 downloads • 0 likes

ThinkMix-Qwen2.5-1.5B-GGUF

llama.cpp • 107 downloads • 0 likes

MathTutor-7B

Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning. Various GGUF quants of this model are available at theprint/MathTutor-7B-GGUF.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
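For downstream use, a hedged parsing sketch: the XML-style delimiters (`<think>`, `<answer>`) are an assumption, since the card only names the tags.

```python
import re

def split_reasoning(text: str):
    # Assumes XML-style tags; the model card names `think` and `answer`
    # but does not spell out the exact delimiters.
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else text.strip(),  # fall back to raw text
    )

reasoning, final = split_reasoning(
    "<think>4 * 7 = 28, then 28 + 2 = 30.</think><answer>30</answer>"
)
print(final)  # -> 30
```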

license:apache-2.0 • 102 downloads • 1 like

Llama-3.2-3B-VanRossum

llama • 94 downloads • 0 likes

Empathetic-Llama-3.2-3B-Instruct-GGUF

Quantized GGUF versions of Empathetic-Llama-3.2-3B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuned model: theprint/Empathetic-Llama-3.2-3B-Instruct
- Quantized by: theprint

- `Empathetic-Llama-3.2-3B-Instruct-f16.gguf` (6135.6 MB) - 16-bit float (original precision, largest file)
- `Empathetic-Llama-3.2-3B-Instruct-q3km.gguf` (1609.0 MB) - 3-bit quantization (medium quality)
- `Empathetic-Llama-3.2-3B-Instruct-q4km.gguf` (1925.8 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Empathetic-Llama-3.2-3B-Instruct-q5km.gguf` (2214.6 MB) - 5-bit quantization (medium, good quality)
- `Empathetic-Llama-3.2-3B-Instruct-q6k.gguf` (2521.4 MB) - 6-bit quantization (high quality)
- `Empathetic-Llama-3.2-3B-Instruct-q80.gguf` (3263.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`
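As an illustration (not part of the card), the recommended `q4km` file can be loaded with the llama-cpp-python bindings roughly like this; the local path and generation settings are assumptions:

```python
from llama_cpp import Llama

# Point at the locally downloaded q4km quant listed above.
llm = Llama(
    model_path="Empathetic-Llama-3.2-3B-Instruct-q4km.gguf",
    n_ctx=4096,  # context window; tune to your hardware
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "I bombed a job interview today."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```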

llama.cpp • 91 downloads • 0 likes

ReWiz-Gemma2-9B

license:apache-2.0 • 87 downloads • 1 like

Hemispheres-7B-Final

license:apache-2.0 • 86 downloads • 0 likes

CleverBoi-7B-v2

license:apache-2.0 • 83 downloads • 0 likes

mistral-7b-cthulhu

license:apache-2.0 • 80 downloads • 2 likes

Hemispheres-Qwen2.5-3B-Right

license:apache-2.0 • 78 downloads • 0 likes

Llama-CP1-8B-GGUF

llama • 75 downloads • 0 likes

CleverBoi-Nemo-12B

license:apache-2.0 • 73 downloads • 0 likes

Hemispheres-Llama3.2-3B-Right

llama • 71 downloads • 0 likes

tinyllama_alpaca_cthulhu_small

llama • 69 downloads • 1 like

CleverBoi-Gemma-2-9B

license:apache-2.0 • 69 downloads • 1 like

VanRossum-Qwen2.5-Coder-3B

This model has been trained for 1 epoch on the VanRossum dataset. The VanRossum dataset is all Python! I used DataMix to combine a handful of highly rated Python-centric datasets, to get a sampling of each and create something new. The dataset has 80,000 entries and is named after Guido van Rossum, who created Python back in 1991. See the VanRossum Collection on HF for all things related to this dataset. There are 2 versions of the dataset available on Hugging Face.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Qwen2.5-Coder-3B-Instruct-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
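A hedged sketch of pulling the dataset with the `datasets` library; the repo id `theprint/VanRossum` is an assumption, since the card only points at the VanRossum Collection (which holds two versions):

```python
from datasets import load_dataset

# Hypothetical repo id -- check the VanRossum Collection on Hugging Face
# for the exact names of the two published versions.
ds = load_dataset("theprint/VanRossum", split="train")

print(len(ds))  # the card quotes 80,000 entries
print(ds[0])    # inspect one Python-centric example
```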

license:apache-2.0 • 69 downloads • 0 likes

CleverBoi-Mistral-0.3-7B

license:apache-2.0 • 64 downloads • 1 like

Genuine-Gemma3-12B

license:apache-2.0 • 62 downloads • 1 like

ReWiz-Llama-3.2-1B

Half the data was geared toward better reasoning (EvolKit-20k and reasoning-base-20k); the other half helps de-censor the model (the WizardLM dataset).

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-1b-instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

llama • 61 downloads • 0 likes

Boptruth-Agatha-7B

60 downloads • 0 likes

GeneralChat-Llama3.2-3B-DPO

llama • 59 downloads • 0 likes

PositiveDetox-Qwen2.5-14B

license:apache-2.0 • 58 downloads • 1 like

PyRe-Llama8.1-8B-GGUF

llama • 55 downloads • 1 like

ReWiz-Worldbuilder-7B-GGUF

license:apache-2.0 • 55 downloads • 0 likes

Hemispheres-Llama3.2-3B-Combiner

llama • 54 downloads • 0 likes

Gemma3-Python-22k-1B-GGUF

Quantized GGUF versions of Gemma3-Python-22k-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Gemma3-Python-22k-1B
- Quantized by: theprint

- `Gemma3-Python-22k-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Gemma3-Python-22k-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Gemma3-Python-22k-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Gemma3-Python-22k-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Gemma3-Python-22k-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Gemma3-Python-22k-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 54 downloads • 0 likes

DevilsAdvocate-1B-GGUF

Quantized GGUF versions of DevilsAdvocate-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/DevilsAdvocate-1B
- Quantized by: theprint

- `DevilsAdvocate-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 54 downloads • 0 likes

MLF-Llama3.2-3B

llama • 52 downloads • 0 likes

TiTan-Llama-3.2-1B-GGUF

Quantized GGUF versions of TiTan-Llama-3.2-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: unsloth/Llama-3.2-1B
- Fine-tuned model: theprint/TiTan-Llama-3.2-1B
- Quantized by: theprint

- `TiTan-Llama-3.2-1B-f16.gguf` (2364.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Llama-3.2-1B-q3km.gguf` (658.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Llama-3.2-1B-q4km.gguf` (770.3 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Llama-3.2-1B-q5km.gguf` (869.3 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Llama-3.2-1B-q6k.gguf` (974.5 MB) - 6-bit quantization (high quality)
- `TiTan-Llama-3.2-1B-q80.gguf` (1259.9 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 52 downloads • 0 likes

Hemispheres-Llama3.2-3B-Left

llama • 51 downloads • 0 likes

Tom-Qwen-7B-Instruct

license:apache-2.0 • 50 downloads • 1 like

phi-3-mini-4k-gamedev

license:apache-2.0 • 50 downloads • 0 likes

Hemispheres-7B-Combiner

license:apache-2.0 • 50 downloads • 0 likes

Hemispheres-Qwen2.5-1.5B-Combo

license:apache-2.0 • 50 downloads • 0 likes

CodeThink-8B-GRPO-GGUF

llama • 50 downloads • 0 likes

ReWiz-Phi-4-14B-GGUF

llama • 49 downloads • 0 likes

Pythonified-Llama-3.2-3B-Instruct-GGUF

llama.cpp • 49 downloads • 0 likes

TiTan-Qwen2.5-0.5B-GGUF

llama.cpp • 48 downloads • 3 likes

WorldBuilder-12B-GGUF

license:apache-2.0 • 47 downloads • 0 likes

Hemispheres-7B-Left

license:apache-2.0 • 47 downloads • 0 likes

Hemispheres-v0.2-Llama3.2-3B-Final

llama • 47 downloads • 0 likes

TiTan-Gemma3-0.27B-GGUF

Quantized GGUF versions of TiTan-Gemma3-0.27B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-270m-it
- Fine-tuned model: theprint/TiTan-Gemma3-0.27B
- Quantized by: theprint

- `TiTan-Gemma3-0.27B-f16.gguf` (837.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-0.27B-q3km.gguf` (320.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-0.27B-q4km.gguf` (351.4 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-0.27B-q5km.gguf` (368.0 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-0.27B-q6k.gguf` (439.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-0.27B-q80.gguf` (448.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 46 downloads • 1 like

TextSynth-3B-GGUF

llama • 45 downloads • 0 likes

Conversely-Llama3.1-8B

llama • 44 downloads • 1 like

TiTan-Gemma3-4B-GGUF

Quantized GGUF versions of TiTan-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/TiTan-Gemma3-4B
- Quantized by: theprint

- `TiTan-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 43 downloads • 0 likes

Nerdish-Llama-3.1-8B

llama • 42 downloads • 0 likes

Hemispheres-Llama3.2-3B-Final

llama • 40 downloads • 0 likes

Hemispheres-v0.2-Llama3.2-3B-Combo

llama • 40 downloads • 0 likes

Zeth-Gemma3-4B-GGUF

Quantized GGUF versions of Zeth-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/Zeth-Gemma3-4B
- Quantized by: theprint

- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 40 downloads • 0 likes

CreativeWriter-Llama3.2-3B

llama • 37 downloads • 0 likes

TiTan-Gemma3-1B-GGUF

Quantized GGUF versions of TiTan-Gemma3-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/TiTan-Gemma3-1B
- Quantized by: theprint

- `TiTan-Gemma3-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 35 downloads • 0 likes

tinyllama-Evol-Instruct

llama • 34 downloads • 0 likes

PositiveDetox-Qwen3-4B

license:apache-2.0 • 31 downloads • 1 like

WorldBuilder-7B-GGUF

license:apache-2.0 • 30 downloads • 0 likes

Rewiz-Tom-7B

A fine-tuned 7B parameter model specialized in reasoning (Rewiz), based on a model that was already fine-tuned for step-by-step instruction and conversation (Tom). This model is a fine-tuned version of theprint/Tom-Qwen-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: theprint/Tom-Qwen-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following. The ReWiz dataset is a curated mix of 20,000 reasoning-based entries.

Training details (sketched in code below):

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:

- `Rewiz-Tom-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Rewiz-Tom-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Rewiz-Tom-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Rewiz-Tom-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Rewiz-Tom-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Rewiz-Tom-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

- Base model: theprint/Tom-Qwen-7B-Instruct
- Training dataset: theprint/ReWiz
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp
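A rough sketch of the recipe above, using Unsloth with TRL's `SFTTrainer`; the LoRA alpha, target modules, and dataset text field are assumptions (and the `SFTTrainer` keyword set varies across trl versions), so treat this as an outline rather than the author's actual script:

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit for memory-efficient LoRA training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA rank 128 per the card; alpha and target modules are assumed.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("theprint/ReWiz", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumed field name
    args=TrainingArguments(
        per_device_train_batch_size=4,  # card: batch size 4
        learning_rate=2e-4,             # card: 0.0002
        num_train_epochs=2,             # card: 2 epochs
        output_dir="rewiz-tom-7b-lora",
    ),
)
trainer.train()
```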

license:apache-2.0 • 28 downloads • 1 like

Hemispheres-v0.2-Gemma2-9B-Combo

license:apache-2.0 • 28 downloads • 0 likes

Math-Coma-7B

The theprint/MathTutor-7B model, further fine-tuned on natural reasoning using GRPO. This is an experimental model and likely to hallucinate.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: theprint/MathTutor-7B

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 27 downloads • 1 like

Hemispheres-7B-Right

license:apache-2.0 • 27 downloads • 0 likes

RRT1-3B

license:apache-2.0 • 26 downloads • 0 likes

ThinkMix-Gemma3-4B-GRPO-GGUF

license:apache-2.0 • 25 downloads • 0 likes

Llama3.1-8B-CodeThink-GGUF

llama • 24 downloads • 0 likes

DevilsAdvocate-7B-GGUF

Quantized GGUF versions of DevilsAdvocate-7B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/DevilsAdvocate-7B
- Quantized by: theprint

- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 23 downloads • 0 likes

CleverBoi-Phi-3.5-mini-instruct

llama • 22 downloads • 0 likes

Mistral-7b-Instruct-v0.2-python-18k

license:apache-2.0 • 21 downloads • 0 likes

Genuine-7B-Instruct-GGUF

Quantized GGUF versions of Genuine-7B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/Genuine-7B-Instruct
- Quantized by: theprint

- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 21 downloads • 0 likes

Rewiz-Gemma3-1B-GGUF

llama.cpp • 20 downloads • 0 likes

Genuine-Zeth-4B-GGUF

llama.cpp • 19 downloads • 0 likes

Genuine-1B-GGUF

Quantized GGUF versions of Genuine-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Genuine-1B
- Quantized by: theprint

- `Genuine-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Genuine-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Genuine-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Genuine-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Genuine-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 18 downloads • 0 likes

Qwen2.5-14B-GameDev-GGUF

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-14b-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 17 downloads • 0 likes

ReasonableMath-Llama-3.2-3B-Instruct-GGUF

llama.cpp • 17 downloads • 0 likes

Hemispheres-Llama3.1-8B-Combo

llama • 16 downloads • 0 likes

RuDolph-Hermes-7B-Q6_K-GGUF

llama-cpp • 14 downloads • 0 likes

DevilsAdvocate-8B-GGUF

Quantized GGUF versions of DevilsAdvocate-8B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen3-8B
- Fine-tuned model: theprint/DevilsAdvocate-8B
- Quantized by: theprint

- `DevilsAdvocate-8B-f16.gguf` (15628.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-8B-q3km.gguf` (3933.1 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-8B-q4km.gguf` (4794.9 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-8B-q5km.gguf` (5580.1 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-8B-q6k.gguf` (6414.3 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-8B-q80.gguf` (8306.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:

- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
For maximum quality: Use `q80` or `f16`
For maximum speed/smallest size: Use `q3km` or `q4ks`

llama.cpp • 13 downloads • 0 likes

PyRe-3B-v1-GGUF

license:apache-2.0 • 11 downloads • 0 likes

PyRe-3B-v2-GGUF

Please note that this model is a WIP experiment in GRPO fine-tuning on Python code problems for reasoning. The performance of this model varies greatly depending on task, prompt and parameters. I recommend a very low temperature, like 0.1. You may also see more consistent results by encouraging the use of `think` and `answer` tags in the system prompt.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
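To illustrate those recommendations (the temperature comes from the card; the file name, tag delimiters, and system-prompt wording are assumptions):

```python
from llama_cpp import Llama

llm = Llama(model_path="PyRe-3B-v2-q4km.gguf", n_ctx=4096)  # file name assumed

out = llm.create_chat_completion(
    messages=[
        # Encourage the tagged reasoning format, as the card suggests.
        {"role": "system",
         "content": "Reason step by step inside <think> tags, then give the "
                    "final code inside <answer> tags."},
        {"role": "user",
         "content": "Write a Python function that checks whether a string is a palindrome."},
    ],
    temperature=0.1,  # the card recommends a very low temperature
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```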

llama • 10 downloads • 0 likes

Coma-7B

license:apache-2.0 • 10 downloads • 0 likes

TextSynth-Gemma3-12B

license:apache-2.0 • 7 downloads • 0 likes

RuDolph-Hermes-7B-Q4_K_S-GGUF

llama-cpp • 6 downloads • 0 likes

Hemispheres-Qwen2.5-3B-Left

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 6 downloads • 0 likes

Coma-3B

Coma is based on Qwen 2.5 3B, GRPO fine-tuned on the natural reasoning dataset from Meta. Quantized versions are available at theprint/Coma-3B-GGUF in GGUF format. Testing was done with a dedicated system prompt at `temperature=1.0`, `topk=45` and `topp=0.95`.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 5 downloads • 0 likes

ReWiz-Worldbuilder-7B

3 downloads • 0 likes

Genuine-Phi4

This is a Phi-4 (14B) model, fine-tuned for more engaging conversation: it limits sycophancy and encourages the model to (gently) push back and call out bad ideas.

Intended use: brainstorming, idea development, general conversation.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/phi-4-unsloth-bnb-4bit

This llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

llama • 3 downloads • 0 likes

Genuine-Phi4-Q4_K_M-GGUF

llama • 3 downloads • 0 likes

ThinkMix-Gemma3-4B

license:apache-2.0 • 2 downloads • 1 like

WorldBuilder-7B

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/mistral-7b-v0.3-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0 • 2 downloads • 1 like

RuDolph-Hermes-7B

2 downloads • 0 likes

ReWiz-Nemo-12B-Instruct

license:apache-2.0 • 1 download • 2 likes

WorldBuilder-12B

license:apache-2.0 • 1 download • 0 likes

PyRe-3B-v1

license:apache-2.0 • 1 download • 0 likes

TextSynth-8B

llama • 1 download • 0 likes

Gemma3-CP1-12B

license:apache-2.0 • 1 download • 0 likes

TiTan-Qwen2.5-0.5B

license:apache-2.0 • 0 downloads • 4 likes

PyRe-Llama8.1-8B

llama • 0 downloads • 1 like

Llama3.1-8B-CodeThink-16bit

llama • 0 downloads • 1 like

Zeth-Gemma3-4B

A fine-tuned Gemma3 4B model, specialized in pragmatic empathy, or perhaps it is empathic pragmatism? This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: google/gemma-3-4b-it
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following.

Quantized GGUF versions are available at theprint/Zeth-Gemma3-4B-GGUF:

- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

The Zeth dataset was specifically created for fine-tuning models on empathic explanation. This was done by taking premade datasets and rewording the replies to be in line with the style for Zeth.

- Training epochs: 3
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: google/gemma-3-4b-it
- Training dataset: theprint/Zeth
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:apache-2.0 • 0 downloads • 1 like

Genuine-7B-Instruct

A fine-tuned Qwen 2.5 7B Instruct model, tuned for more engaging conversation with fewer sycophantic responses. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation.

Quantized GGUF versions are available in the theprint/Genuine-7B-Instruct-GGUF repo:

- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy in language models, encouraging the models to (gently) push back and call out bad ideas.

- Dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Format: alpaca
- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 6
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:apache-2.0 • 0 downloads • 1 like

DevilsAdvocate-7B

A fine-tuned Qwen 2.5 7B model, tuned for more engaging conversation that encourages the user to consider different aspects of an idea. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: mit
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation.

Quantized GGUF versions are available in the theprint/DevilsAdvocate-7B-GGUF repo:

- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy and encourage gentle pushback in language models.

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 1
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Advocate-9.4k
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp

license:mit • 0 downloads • 1 like