theprint
ReWiz-Nemo-12B-Instruct-GGUF
- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
GeneralChat-Llama3.2-3B-DPO-GGUF
theprint-moe-8x3-0126-GGUF
Llama-3-8B-Lexi-Smaug-Uncensored
ReWiz-Llama-3.2-3B
MathTutor-7B-GGUF
Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning.
Genuine-Gemma3-12B-GGUF
- Developed by: theprint
- Finetuned from model: unsloth/gemma-3-12b-it-unsloth-bnb-4bit

This Gemma 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
CreativeWriter-Llama3.2-3B-GGUF
PositiveDetox-Qwen2.5-14B-GGUF
This model was finetuned on the Positive Detox data set, designed specifically to reduce or eliminate toxic positivity in AI responses.
ReWiz-Llama-3.1-8B-v2
ReWiz-Qwen-2.5-14B
TextSynth-8B-GGUF
Code-Llama-Bagel-8B
CleverBoi-Llama-3.1-8B-Instruct
PositiveDetox-Qwen3-4B-GGUF
phi-3-mini-4k-python
CleverBoi-7B-v3
Coma-7B-GGUF
ReWiz-7B
Coma-3B-GGUF
CleverBoi-Nemo-12B-v2
ReWiz-Qwen2.5-7B
Prompt format:
- Before both User and Assistant: `\n\n`
- Before User: `### Instruction:\n`
- Before Assistant: `### Response:\n`

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library. A minimal prompt-building sketch based on the template above is shown below.
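As an illustration of the template above, here is a minimal, hypothetical Python helper that assembles a prompt in that format. The placement of the leading `\n\n` and the absence of a system segment are assumptions drawn only from the bullets listed here, not documented behavior of ReWiz-Qwen2.5-7B.

```python
# Hypothetical helper illustrating the Instruction/Response template above.
# Stop-string handling and system-prompt support are assumptions.

def build_prompt(instruction: str, response_so_far: str = "") -> str:
    """Wrap a user instruction in the '### Instruction:' / '### Response:' format."""
    return (
        "\n\n### Instruction:\n"
        f"{instruction}\n"
        "\n### Response:\n"
        f"{response_so_far}"
    )

print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```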
Conversely-Mistral-7B
ReWiz-Llama-3.1-8B
CleverBoi-Gemma-2-9B-v2
Boptruth-NeuralMonarch-7B
CleverBoi-Llama-3.1-8B-v2
CleverBoi-Llama-3.2-3B-Instruct
TextSynth-Phi4-14B-GGUF
CleverBoi-Llama-3.1-8B-Python
ThinkMix-Qwen2.5-1.5B-GGUF
MathTutor-7B
Fine-tuned on theprint/CoT-Explaining-Math to use `think` and `answer` tags for reasoning. Find various GGUF quants of this model at theprint/MathTutor-7B-GGUF.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-7b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library. A sketch of how the reasoning tags might be parsed is shown below.
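A minimal sketch for splitting a MathTutor-7B completion into its reasoning and final answer. The angle-bracket forms `<think>...</think>` and `<answer>...</answer>` are assumptions; the card only names the tags "think" and "answer".

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer); tag syntax is an assumption, not from the card."""
    think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return (
        think.group(1).strip() if think else "",
        answer.group(1).strip() if answer else completion.strip(),
    )

reasoning, answer = split_reasoning(
    "<think>12 * 9 = 108, then add 4.</think><answer>112</answer>"
)
print(answer)  # -> 112
```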
Llama-3.2-3B-VanRossum
Empathetic-Llama-3.2-3B-Instruct-GGUF
Quantized GGUF versions of Empathetic-Llama-3.2-3B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuned model: theprint/Empathetic-Llama-3.2-3B-Instruct
- Quantized by: theprint

Available files:
- `Empathetic-Llama-3.2-3B-Instruct-f16.gguf` (6135.6 MB) - 16-bit float (original precision, largest file)
- `Empathetic-Llama-3.2-3B-Instruct-q3km.gguf` (1609.0 MB) - 3-bit quantization (medium quality)
- `Empathetic-Llama-3.2-3B-Instruct-q4km.gguf` (1925.8 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Empathetic-Llama-3.2-3B-Instruct-q5km.gguf` (2214.6 MB) - 5-bit quantization (medium, good quality)
- `Empathetic-Llama-3.2-3B-Instruct-q6k.gguf` (2521.4 MB) - 6-bit quantization (high quality)
- `Empathetic-Llama-3.2-3B-Instruct-q80.gguf` (3263.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`

A minimal llama.cpp usage sketch is shown below.
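As a usage illustration for these files (and for the similarly structured GGUF repos further down), here is a hedged sketch using the `llama-cpp-python` bindings. The context size and sampling defaults are generic choices, not values from the model card.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the recommended q4km file and run a single chat turn with llama.cpp.
model_path = hf_hub_download(
    repo_id="theprint/Empathetic-Llama-3.2-3B-Instruct-GGUF",
    filename="Empathetic-Llama-3.2-3B-Instruct-q4km.gguf",
)
llm = Llama(model_path=model_path, n_ctx=4096)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "I had a rough day at work. Any advice?"}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```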
ReWiz-Gemma2-9B
Hemispheres-7B-Final
CleverBoi-7B-v2
mistral-7b-cthulhu
Hemispheres-Qwen2.5-3B-Right
Llama-CP1-8B-GGUF
CleverBoi-Nemo-12B
Hemispheres-Llama3.2-3B-Right
tinyllama_alpaca_cthulhu_small
CleverBoi-Gemma-2-9B
VanRossum-Qwen2.5-Coder-3B
This model was trained for 1 epoch on the VanRossum dataset. The VanRossum dataset is all Python! I used DataMix to combine a handful of highly rated Python-centric datasets, to get a sampling of each and create something new. The dataset has 80,000 entries and is named after Guido van Rossum, who created Python and first released it in 1991. See the VanRossum Collection on HF for all things related to this dataset. There are 2 versions of this dataset available on Hugging Face.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/Qwen2.5-Coder-3B-Instruct-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
CleverBoi-Mistral-0.3-7B
Genuine-Gemma3-12B
ReWiz-Llama-3.2-1B
Half the data was geared toward better reasoning (EvolKit-20k and reasoning-base-20k); the other half helps de-censor the model (the WizardLM dataset).

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-1b-instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Boptruth-Agatha-7B
GeneralChat-Llama3.2-3B-DPO
PositiveDetox-Qwen2.5-14B
PyRe-Llama8.1-8B-GGUF
ReWiz-Worldbuilder-7B-GGUF
Hemispheres-Llama3.2-3B-Combiner
Gemma3-Python-22k-1B-GGUF
Quantized GGUF versions of Gemma3-Python-22k-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Gemma3-Python-22k-1B
- Quantized by: theprint

Available files:
- `Gemma3-Python-22k-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Gemma3-Python-22k-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Gemma3-Python-22k-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Gemma3-Python-22k-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Gemma3-Python-22k-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Gemma3-Python-22k-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
DevilsAdvocate-1B-GGUF
Quantized GGUF versions of DevilsAdvocate-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/DevilsAdvocate-1B
- Quantized by: theprint

Available files:
- `DevilsAdvocate-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
MLF-Llama3.2-3B
TiTan-Llama-3.2-1B-GGUF
Quantized GGUF versions of TiTan-Llama-3.2-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: unsloth/Llama-3.2-1B
- Fine-tuned model: theprint/TiTan-Llama-3.2-1B
- Quantized by: theprint

Available files:
- `TiTan-Llama-3.2-1B-f16.gguf` (2364.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Llama-3.2-1B-q3km.gguf` (658.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Llama-3.2-1B-q4km.gguf` (770.3 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Llama-3.2-1B-q5km.gguf` (869.3 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Llama-3.2-1B-q6k.gguf` (974.5 MB) - 6-bit quantization (high quality)
- `TiTan-Llama-3.2-1B-q80.gguf` (1259.9 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`

A sketch of the Ollama import path is shown below.
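To illustrate the "Ollama (import as custom model)" option listed above, here is a hedged sketch that writes a Modelfile pointing at a locally downloaded GGUF and registers it with the `ollama` CLI. The local file path and the model name `titan-llama-1b` are made-up placeholders.

```python
import subprocess
from pathlib import Path

# Assumes the q4km file was already downloaded next to this script and that
# the `ollama` CLI is installed and running.
gguf = Path("TiTan-Llama-3.2-1B-q4km.gguf").resolve()

modelfile = Path("Modelfile")
modelfile.write_text(f"FROM {gguf}\n")

# Register the GGUF as a custom Ollama model, then run a quick test prompt.
subprocess.run(["ollama", "create", "titan-llama-1b", "-f", str(modelfile)], check=True)
subprocess.run(["ollama", "run", "titan-llama-1b", "Say hello in one sentence."], check=True)
```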
Hemispheres-Llama3.2-3B-Left
Tom-Qwen-7B-Instruct
phi-3-mini-4k-gamedev
Hemispheres-7B-Combiner
Hemispheres-Qwen2.5-1.5B-Combo
CodeThink-8B-GRPO-GGUF
ReWiz-Phi-4-14B-GGUF
Pythonified-Llama-3.2-3B-Instruct-GGUF
TiTan-Qwen2.5-0.5B-GGUF
WorldBuilder-12B-GGUF
Hemispheres-7B-Left
Hemispheres-v0.2-Llama3.2-3B-Final
TiTan-Gemma3-0.27B-GGUF
Quantized GGUF versions of TiTan-Gemma3-0.27B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-270m-it
- Fine-tuned model: theprint/TiTan-Gemma3-0.27B
- Quantized by: theprint

Available files:
- `TiTan-Gemma3-0.27B-f16.gguf` (837.7 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-0.27B-q3km.gguf` (320.8 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-0.27B-q4km.gguf` (351.4 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-0.27B-q5km.gguf` (368.0 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-0.27B-q6k.gguf` (439.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-0.27B-q80.gguf` (448.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
TextSynth-3B-GGUF
Conversely-Llama3.1-8B
TiTan-Gemma3-4B-GGUF
Quantized GGUF versions of TiTan-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/TiTan-Gemma3-4B
- Quantized by: theprint

Available files:
- `TiTan-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
Nerdish-Llama-3.1-8B
Hemispheres-Llama3.2-3B-Final
Hemispheres-v0.2-Llama3.2-3B-Combo
Zeth-Gemma3-4B-GGUF
Quantized GGUF versions of Zeth-Gemma3-4B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-4b-it
- Fine-tuned model: theprint/Zeth-Gemma3-4B
- Quantized by: theprint

Available files:
- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
CreativeWriter-Llama3.2-3B
TiTan-Gemma3-1B-GGUF
Quantized GGUF versions of TiTan-Gemma3-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/TiTan-Gemma3-1B
- Quantized by: theprint

Available files:
- `TiTan-Gemma3-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `TiTan-Gemma3-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `TiTan-Gemma3-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `TiTan-Gemma3-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `TiTan-Gemma3-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `TiTan-Gemma3-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
tinyllama-Evol-Instruct
PositiveDetox-Qwen3-4B
WorldBuilder-7B-GGUF
Rewiz-Tom-7B
A fine-tuned 7B parameter model specialized in reasoning (ReWiz), based on a model that was already fine-tuned for step-by-step instruction and conversation (Tom). This model is a fine-tuned version of theprint/Tom-Qwen-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: theprint/Tom-Qwen-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following. The ReWiz dataset is a curated mix of 20,000 reasoning-based entries.

Training details (a sketch of how this kind of setup is typically wired up follows below):
- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:
- `Rewiz-Tom-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Rewiz-Tom-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Rewiz-Tom-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Rewiz-Tom-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Rewiz-Tom-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Rewiz-Tom-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

- Base model: theprint/Tom-Qwen-7B-Instruct
- Training dataset: theprint/ReWiz
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp
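This is not the author's training script, but a minimal sketch of an Unsloth + LoRA run matching the hyperparameters listed above (rank 128, learning rate 2e-4, batch size 4, 2 epochs). The sequence length, LoRA alpha, target modules, and the `text` field name are assumptions.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit and attach LoRA adapters (rank from the card).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=4096,          # assumed
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,               # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("theprint/ReWiz", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",    # assumed field name
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        learning_rate=2e-4,
        num_train_epochs=2,
        output_dir="rewiz-tom-7b-lora",
    ),
)
trainer.train()
```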
Hemispheres-v0.2-Gemma2-9B-Combo
Math-Coma-7B
The theprint/MathTutor-7B model, further fine-tuned on natural reasoning using GRPO. This is an experimental model and likely to hallucinate.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: theprint/MathTutor-7B

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Hemispheres-7B-Right
RRT1-3B
ThinkMix-Gemma3-4B-GRPO-GGUF
Llama3.1-8B-CodeThink-GGUF
DevilsAdvocate-7B-GGUF
Quantized GGUF versions of DevilsAdvocate-7B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/DevilsAdvocate-7B
- Quantized by: theprint

Available files:
- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
CleverBoi-Phi-3.5-mini-instruct
Mistral-7b-Instruct-v0.2-python-18k
Genuine-7B-Instruct-GGUF
Quantized GGUF versions of Genuine-7B-Instruct for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuned model: theprint/Genuine-7B-Instruct
- Quantized by: theprint

Available files:
- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
Rewiz-Gemma3-1B-GGUF
Genuine-Zeth-4B-GGUF
Genuine-1B-GGUF
Quantized GGUF versions of Genuine-1B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: google/gemma-3-1b-it
- Fine-tuned model: theprint/Genuine-1B
- Quantized by: theprint

Available files:
- `Genuine-1B-f16.gguf` (2489.6 MB) - 16-bit float (original precision, largest file)
- `Genuine-1B-q3km.gguf` (850.9 MB) - 3-bit quantization (medium quality)
- `Genuine-1B-q4km.gguf` (966.7 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-1B-q5km.gguf` (1027.9 MB) - 5-bit quantization (medium, good quality)
- `Genuine-1B-q6k.gguf` (1270.9 MB) - 6-bit quantization (high quality)
- `Genuine-1B-q80.gguf` (1325.8 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
Qwen2.5-14B-GameDev-GGUF
- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-14b-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
ReasonableMath-Llama-3.2-3B-Instruct-GGUF
Hemispheres-Llama3.1-8B-Combo
RuDolph-Hermes-7B-Q6_K-GGUF
DevilsAdvocate-8B-GGUF
Quantized GGUF versions of DevilsAdvocate-8B for use with llama.cpp and other GGUF-compatible inference engines.

- Base model: Qwen/Qwen3-8B
- Fine-tuned model: theprint/DevilsAdvocate-8B
- Quantized by: theprint

Available files:
- `DevilsAdvocate-8B-f16.gguf` (15628.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-8B-q3km.gguf` (3933.1 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-8B-q4km.gguf` (4794.9 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-8B-q5km.gguf` (5580.1 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-8B-q6k.gguf` (6414.3 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-8B-q80.gguf` (8306.0 MB) - 8-bit quantization (very high quality)

These files are compatible with:
- llama.cpp
- Ollama (import as custom model)
- KoboldCpp
- text-generation-webui

Recommended: `q4km` provides the best balance of size, speed, and quality for most use cases.
- For maximum quality: use `q80` or `f16`
- For maximum speed/smallest size: use `q3km` or `q4ks`
PyRe-3B-v1-GGUF
PyRe-3B-v2-GGUF
Please note that this model is a WIP experiment into GRPO fine-tuning on Python code problems for reasoning. The performance of this model varies greatly depending on task, prompt, and parameters. I recommend a very low temperature, like 0.1. You may also see more consistent results by encouraging the use of reasoning tags in the system prompt (a hedged usage sketch follows below).

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
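A minimal sketch of the settings suggested above: a very low temperature (0.1) and a system prompt that asks for explicit reasoning tags. The `<think>`/`<answer>` tag names and the GGUF file path are assumptions, not taken from the card.

```python
from llama_cpp import Llama

# Placeholder path: point this at whichever PyRe GGUF file you downloaded.
llm = Llama(model_path="path/to/PyRe-3B-v2-q4km.gguf", n_ctx=4096)

SYSTEM = (
    "You are a careful Python tutor. Work through the problem inside <think> tags, "
    "then give only the final code inside <answer> tags."
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Write a function that reverses a string."},
    ],
    temperature=0.1,   # low temperature, as recommended in the card
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```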
Coma-7B
TextSynth-Gemma3-12B
RuDolph-Hermes-7B-Q4_K_S-GGUF
Hemispheres-Qwen2.5-3B-Left
- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Coma-3B
Coma is based on Qwen 2.5 3B, GRPO fine-tuned on the Natural Reasoning dataset from Meta. Quantized versions are available at theprint/Coma-3B-GGUF in GGUF format. A dedicated system prompt was used when testing the model; testing was done at `temperature=1.0`, `topk=45` and `topp=0.95`.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
ReWiz-Worldbuilder-7B
Genuine-Phi4
This is a Phi-4 (14B) model, fine-tuned for more engaging conversation, to limit sycophancy and to encourage the model to (gently) push back and call out bad ideas.

Intended use: brainstorming, idea development, general conversation.

- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/phi-4-unsloth-bnb-4bit

This model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Genuine-Phi4-Q4_K_M-GGUF
ThinkMix-Gemma3-4B
WorldBuilder-7B
- Developed by: theprint
- License: apache-2.0
- Finetuned from model: unsloth/mistral-7b-v0.3-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
RuDolph-Hermes-7B
ReWiz-Nemo-12B-Instruct
WorldBuilder-12B
PyRe-3B-v1
TextSynth-8B
Gemma3-CP1-12B
TiTan-Qwen2.5-0.5B
PyRe-Llama8.1-8B
Llama3.1-8B-CodeThink-16bit
Zeth-Gemma3-4B
A fine-tuned Gemma 3 4B model, specialized in pragmatic empathy, or perhaps it is empathic pragmatism? This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: google/gemma-3-4b-it
- Fine-tuning method: LoRA with rank 128

Intended use: conversation, brainstorming, and general instruction following.

Quantized GGUF versions are available at theprint/Zeth-Gemma3-4B-GGUF:
- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
- `Zeth-Gemma3-4B-q3km.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
- `Zeth-Gemma3-4B-q4km.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Zeth-Gemma3-4B-q5km.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
- `Zeth-Gemma3-4B-q6k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
- `Zeth-Gemma3-4B-q80.gguf` (4619.2 MB) - 8-bit quantization (very high quality)

The Zeth dataset was created specifically for fine-tuning models on empathic explanation. This was done by taking premade datasets and rewording the replies to be in line with the style for Zeth.

- Training epochs: 3
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

- Base model: google/gemma-3-4b-it
- Training dataset: theprint/Zeth
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp
Genuine-7B-Instruct
A fine-tuned Qwen 2.5 7B Instruct model, tuned for more engaging conversation with fewer sycophantic responses. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: apache-2.0
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation. A minimal chat example with the transformers library is sketched below.

Quantized GGUF versions are available in the theprint/Genuine-7B-Instruct-GGUF repo:
- `Genuine-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Genuine-7B-Instruct-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Genuine-7B-Instruct-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Genuine-7B-Instruct-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Genuine-7B-Instruct-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Genuine-7B-Instruct-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy in language models and to encourage the model to (gently) push back and call out bad ideas.
- Dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Format: alpaca

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 6
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Gentle-Pushback-8.5k-alpaca
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp
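A hedged sketch of running the full-precision model with the transformers library. The device, dtype, and sampling values are generic choices, not recommendations from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "theprint/Genuine-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A prompt the card's "gentle pushback" framing is meant to handle well.
messages = [
    {"role": "user", "content": "I want to quit my job tomorrow and day-trade full time. Thoughts?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```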
DevilsAdvocate-7B
A fine-tuned Qwen 2.5 7B model, tuned for more engaging conversation that encourages the user to consider different aspects of a problem. This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- Developed by: theprint
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language: en
- License: mit
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA with rank 128

Intended use: brainstorming, idea development, general conversation.

Quantized GGUF versions are available in the theprint/DevilsAdvocate-7B-GGUF repo:
- `DevilsAdvocate-7B-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `DevilsAdvocate-7B-q3km.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `DevilsAdvocate-7B-q4km.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `DevilsAdvocate-7B-q5km.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `DevilsAdvocate-7B-q6k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `DevilsAdvocate-7B-q80.gguf` (7723.4 MB) - 8-bit quantization (very high quality)

The training dataset was created to limit sycophancy and encourage gentle pushback in language models.

- Training epochs: 2
- LoRA rank: 128
- Learning rate: 0.0001
- Batch size: 1
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090

- Base model: Qwen/Qwen2.5-7B-Instruct
- Training dataset: theprint/Advocate-9.4k
- Fine-tuning framework: Unsloth
- Quantization: llama.cpp