ToastyPigeon

164 models

Qwen3.5-Test-GGUFs

1,231
1

muse-marvin-gguf

662
0

Qwen3-30B-A3B-AntiRep-2507-Q4_K_M-GGUF

ToastyPigeon/Qwen3-30B-A3B-AntiRep-2507-Q4KM-GGUF This model was converted to GGUF format from `ConicCat/Qwen3-30B-A3B-AntiRep-2507` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), or move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
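The build flags named above can be sketched as a small helper; the flag names (`LLAMA_CURL=1`, `LLAMA_CUDA=1`) come from the card, while the helper itself is purely illustrative:

```python
# Illustrative only: assemble the llama.cpp build command described above.
# The flag names are from the model card; this helper is hypothetical.
def make_build_command(cuda: bool = False) -> str:
    flags = ["LLAMA_CURL=1"]          # enables downloading models over HTTP
    if cuda:
        flags.append("LLAMA_CUDA=1")  # hardware-specific flag for Nvidia GPUs
    return " ".join(flags + ["make", "-j"])

print(make_build_command(cuda=True))
# LLAMA_CURL=1 LLAMA_CUDA=1 make -j
```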

llama-cpp
187
0

granite-4.0-h-small-half-slerp-Q4_K_S-GGUF

llama-cpp
168
0

muse-margvin-gguf

108
0

muse-marvin-od-lora

license:apache-2.0
50
0

medgemma-27b-text-it-abliterated

39
0

Qwen3.5-27B-Antirep-V1

28
0

cursed-test-ggufs

27
0

muse-marvin-32k-lora

This model is a fine-tuned version of LatitudeGames/Muse-12B on the ToastyPigeon/steve-and-marvin dataset. It achieves the following results on the evaluation set:
- Loss: 2.5071
- Memory/max Active (GiB): 4.98
- Memory/max Allocated (GiB): 4.89
- Memory/device Reserved (GiB): 6.9

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 69
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- total_eval_batch_size: 2
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 420

| Training Loss | Epoch | Step | Validation Loss | Active (GiB) | Allocated (GiB) | Reserved (GiB) |
|:-------------:|:------:|:----:|:---------------:|:------------:|:---------------:|:--------------:|
| No log | 0 | 0 | 2.6286 | 8.04 | 6.73 | 8.36 |
| 2.4233 | 0.0993 | 21 | 2.6047 | 4.98 | 4.89 | 6.9 |
| 2.5581 | 0.1986 | 42 | 2.5627 | 4.98 | 4.89 | 6.9 |
| 2.3368 | 0.2979 | 63 | 2.5447 | 4.98 | 4.89 | 6.9 |
| 2.5579 | 0.3972 | 84 | 2.5328 | 4.98 | 4.89 | 6.9 |
| 2.4241 | 0.4965 | 105 | 2.5253 | 4.98 | 4.89 | 6.9 |
| 2.4608 | 0.5957 | 126 | 2.5199 | 4.98 | 4.89 | 6.9 |
| 2.8143 | 0.6950 | 147 | 2.5156 | 4.98 | 4.89 | 6.9 |
| 2.6305 | 0.7943 | 168 | 2.5129 | 4.98 | 4.89 | 6.9 |
| 2.3989 | 0.8936 | 189 | 2.5105 | 4.98 | 4.89 | 6.9 |
| 2.6816 | 0.9929 | 210 | 2.5096 | 4.98 | 4.89 | 6.9 |
| 2.629 | 1.0898 | 231 | 2.5092 | 4.98 | 4.89 | 6.9 |
| 2.4645 | 1.1891 | 252 | 2.5088 | 4.98 | 4.89 | 6.9 |
| 2.3738 | 1.2884 | 273 | 2.5081 | 4.98 | 4.89 | 6.9 |
| 2.3651 | 1.3877 | 294 | 2.5076 | 4.98 | 4.89 | 6.9 |
| 2.4476 | 1.4870 | 315 | 2.5073 | 4.98 | 4.89 | 6.9 |
| 2.4091 | 1.5863 | 336 | 2.5072 | 4.98 | 4.89 | 6.9 |
| 2.6352 | 1.6856 | 357 | 2.5071 | 4.98 | 4.89 | 6.9 |
| 2.5311 | 1.7849 | 378 | 2.5071 | 4.98 | 4.89 | 6.9 |
| 2.5747 | 1.8842 | 399 | 2.5071 | 4.98 | 4.89 | 6.9 |
| 2.3871 | 1.9835 | 420 | 2.5071 | 4.98 | 4.89 | 6.9 |

Framework versions:
- PEFT 0.17.1
- Transformers 4.56.1
- Pytorch 2.7.1+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
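The hyperparameters above imply an effective batch size of 8 (1 per device × 2 devices × 4 accumulation steps) and a cosine schedule with 10 warmup steps. A minimal sketch, assuming linear warmup from zero (the card only names the scheduler type and warmup steps):

```python
import math

# Effective batch size: per-device batch x devices x gradient accumulation.
def effective_batch(per_device: int, devices: int, accum: int) -> int:
    return per_device * devices * accum

# Cosine decay with linear warmup, as commonly implemented by HF Trainer
# (assumption: warmup ramps linearly from 0 to the peak learning rate).
def lr_at(step: int, peak: float = 1e-5, warmup: int = 10, total: int = 420) -> float:
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / (total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))
```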

license:apache-2.0
24
0

another-gemma-12b-lora-part1

- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]

24
0

Gemma 3 Starshine 12B

A creative writing model based on a merge of fine-tunes on Gemma 3 12B IT and Gemma 3 12B PT. This is the Story Focused merge: it works better for storytelling and scenarios, as the prose is more novel-like, though it has a tendency to impersonate the user character. This is a merge of two G3 models, one trained on instruct and one trained on base:
- allura-org/Gemma-3-Glitter-12B - itself a merge of a storywriting and RP train (both also by ToastyPigeon), on instruct
- ToastyPigeon/Gemma-3-Confetti-12B - an experimental application of the Glitter data using base instead of instruct; additionally includes some adventure data in the form of SpringDragon

The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter. Thank you to jebcarter for the idea to make this. I love how it turned out! Uses the Gemma 2/3 instruct template, but has been trained to recognize an optional system role. Note: while it won't immediately balk at the system role, results may be better without it. Yeah, I actually tried several things, and surprisingly this one worked best.

21
20

ms-test-models

20
0

probably-terrible-gemma-12b-Q6_K-GGUF

llama-cpp
19
0

muse-marvin-lora-2

license:apache-2.0
17
0

probably-terrible-gemma-12b

13
1

muse-marvin-Q8_0-GGUF

llama-cpp
13
0

probably-broken-glm-Q4_K_S-GGUF

llama-cpp
12
0

another-qwen-test-model-Q6_K-GGUF

ToastyPigeon/another-qwen-test-model-Q6K-GGUF This model was converted to GGUF format from `ToastyPigeon/another-qwen-test-model` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
12
0

half-granite-marvin-Q4_K_S-GGUF

llama-cpp
12
0

muse-marvin-od2-lora

12
0

apertus-ffn-1

12
0

tess-books-4-Q6_K-GGUF

llama-cpp
11
0

nemo-instruct-books-Q6_K-GGUF

ToastyPigeon/nemo-instruct-books-Q6K-GGUF This model was converted to GGUF format from `ToastyPigeon/nemo-instruct-books` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
11
0

muse-marvin-ffn-lora

license:apache-2.0
11
0

qwen3-16b-a3b-v3-iter3-Q6_K-GGUF

llama-cpp
10
0

i-added-glitter-Q4_K_S-GGUF

ToastyPigeon/i-added-glitter-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/i-added-glitter` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
10
0

a-funny-nemo-merge-Q6_K-GGUF

llama-cpp
10
0

glitterdex-1epoch-test-merged-Q5_K_M-GGUF

llama-cpp
9
0

mn-12b-mm-tess-books-Q6_K-GGUF

llama-cpp
9
0

a-strange-nemo-model-Q6_K-GGUF

llama-cpp
9
0

funny-nemo-embed-testing-3

A creative writing model based on Mistral Nemo 12B to support co-writing and other related longform writing tasks. This is pretty good, actually. Smarter than some other Nemos I've tried, and with decent samplers it's not very sloppy.

Working samplers: temp 1.25-1.5, min-p 0.02-0.05, rep pen 1.01, temp first. Some prompts seem to need higher or lower temp than others: lower temps result in sloppy Mistral-isms, higher temps tap into the LoRA training a bit more.

The chat template is theoretically ChatML because of the base models used in the merge. However, the ChatML-Names preset in SillyTavern often gives better results, YMMV. With ChatML-Names in particular this is good at copying the style of what's already in the chat history. So if your chat history is sloppy, this likely will be too (use XTC for a bit to break it up); if your chat history isn't sloppy, this is less likely to introduce any extra. Start a conversation off with text from a good model (or better yet, human-written text), and this should follow along easily.

Has the same pacing issues any Nemo model does when asked to compose a longform story from scratch via instruct, though better than some others. Seems good at dialogue (though it has a bias towards country and/or British English accents if unspecified), and good at 'reading between the lines' for its size as well.

I did not include any erotica or other NSFW data in the LoRA training parts of this; however, Mag-Mell contains Magnum (and Chronos, which is trained on top of a rejected Magnum), so the capability is there if you need it (it just might be a bit Claude-slop-y, as I haven't optimized this part for style). The two LoRAs on this were trained at 8k (nemo-kimi-lora) and 32k (nemo-books-lora) context. As you might guess, nemo-kimi-lora is trained on outputs from Kimi K2 (the dataset is public on my profile), and nemo-books-lora is trained on a bunch of books.

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- inflatebot/MN-12B-Mag-Mell-R1 + ToastyPigeon/nemo-kimi-lora
- migtissera/Tess-3-Mistral-Nemo-12B + ToastyPigeon/nemo-books-lora

The following YAML configuration was used to produce this model:
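The recommended sampler order above ("temp first", then min-p) can be sketched in pure Python; the function name and values are illustrative, not taken from any particular backend:

```python
import math

# Apply temperature scaling first, then min-p filtering, as recommended above.
# min-p keeps only tokens whose probability is at least min_p * max probability.
def sample_filter(logits, temperature=1.25, min_p=0.05):
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    probs = [math.exp(l - m) for l in scaled]        # stable softmax
    total = sum(probs)
    probs = [p / total for p in probs]
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]  # drop low-probability tail
    z = sum(kept)
    return [p / z for p in kept]                     # renormalise survivors
```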

9
0

muse-marvin-attn-lora

This model is a fine-tuned version of LatitudeGames/Muse-12B on the grimulkan/LimaRP-augmented, ToastyPigeon/steve-and-marvin and ToastyPigeon/kimi-stories-completion datasets. It achieves the following results on the evaluation set:
- Loss: 2.4268
- Memory/max Active (GiB): 5.02
- Memory/max Allocated (GiB): 4.89
- Memory/device Reserved (GiB): 6.64

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 69
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- total_eval_batch_size: 2
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 5
- training_steps: 232

| Training Loss | Epoch | Step | Validation Loss | Active (GiB) | Allocated (GiB) | Reserved (GiB) |
|:-------------:|:------:|:----:|:---------------:|:------------:|:---------------:|:--------------:|
| No log | 0 | 0 | 2.5323 | 8.04 | 6.73 | 8.36 |
| 2.5888 | 0.1032 | 24 | 2.4883 | 5.02 | 4.89 | 6.64 |
| 2.4142 | 0.2065 | 48 | 2.4537 | 5.02 | 4.89 | 6.64 |
| 2.3697 | 0.3097 | 72 | 2.4418 | 5.02 | 4.89 | 6.64 |
| 2.2986 | 0.4129 | 96 | 2.4354 | 5.02 | 4.89 | 6.64 |
| 2.5054 | 0.5161 | 120 | 2.4314 | 5.02 | 4.89 | 6.64 |
| 2.6863 | 0.6194 | 144 | 2.4290 | 5.02 | 4.89 | 6.64 |
| 2.3196 | 0.7226 | 168 | 2.4277 | 5.02 | 4.89 | 6.64 |
| 2.3422 | 0.8258 | 192 | 2.4271 | 5.02 | 4.89 | 6.64 |
| 2.5976 | 0.9290 | 216 | 2.4268 | 5.02 | 4.89 | 6.64 |

Framework versions:
- PEFT 0.17.1
- Transformers 4.56.1
- Pytorch 2.7.1+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
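The optimizer line above (AdamW with betas=(0.9, 0.999) and epsilon=1e-08) corresponds to the standard Adam-style update. A minimal single-parameter sketch, ignoring weight decay and the fused torch implementation actually used in training:

```python
# One Adam-style moment update for a single scalar parameter (illustrative only;
# the run above used the fused torch AdamW implementation).
def adam_step(p, grad, m, v, t, lr=1e-5, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad            # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad * grad     # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)               # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    p = p - lr * m_hat / (v_hat ** 0.5 + eps)
    return p, m, v
```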

license:apache-2.0
9
0

Gemma-3-Starshine-12B-Alt

8
7

g3-12b-storyteller-v0.2-textonly-Q6_K-GGUF

llama-cpp
8
1

other-test-models

8
0

new-ms-rp-test-ws

license:apache-2.0
8
0

medgemma-ero-healmerged-Q4_K_S-GGUF

ToastyPigeon/medgemma-ero-healmerged-Q4KS-GGUF This model was converted to GGUF format from `allura-forge/medgemma-ero-healmerged` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
8
0

GLM-Tulu-ChatML-Q4_K_S-GGUF

llama-cpp
8
0

possibly-working-glm-Q4_K_S-GGUF

llama-cpp
8
0

muse-marvin-lora

This model is a fine-tuned version of LatitudeGames/Muse-12B on the grimulkan/LimaRP-augmented, ToastyPigeon/steve-and-marvin and ToastyPigeon/kimi-stories-completion datasets. It achieves the following results on the evaluation set:
- Loss: 2.3857
- Memory/max Active (GiB): 31.3
- Memory/max Allocated (GiB): 31.3
- Memory/device Reserved (GiB): 32.18

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 69
- optimizer: OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 7
- training_steps: 309

| Training Loss | Epoch | Step | Validation Loss | Active (GiB) | Allocated (GiB) | Reserved (GiB) |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:---------------:|:--------------:|
| No log | 0 | 0 | 2.4839 | 31.29 | 31.29 | 32.17 |
| 2.5492 | 0.1 | 31 | 2.4235 | 31.31 | 31.31 | 32.18 |
| 2.3906 | 0.2 | 62 | 2.4048 | 31.32 | 31.32 | 32.18 |
| 2.2984 | 0.3 | 93 | 2.3961 | 31.31 | 31.31 | 32.18 |
| 2.4423 | 0.4 | 124 | 2.3916 | 31.31 | 31.31 | 32.18 |
| 2.4106 | 0.5 | 155 | 2.3889 | 31.3 | 31.3 | 32.18 |
| 2.526 | 0.6 | 186 | 2.3875 | 31.3 | 31.3 | 32.18 |
| 2.3574 | 0.7 | 217 | 2.3863 | 31.3 | 31.3 | 32.18 |
| 2.4005 | 0.8 | 248 | 2.3858 | 31.3 | 31.3 | 32.18 |
| 2.4227 | 0.9 | 279 | 2.3857 | 31.3 | 31.3 | 32.18 |

Framework versions:
- PEFT 0.17.1
- Transformers 4.56.1
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1

license:apache-2.0
8
0

muse-marvin-stage3-lora

8
0

another-glm-train-2-epochs-Q4_K_S-GGUF

ToastyPigeon/another-glm-train-2-epochs-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/another-glm-train-2-epochs` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
7
0

medgemma-27b-abliterated-multimodal

6
2

Llama-3-8B-Instruct-SpringDragon-V2-QLoRA

base_model:NousResearch/Meta-Llama-3-8B-Instruct
6
1

mistral-small-springdragon-qlora

6
1

funny-nemo-embedding-testing

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- magmell + ToastyPigeon/nemo-kimi-lora
- tess + ToastyPigeon/nemo-books-lora

The following YAML configuration was used to produce this model:
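A linear merge, as used here via mergekit, is just a weighted average of corresponding parameters. A toy sketch in plain Python (real merges operate on full state dicts of model tensors; lists stand in here):

```python
# Toy linear merge: weighted average of corresponding parameters across models.
# Illustrative only; mergekit's Linear method averages full torch state dicts.
def linear_merge(models, weights):
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return {
        name: [
            sum(w * m[name][i] for m, w in zip(models, weights))
            for i in range(len(models[0][name]))
        ]
        for name in models[0]
    }
```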

6
1

mistral-small-dampf-qlora

6
0

Qwen3-Gutenberg-Encore-14B-Q6_K-GGUF

ToastyPigeon/Qwen3-Gutenberg-Encore-14B-Q6K-GGUF This model was converted to GGUF format from `nbeerbower/Qwen3-Gutenberg-Encore-14B` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
6
0

another-strange-nemo-model-Q6_K-GGUF

llama-cpp
6
0

nemo-books-lora-4

6
0

nemo-kimi-lora-2e-larger

6
0

middle-stage-qwen

6
0

possibly-cursed-glm-test

license:mit
5
1

new-ms-rp-test-v2-ws

5
0

qwen21b-creative-Q4_K_S-GGUF

llama-cpp
5
0

gemma3-27b-starlike-v2-Q4_K_S-GGUF

ToastyPigeon/gemma3-27b-starlike-v2-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/gemma3-27b-starlike-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
5
0

nemo-kimi-lora

license:apache-2.0
5
0

Mistral-Nemo-12B-Adventure-QLoRA

license:apache-2.0
4
2

QwQ-32B-Snowdrop-v0-EmbedFix

4
1

tess-books-4

4
1

nemo-instruct-books

4
1

muse-marvin

4
1

intern-rp-lora

license:apache-2.0
4
0

q3-14b-completion-lora

This model is a fine-tuned version of Qwen/Qwen3-14B-Base on the ToastyPigeon/new-story-dataset, ToastyPigeon/some-erotica, ToastyPigeon/skein-text-adventures, ToastyPigeon/SpringDragon and ToastyPigeon/disco-chat datasets.

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 69
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- total_eval_batch_size: 2
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- num_epochs: 2.0

Framework versions:
- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.7.1+cu126
- Datasets 3.5.0
- Tokenizers 0.21.1

license:apache-2.0
4
0

another-possibly-cursed-glm-checkpoint

license:mit
4
0

nemo-books-lora

4
0

half-granite-marvin

- Developed by: ToastyPigeon
- License: apache-2.0
- Finetuned from model: Columbidae/granite-4.0-h-small-half-slerp

This granitemoehybrid model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
4
0

half-granite-marvin-Q4_K_M-GGUF

ToastyPigeon/half-granite-marvin-Q4KM-GGUF This model was converted to GGUF format from `ToastyPigeon/half-granite-marvin` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
4
0

muse-marvin-stage2-lora

4
0

SpringDragon-NeMo-Instruct-QLoRA-ep1

license:apache-2.0
3
2

Meta-Llama-3.1-8B-Adventure-QLoRA

llama
3
1

qwen-story-test-qlora

license:apache-2.0
3
1

Qwen2.5-14B-Instruct-1M-Unalign

A simple unalignment fine-tune on ~900k tokens aiming to make the model more compliant and willing to handle user requests. This is the same unalignment training seen in concedo/Beepo-22B, so big thanks to concedo for the dataset.

3
1

mn-12b-impersonation-city

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- LatitudeGames/Muse-12B
- muse-writer
- nbeerbower/mistral-nemo-gutenberg-12B

The following YAML configuration was used to produce this model:

3
1

mn-12b-mm-tess-books

3
1

an-cleaner

3
0

ms-type1-adventure

3
0

tq14b-1m-gutenberg-sft

3
0

qwen14-creative-epoch1-Q4_K_S-GGUF

llama-cpp
3
0

ms3-roselily-rp-v3-Q4_K_S-GGUF

llama-cpp
3
0

g3-4b-it-creative-qlora

3
0

gemma-3-starshine-12b-continued

3
0

gemma3-27b-starlike-v3-Q4_K_S-GGUF

llama-cpp
3
0

anti-star-maybe-stabilized-Q4_K_S-GGUF

llama-cpp
3
0

glitterdex-1epoch-test-merged

3
0

nemo-kink-lora

3
0

nemo-books-lora-2

3
0

a-strange-nemo-model

3
0

nemo-books-lora-3

3
0

glm-books-lora-wonky

3
0

probably-broken-glm

3
0

possibly-working-glm

3
0

nemo-kimi-lora-2e

3
0

another-qwen-test-model

3
0

psyonic-cetacean-20b-v2

llama
2
3

i-added-glitter

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- ToastyPigeon/anti-starlike
- allura-org/Gemma-3-Glitter-27B

The following YAML configuration was used to produce this model:

2
2

MS-Meadowlark-Alt-22B

2
1

MS3-24B-MarbleRye

2
1

g3-12b-it-story-qlora

2
1

command-r-32b-Adventure-LoRA

2
0

supernova-medius-adventure-s-qlora

license:apache-2.0
2
0

tq14-unalign-test-ws

2
0

qwen32-rp-ws

license:apache-2.0
2
0

ms3-roselily-rp-Q4_K_S-GGUF

llama-cpp
2
0

gemma-2-24b-retrained-base-adapter

2
0

g3-12b-it-unalign-epoch2-Q6_K-GGUF

llama-cpp
2
0

g3-12b-storyteller-v0.1-epoch1-Q6_K-GGUF

llama-cpp
2
0

g3-27b-part1-glitter-Q4_K_S-GGUF

llama-cpp
2
0

g3-27b-merge-B-Q4_K_S-GGUF

llama-cpp
2
0

g3-27b-beepo-mmtest-Q4_K_S-GGUF

llama-cpp
2
0

starshine-simpo-test-1-Q6_K-GGUF

llama-cpp
2
0

starshine-simpo-test-2-Q6_K-GGUF

llama-cpp
2
0

starshine-simpo-test-3-Q6_K-GGUF

llama-cpp
2
0

glm4-glimmer-v0-merged-idkifthiswillwork-Q4_K_S-GGUF

ToastyPigeon/glm4-glimmer-v0-merged-idkifthiswillwork-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/glm4-glimmer-v0-merged-idkifthiswillwork` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
2
0

gemma-3-27b-medglitter-ero-Q4_K_S-GGUF

llama-cpp
2
0

negative-starlike-v2-Q4_K_S-GGUF

ToastyPigeon/negative-starlike-v2-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/negative-starlike-v2` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
2
0

sparkly-3.2-train

This model is a fine-tuned version of Gryphe/Codex-24B-Small-3.2 on the ToastyPigeon/cowriter-instruct, allura-org/EU01-S2, allenai/tulu-3-sft-personas-instruction-following, ToastyPigeon/mixed-medical-reasoning-formatted, ToastyPigeon/steve-and-marvin, ToastyPigeon/new-story-dataset, allura-org/fujin-instruct-v2, ToastyPigeon/some-rp-extended, ToastyPigeon/gutenberg-sft, ToastyPigeon/SpringDragon and ToastyPigeon/some-erotica datasets.

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 69
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- total_eval_batch_size: 2
- optimizer: OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- training_steps: 10

Framework versions:
- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.7.0+cu128
- Datasets 3.5.1
- Tokenizers 0.21.1

license:apache-2.0
2
0

another-strange-nemo-model

2
0

psyonic-cetacean-20b-v2-4.0bpw-h6-exl2

llama
1
1

BlackMagic-7B

1
1

mistral-small-adventure-qlora

1
1

granite-3.3-8b-creative

1
1

ms-type2-rp

1
0

TQ2.5-0.5B-Summary-ep1

1
0

not-for-human-consumption

1
0

ms-rp-test-revisit-e1

1
0

qwen2.5-32b-unnamed-test-model

1
0

g2-9b-creative-16k-Q6_K-GGUF

ToastyPigeon/g2-9b-creative-16k-Q6K-GGUF This model was converted to GGUF format from `Columbidae/g2-9b-creative-16k` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
1
0

gemma-2-24b-instruct-0.5ep-Q4_K_S-GGUF

ToastyPigeon/gemma-2-24b-instruct-0.5ep-Q4KS-GGUF This model was converted to GGUF format from `Columbidae/gemma-2-24b-instruct-0.5ep` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
1
0

g3-12b-storyteller-v0.1-epoch1

1
0

g3-12b-multimerge-test

1
0

g3-12b-pt-inkstruct-epoch1-mm-Q6_K-GGUF

llama-cpp
1
0

g3-12b-inkstructfetti-Q6_K-GGUF

llama-cpp
1
0

g3-27b-part1-glitter

1
0

g3-27b-merge-A-Q4_K_S-GGUF

llama-cpp
1
0

gemma-3-27b-experiment-storyteller

1
0

another-gemma3-abomination

1
0

gemma3-negative-starlike-Q4_K_S-GGUF

llama-cpp
1
0

negative-starlike-v2

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- ToastyPigeon/gemma3-27b-glitterlike-v2
- ToastyPigeon/negative-confetti

The following YAML configuration was used to produce this model:

1
0

anti-starlike-Q4_K_S-GGUF

ToastyPigeon/anti-starlike-Q4KS-GGUF This model was converted to GGUF format from `ToastyPigeon/anti-starlike` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
1
0

qwen3-18b-completion-trained-Q6_K-GGUF

ToastyPigeon/qwen3-18b-completion-trained-Q6K-GGUF This model was converted to GGUF format from `allura-forge/qwen3-18b-completion-trained` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details. Use with llama.cpp (installable through brew on Mac and Linux, or built with the `LLAMA_CURL=1` flag plus hardware-specific flags such as `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

llama-cpp
1
0

a-glm-train-mid-backup

40%-of-epoch checkpoint (~40M tokens seen). Producing some interesting output, but inconsistent; a potential target for stabilizing RL. Saving this in case it gets worse later.

1
0

a-glm-train-0.7ep-backup

1
0

another-glm-train

- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]

1
0

glm-books-qlora-2-2ep

license:mit
1
0

another-glm-train-2-epochs

1
0

ST-Presets-Mistral-Small

0
7

Beeper-King-22B

0
7

g3-12b-storyteller-v0.2-textonly

0
6

Gemma-3-Confetti-12B

0
3

Qwen3.5-27B-Marvin-DPO-V2

0
2

SpringDragon-NeMo-QLoRA-ep1

0
1

Captain-Adventure-32B

0
1

qwen-rp-test-h-qlora

license:apache-2.0
0
1

Sto-vo-kor-12B-LoRA

0
1

g3-12b-rp-system-v0.1

0
1

gemma-3-27b-experiment-v2-merge-B

0
1

nemo-12b-instruct-creative

0
1

Qwen3-16B-A3B-MixedData

0
1

gemma3-27b-v2-confettilike

0
1

gemma3-27b-v2-starlike

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- confettilike-mm
- glitterlike-mm

The following YAML configuration was used to produce this model:

0
1

gemma3-27b-glitterlike-v2

0
1

gemma3-negative-glitter

This is a merge of pre-trained language models created using mergekit, merged with the Linear merge method. The following models were included in the merge:
- ToastyPigeon/gemma3-27b-v2-glitterlike
- ToastyPigeon/medgemma-27b-abliterated-multimodal

The following YAML configuration was used to produce this model:

0
1