Doctor-Shotgun

79 models

MS3.2-24B-Magnum-Diamond-GGUF

GGUF quantization of Doctor-Shotgun/MS3.2-24B-Magnum-Diamond using llama.cpp. Please refer to the linked model for the full description.

This model follows the Mistral v7 Tekken prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like the sketch shown just below. Many inference libraries have the option to automatically prepend the BOS token. For sampler settings, I'd recommend starting with a simple setup.

Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Mistral v7 Tekken-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking.
- SillyTavern JSON - Magnum Mistral v7 Tekken No Names
- SillyTavern JSON - Magnum Mistral v7 Tekken Prefill
- SillyTavern JSON - Magnum Mistral v7 Tekken No Names Prefill

Thank you to gum1h0x (X/HF) for providing the compute used for training. Thank you to PocketDoc for the advanced prompt building strategy. Thank you to Delta-Vector and intervitens for testing this on 12B. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.
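The card's original inline prompt example did not survive this listing, so here is an illustrative reconstruction of a Mistral v7 Tekken-style roleplay input (my sketch, not the author's exact example; `{{user}}` and `{{char}}` are placeholder names, and the trailing `{{char}}:` is an optional prefill):

```
<s>[SYSTEM_PROMPT]You are {{char}} in a roleplay with {{user}}. Stay in character.[/SYSTEM_PROMPT][INST]{{user}}: Hey, do you have a minute?[/INST]{{char}}: "Always," she says, setting down her pen.</s>[INST]{{user}}: Good - follow me.[/INST]{{char}}:
```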

NaNK
license:apache-2.0
4,212
16

Misc-Models

2,954
4

CalliopeDS-L2-13B

NaNK
llama
1,140
7

L3.3-70B-Magnum-Diamond-GGUF

NaNK
license:llama3.3
940
3

Nous-Hermes-Llama2-13b-Kimiko-Lora-Merged

NaNK
llama
725
8

airoboros-2.2.1-y34b

NaNK
llama
724
2

mythospice-limarp-70b

NaNK
llama
721
2

mythospice-70b

NaNK
llama
719
3

limarpv3-llama2-70b-qlora

NaNK
llama
715
2

CalliopeDS-v2-L2-13B

NaNK
llama
715
0

MS3.2-24B-Magnum-Diamond

Magnum "Diamond" in reference to the intense heat and pressure (generated through matrix multiplications) needed to turn the coal-esque material of dry, assistant-tuned models into creative writing gems! This model is finetuned from a text-only conversion of mistralai/Mistral-Small-3.2-24B-Instruct-2506 as an rsLoRA adapter. It uses the same data mix as Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha, however with pre-tokenization and modifications to the custom loss masking. The goal was to re-create the model at a smaller, more consumer-friendly size. This model should perform competently with or without prepending character names, and with or without prefill. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output. This is a minor version update over Doctor-Shotgun/MS3.1-24B-Magnum-Diamond utilizing the new official instruct model from June 2025. Here's the rsLoRA adapter for those merge-makers out there to play with. This model follows the Mistral v7 Tekken prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like this: Many inference libraries have the option to automatically prepend the BOS token ` `. For sampler settings, I'd recommend starting with a simple: Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior set to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Mistral v7 Tekken-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking. SillyTavern JSON - Magnum Mistral v7 Tekken No Names SillyTavern JSON - Magnum Mistral v7 Tekken Prefill SillyTavern JSON - Magnum Mistral v7 Tekken No Names Prefill Thank you to gum1h0x (X/HF) for providing the compute used for training. Thank you to PocketDoc for the advanced prompt building strategy, as well as Delta-Vector and intervitens for helping experiment on it. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support. This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice. There was a weird loss spike of unclear significance on one sample that was not seen using the same dataset on Mistral Small 3.1 Instruct, but the resulting model appears to be sane. The following hyperparameters were used during training: - learningrate: 2e-05 - trainbatchsize: 1 - evalbatchsize: 1 - seed: 42 - gradientaccumulationsteps: 16 - totaltrainbatchsize: 16 - optimizer: Use pagedademamix8bit and the args are: No additional optimizer arguments - lrschedulertype: cosine - lrschedulerwarmupsteps: 40 - numepochs: 2.0

NaNK
license:apache-2.0
489
43

ML2-123B-Magnum-Diamond-GGUF

NaNK
418
4

TinyLlama-1.1B-32k-Instruct

NaNK
llama
394
13

L3.3-70B-Magnum-Diamond

Magnum "Diamond" in reference to the intense heat and pressure (generated through matrix multiplications) needed to turn the coal-esque material of dry, assistant-tuned models into creative writing gems! This model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. It uses the same data mix as Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha, however with pre-tokenization and modifications to the custom loss masking. It's for all intents and purposes a version update to the former model. This model should perform competently with or without prepending character names, and with or without prefill. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output. Here's the rsLoRA adapter for those merge-makers out there to play with. This model follows the Llama 3 prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like this: Many inference libraries have the option to automatically prepend the BOS token ` `. For sampler settings, I'd recommend starting with a simple: Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior set to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking. SillyTavern JSON - Magnum L3 Instruct No Names Prefill Compute paid for from the wallet of yours truly, Doctor Shotgun. Thank you to PocketDoc for the advanced prompt building strategy, as well as Delta-Vector and intervitens for helping experiment on it. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support. This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice. The following hyperparameters were used during training: - learningrate: 4e-05 - trainbatchsize: 2 - evalbatchsize: 2 - seed: 42 - distributedtype: multi-GPU - numdevices: 8 - totaltrainbatchsize: 16 - totalevalbatchsize: 16 - optimizer: Use pagedademamix8bit and the args are: No additional optimizer arguments - lrschedulertype: cosine - lrschedulerwarmupsteps: 40 - numepochs: 2.0

NaNK
llama
384
3

MS3.1-24B-Magnum-Diamond-GGUF

GGUF quantization of Doctor-Shotgun/MS3.1-24B-Magnum-Diamond using llama.cpp. Please refer to the linked model for the full description.

This model follows the Mistral v7 Tekken prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like the format sketched under the MS3.2-24B-Magnum-Diamond-GGUF entry above. Many inference libraries have the option to automatically prepend the BOS token. For sampler settings, I'd recommend starting with a simple setup.

Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Mistral v7 Tekken-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking.
- SillyTavern JSON - Magnum Mistral v7 Tekken No Names
- SillyTavern JSON - Magnum Mistral v7 Tekken Prefill
- SillyTavern JSON - Magnum Mistral v7 Tekken No Names Prefill

Thank you to kalomaze for providing the compute used for training. Thank you to ZeroAgency for the text-only model conversion. Thank you to PocketDoc for the advanced prompt building strategy. Thank you to Delta-Vector and intervitens for testing this on 12B. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.
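As a minimal sketch of loading one of these GGUF files locally, assuming the llama-cpp-python bindings (the quant filename and sampler values are placeholders, not the author's recommendations):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="MS3.1-24B-Magnum-Diamond-Q5_K_M.gguf",  # pick whichever quant you downloaded
    n_ctx=16384,       # context window; adjust to available memory
    n_gpu_layers=-1,   # offload all layers to the GPU if they fit
)

# BOS is typically prepended automatically by the library, so the prompt
# starts at the system block of the Mistral v7 Tekken format.
prompt = (
    "[SYSTEM_PROMPT]You are a creative roleplay partner.[/SYSTEM_PROMPT]"
    "[INST]Hello there![/INST]"
)
out = llm(prompt, max_tokens=256, temperature=1.0, min_p=0.05)
print(out["choices"][0]["text"])
```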

NaNK
license:apache-2.0
332
4

L3.3-70B-Magnum-v4-SE

NaNK
llama
330
16

L3.3-70B-Magnum-Nexus

NaNK
llama
223
9

NoobAI XL Merges

Various merges built on Laxhar Lab's Illustrious-xl-based text-to-image model, uploaded for testing purposes. These are provided as-is, and YMMV. The user is responsible for any outputs produced using these checkpoints.

Other models involved in these merges include:
- comin/IterComp
- CyberRealistic XL

Perpendicular merges are done via sd-mecha using the Python API (a conceptual sketch of the perpendicular operation follows this card), for example:
1) Merge noobaiXLNAIXLvPred10Version-cyberrealistic4-perpendicular
2) Add vpred and ztsnr keys to the resulting model for autodetection in Comfy/Forge
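The actual sd-mecha recipe is not reproduced in this listing, and sd-mecha's current API is not shown here, so the following is only a conceptual sketch of what a "perpendicular" merge does, written in plain PyTorch over individual weight tensors: the secondary model's delta from a shared base is stripped of its component parallel to the primary model's delta, so only the orthogonal part is added.

```python
import torch

def add_perpendicular(a: torch.Tensor, b: torch.Tensor, base: torch.Tensor,
                      alpha: float = 1.0) -> torch.Tensor:
    """Conceptual per-tensor perpendicular merge (illustrative; not sd-mecha's API).

    a    : weight from the primary model (e.g. the NoobAI checkpoint)
    b    : weight from the secondary model (e.g. CyberRealistic XL)
    base : weight from the shared base model
    Only the part of (b - base) orthogonal to (a - base) is added onto a.
    """
    da = (a - base).flatten().float()
    db = (b - base).flatten().float()
    denom = da.dot(da)
    if denom == 0:
        return a
    parallel = (db.dot(da) / denom) * da   # projection of db onto da
    perpendicular = db - parallel
    merged = a.flatten().float() + alpha * perpendicular
    return merged.reshape(a.shape).to(a.dtype)
```

Applied key-by-key across the three checkpoints' state dicts, this is the general shape of the operation the recipe performs.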

195
16

ML2-123B-Magnum-Diamond

NaNK
64
8

Magnum-v4-SE-70B-LoRA

The Magnum v4 series is complete, but here's something a little extra I wanted to tack on, as I wasn't entirely satisfied with the results of v4 72B. "SE" for Special Edition - this model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. The dataset is a slightly revised variant of the v4 data with some elements of the v2 data re-introduced. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: paged_ademamix_8bit (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 40
- num_epochs: 2

Framework versions:
- PEFT 0.14.0
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0

NaNK
llama
55
4

L3.3-70B-Magnum-v5-SFT-Alpha-GGUF

GGUF quantization of Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha using llama.cpp release b5415. Please refer to the linked model for the full description.

This model follows the Llama 3 prompt format. Prefill is recommended for this model, while prepending names is optional - mess around with it and find your preference. A typical input would look like the format sketched under the L3.3-70B-Magnum-Diamond entry above. Many inference libraries have the option to automatically prepend the BOS token. For sampler settings, I'd recommend starting with a simple setup.

Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking.
- SillyTavern JSON - Magnum L3 Instruct No Names Prefill

Training compute paid for from the wallet of yours truly, Doctor Shotgun. Additional compute for quantization provided by kalomaze. Thank you to PocketDoc for the advanced prompt building strategy. Thank you to Delta-Vector and intervitens for testing this on 12B. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

NaNK
license:llama3.3
36
0

YuE-s1-7B-anneal-en-cot-exl2

NaNK
llama
34
12

Mistral-Large-3-675B-Instruct-2512-GGUF

NaNK
27
0

DeepSeek-V3.2-dense-attn-imatrix

25
0

GLM-4.5-Air-exl3_5.0bpw-h6

NaNK
exllamav3
23
2

MS3.1-24B-Magnum-Diamond

Magnum "Diamond" in reference to the intense heat and pressure (generated through matrix multiplications) needed to turn the coal-esque material of dry, assistant-tuned models into creative writing gems! This model is finetuned from a text-only conversion of mistralai/Mistral-Small-3.1-24B-Instruct-2503 as an rsLoRA adapter. It uses the same data mix as Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha, however with pre-tokenization and modifications to the custom loss masking. The goal was to re-create the model at a smaller, more consumer-friendly size. This model should perform competently with or without prepending character names, and with or without prefill. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output. Here's the rsLoRA adapter for those merge-makers out there to play with. This model follows the Mistral v7 Tekken prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like this: Many inference libraries have the option to automatically prepend the BOS token ` `. For sampler settings, I'd recommend starting with a simple: Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior set to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Mistral v7 Tekken-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking. SillyTavern JSON - Magnum Mistral v7 Tekken No Names SillyTavern JSON - Magnum Mistral v7 Tekken Prefill SillyTavern JSON - Magnum Mistral v7 Tekken No Names Prefill Thank you to kalomaze for providing the compute used for training. Thank you to ZeroAgency for the text-only model conversion. Thank you to PocketDoc for the advanced prompt building strategy, as well as Delta-Vector and intervitens for helping experiment on it. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support. This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice. The following hyperparameters were used during training: - learningrate: 2e-05 - trainbatchsize: 1 - evalbatchsize: 1 - seed: 42 - distributedtype: multi-GPU - numdevices: 4 - gradientaccumulationsteps: 4 - totaltrainbatchsize: 16 - totalevalbatchsize: 4 - optimizer: Use pagedademamix8bit and the args are: No additional optimizer arguments - lrschedulertype: cosine - lrschedulerwarmupsteps: 40 - numepochs: 2.0

NaNK
license:apache-2.0
19
5

YuE-s2-1B-general-exl2

m-a-p/YuE-s2-1B-general quantized with Exllamav2. It appears to remain coherent using the default calibration data without adding audio tokens. Intended to be used with the WIP exl2 inference repository for YuE.

NaNK
llama
17
7

TinyLlama-1.1B-32k

NaNK
llama
11
30

lzlv-limarpv3-l2-70b

NaNK
llama
9
11

mistral-v0.1-7b-pippa-metharme-lora

NaNK
license:apache-2.0
8
3

Qwen3-30B-A3B-Instruct-2507-ScatterMoE

Re-packed weights of Qwen/Qwen3-30B-A3B-Instruct-2507 using Charles Goddard's remote code implementation of scattermoe, including scripts to convert to and from standard `Qwen3MoeForCausalLM`. Thank you to intervitens for assistance with memory-efficient conversion scripts!

This is intended to be used as a drop-in replacement for efficient training with any `transformers`-based training repository. Optional monkeypatches are included for Liger Kernel and Cut Cross-Entropy; simply rename the relevant modeling file to `modeling_qwen3_shared_moe.py`.
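A minimal sketch of pulling the repack into a training script using standard `transformers` calls; the exact integration depends on your training framework, and `trust_remote_code` is needed because the ScatterMoE modeling code ships with the repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Doctor-Shotgun/Qwen3-30B-A3B-Instruct-2507-ScatterMoE"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # loads the ScatterMoE modeling code bundled with the repo
    device_map="auto",
)
model.gradient_checkpointing_enable()  # typical memory saving for fine-tuning; optional
```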

NaNK
license:apache-2.0
8
1

Qwen3-Coder-30B-A3B-Instruct-ScatterMoE

NaNK
license:apache-2.0
8
1

Qwen3-30B-A3B-Thinking-2507-ScatterMoE

NaNK
license:apache-2.0
7
0

Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-DARE-TIES

NaNK
6
5

GLM-4.5-Air-exl3_3.14bpw-h6

NaNK
6
2

MS3.1-24B-Magnum-Diamond-LoRA

NaNK
license:apache-2.0
6
1

MS3.2-24B-Magnum-Diamond-LoRA

Magnum "Diamond" in reference to the intense heat and pressure (generated through matrix multiplications) needed to turn the coal-esque material of dry, assistant-tuned models into creative writing gems! This model is finetuned from a text-only conversion of mistralai/Mistral-Small-3.2-24B-Instruct-2506 as an rsLoRA adapter. It uses the same data mix as Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha, however with pre-tokenization and modifications to the custom loss masking. The goal was to re-create the model at a smaller, more consumer-friendly size. This model should perform competently with or without prepending character names, and with or without prefill. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output. This is a minor version update over Doctor-Shotgun/MS3.1-24B-Magnum-Diamond-LoRA utilizing the new official instruct model from June 2025. This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice. There was a weird loss spike of unclear significance on one sample that was not seen using the same dataset on Mistral Small 3.1 Instruct, but the resulting model appears to be sane. The following hyperparameters were used during training: - learningrate: 2e-05 - trainbatchsize: 1 - evalbatchsize: 1 - seed: 42 - gradientaccumulationsteps: 16 - totaltrainbatchsize: 16 - optimizer: Use pagedademamix8bit and the args are: No additional optimizer arguments - lrschedulertype: cosine - lrschedulerwarmupsteps: 40 - numepochs: 2.0 - PEFT 0.15.2 - Transformers 4.51.3 - Pytorch 2.7.1+cu128 - Datasets 3.5.1 - Tokenizers 0.21.1

NaNK
license:apache-2.0
6
0

cat-v1.0-13b

NaNK
llama
5
26

Qwen3 235B A22B Instruct 2507 Exl3 3.07bpw H6 Custom

NaNK
exllamav3
5
5

Magnum-v5-70B-SFT-Alpha-LoRA

NaNK
llama
5
1

L3.3-70B-Magnum-Diamond-LoRA

Magnum "Diamond" in reference to the intense heat and pressure (generated through matrix multiplications) needed to turn the coal-esque material of dry, assistant-tuned models into creative writing gems! This model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. It uses the same data mix as Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha, however with pre-tokenization and modifications to the custom loss masking. It's for all intents and purposes a version update to the former model. This model should perform competently with or without prepending character names, and with or without prefill. The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output. This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice. The following hyperparameters were used during training: - learningrate: 4e-05 - trainbatchsize: 2 - evalbatchsize: 2 - seed: 42 - distributedtype: multi-GPU - numdevices: 8 - totaltrainbatchsize: 16 - totalevalbatchsize: 16 - optimizer: Use pagedademamix8bit and the args are: No additional optimizer arguments - lrschedulertype: cosine - lrschedulerwarmupsteps: 40 - numepochs: 2.0 - PEFT 0.15.2 - Transformers 4.51.3 - Pytorch 2.6.0+cu124 - Datasets 3.5.1 - Tokenizers 0.21.1

NaNK
llama
5
1

smol_llama-220M-GQA-32k-theta-sft-limarp

llama
5
0

Qwen3-235B-A22B-Instruct-2507-ScatterMoE

Re-packed weights of Qwen/Qwen3-235B-A22B-Instruct-2507 using Charles Goddard's remote code implementation of scattermoe, including scripts to convert to and from standard `Qwen3MoeForCausalLM`. Thank you to intervitens for assistance with memory-efficient conversion scripts!

This is intended to be used as a drop-in replacement for efficient training with any `transformers`-based training repository. Optional monkeypatches are included for Liger Kernel and Cut Cross-Entropy; simply rename the relevant modeling file to `modeling_qwen3_shared_moe.py`.

NaNK
license:apache-2.0
5
0

Nous-Capybara-limarpv3-34B

NaNK
llama
4
34

Chronos-Hermes-v2-13b-Limarp-Lora-Merged

NaNK
llama
4
1

L3.3-70B-Magnum-Nexus-GGUF

GGUF quantization of Doctor-Shotgun/L3.3-70B-Magnum-Nexus using llama.cpp release b5359. Please refer to the linked model for the full description.

This model follows the Llama 3 prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like the format sketched under the L3.3-70B-Magnum-Diamond entry above. Many inference libraries have the option to automatically prepend the BOS token. For sampler settings, I'd recommend starting with a simple setup.

Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking.
- SillyTavern JSON - Magnum L3 Instruct No Names Prefill

Compute for Doctor-Shotgun/L3.3-70B-Magnum-v4-SE and Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Alpha funded by Doctor-Shotgun. Thank you to kalomaze for providing the compute used for Doctor-Shotgun/L3.3-70B-Magnum-v5-SFT-Gamma. Thank you to PocketDoc for the advanced prompt building strategy, as well as Delta-Vector and intervitens for helping experiment on it. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

NaNK
license:llama3.3
4
0

L3.3-70B-Magnum-v4-SE-GGUF

GGUF quantization of Doctor-Shotgun/L3.3-70B-Magnum-v4-SE using llama.cpp release b5415. Please refer to the linked model for the full description.

This model follows the Llama 3 prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like the format sketched under the L3.3-70B-Magnum-Diamond entry above. Many inference libraries have the option to automatically prepend the BOS token. For sampler settings, I'd recommend starting with a simple setup.

Here are my customized SillyTavern presets for Magnum. Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to `Never include examples` on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer. Prefill (Last Assistant Prefix) can be modified to your liking.
- SillyTavern JSON - Magnum L3 Instruct No Names Prefill

Training compute paid for from the wallet of yours truly, Doctor Shotgun. Additional compute for quantization provided by kalomaze. Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models. Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals. Thank you to the members of Anthracite for the datasets and support.

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

NaNK
license:llama3.3
4
0

Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss

Experimental model, using a LimaRP QLoRA trained at 10k context length (greater than the size of the longest LimaRP sample when tokenized via Mistral's tokenizer) on mistralai/Mixtral-8x7B-v0.1 using Charles Goddard's ZLoss and Megablocks-based fork of transformers, then fused to mistralai/Mixtral-8x7B-Instruct-v0.1 at 0.5 weight. This seems to avoid the Mixtral looping pitfalls for me so far. Play around with it and see what works well for you.

Exl2 quants courtesy of LoneStriker:
- 2.4bpw
- 3.0bpw
- 3.5bpw
- 3.75bpw
- 4.0bpw
- 5.0bpw
- 6.0bpw

Usage: The intended prompt format is the Alpaca instruction format of LimaRP v3 (an illustrative sketch follows this card). My current templates have been uploaded to a folder.

Message length control: Due to the inclusion of LimaRP v3, it is possible to append a length modifier to the response instruction sequence (see the sketch below this card). This has an immediately noticeable effect on bot responses. The available lengths are: `micro, tiny, short, medium, long, massive, huge, enormous, humongous, unlimited`. The recommended starting length is `medium`. Keep in mind that the AI may ramble or impersonate the user with very long messages.

Bias, Risks, and Limitations: The model will show biases similar to those observed in niche roleplaying forums on the Internet, besides those exhibited by the base model. It is not intended for supplying factual information or advice in any form.

Training Details: This model is a merge. Please refer to the linked repositories of the merged models for details.
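The card's inline examples aren't preserved in this listing, so here is an illustrative reconstruction of the LimaRP v3 Alpaca-style prompt with a length modifier appended to the response instruction (my sketch of the general shape, with placeholder text; the exact field wording comes from the author's uploaded templates):

```
### Instruction:
Character's Persona: {{char}} is a weary innkeeper with a dry sense of humor.
User's Persona: {{user}} is a traveling merchant passing through town.
Scenario: {{user}} arrives at the inn late at night looking for a room.
Play the role of {{char}} in this roleplay with {{user}}.

### Input:
{{user}}: "Any rooms left for the night?"

### Response: (length = medium)
{{char}}:
```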

NaNK
license:apache-2.0
3
18

airoboros-2.2.1-limarpv3-y34b

NaNK
llama
3
4

limarp-miqu-1-70b-qlora

NaNK
llama
3
4

smol_llama-220M-GQA-32k-theta-sft

llama
3
2

smol_llama-220M-GQA-32k-theta

llama
3
1

limarp-zloss-mixtral-8x7b-qlora

NaNK
license:apache-2.0
2
2

limarp-deepseek-67b-qlora

NaNK
llama
2
0

no-robots-y34b-lora

NaNK
llama
1
5

llama-2-13b-chat-limarp-v2-merged

NaNK
llama
1
4

ds-smol-brew-7b

NaNK
llama
1
4

Mixtral-8x7B-Instruct-v0.1-limarp

NaNK
license:apache-2.0
1
3

CalliopeDS-L2-13B-exl2

NaNK
llama
1
2

Nous-Hermes-Llama2-13b-Limarp-Lora-Merged

NaNK
llama
1
1

ds-brew-13b

NaNK
llama
1
1

ds-spicy-brew-13b

NaNK
llama
1
1

mistral-v0.1-supercot-lora

NaNK
license:mit
1
1

smol_llama-220M-GQA-32k-linear

llama
1
0

L3.3-70B-Magnum-v5-SFT-Alpha

NaNK
llama
1
0

GLM-4.5-Air-exl3_3.08bpw-h6

NaNK
1
0

ds-LoRA

0
26

limarpv3-yi-llama-34b-lora

NaNK
llama
0
10

Norobara-ZLoss-8x7B

NaNK
0
5

Chronohermes-Grad-L2-13b

NaNK
llama
0
4

Mixtral-8x7B-Instruct-v0.1-limarp-exl2

NaNK
license:apache-2.0
0
3

NoobAI-XL-Character-Lora

license:apache-2.0
0
3

Euryale-1.3-limarpv3-L2-70B

NaNK
llama
0
2

Euryale-1.3-limarpv3-L2-70B-exl2

NaNK
llama
0
2

lzlv-limarpv3-l2-70b-exl2

NaNK
llama
0
2

WinterGoddess-1.4x-limarpv3-70B-L2

NaNK
llama
0
2

SDXL-PD6-Character-Lora-1024

0
2

airoboros-limarpv3-l2-70b-gpt4-1.4.1-exl2

NaNK
llama
0
1

L3.3-70B-Magnum-v4-SE-Cirrus-x1-ModelStock

NaNK
llama
0
1