SicariusSicariiStuff
Zion_Alpha_Instruction_Tuned_SLERP
ZionAlpha is the first REAL Hebrew model in the world. This version was fine-tuned for tasks. I did the fine-tune using SOTA techniques and insights from my years of underwater basket weaving. If you wanna offer me a job, just add me on Facebook.

Another world record broken by ZionAlpha! On June 10th, 2024, this model achieved the highest sentiment analysis score in the world for Hebrew LLMs, with an impressive 70.3, surpassing even a 35B model five times its size!

Future Plans

My previous LLM, ZionAlpha, set a world record on Hugging Face by achieving the highest SNLI score for Hebrew open LLMs at 84.05. The current model, a SLERP merge, achieved a lower SNLI score but still surprised everyone by securing the highest sentiment analysis score of 70.3. This demonstrates significant untapped potential in optimizing the training process, showing that 7B models can deliver far more performance in Hebrew than previously thought possible. This will be my last Hebrew model for a while, as I have other adventures to pursue.

Looking for Sponsors

Since all my work is done on-premises, I am constrained by my current hardware. I would greatly appreciate any support in acquiring an A6000, which would enable me to train significantly larger models much faster.

Contact Details

I'm not great at self-marketing (to say the least) and don't have any social media accounts. If you'd like to reach out, you can email me at [email protected]. Please note that this email might receive more messages than I can handle, so I apologize in advance if I can't respond to everyone.

Versions and QUANTS

- Base model: FP16
- Instruction tuned: FP16 | GGUF

Model architecture

Based on Mistral 7B. I didn't even bother to alter the tokenizer.
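For context, SLERP (spherical linear interpolation), the merge method mentioned above, blends two checkpoints along the arc between their weight vectors rather than along the straight chord used by plain averaging, which better preserves weight norms. A minimal NumPy sketch of the operation (illustrative only, not the exact merge recipe or tooling used for this model):

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns `a`, t=1 returns `b`; intermediate values follow the arc
    between the two vectors' directions instead of the straight chord.
    """
    a_u = a / np.linalg.norm(a)
    b_u = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_u, b_u), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Vectors are nearly parallel: plain linear interpolation is stable.
        return (1.0 - t) * a + t * b
    s = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / s) * a + (np.sin(t * omega) / s) * b

# Halfway between two orthogonal unit vectors stays on the unit sphere:
mid = slerp(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5)
```

In practice, merge tools apply this per-tensor (often with per-layer interpolation weights) across the two models' state dicts.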
The recommended prompt setting is Debug-deterministic.

Unscripted video: live zero-shot demonstration of story-writing capabilities in Hebrew. ZionAlpha VS Mistral 'Hebrew', live & unscripted in real time. Long text translation.

History

The model was originally trained about two months after Mistral (v0.1) was released. As of 04 June 2024, ZionAlpha has the highest SNLI score in the world among open-source models in Hebrew (84.05), surpassing most models by a huge margin.

- My Ko-fi page: ALL donations go toward research resources and compute, every bit counts 🙏🏻
- My Patreon: ALL donations go toward research resources and compute, every bit counts 🙏🏻
Impish_Nemo_12B_GGUF
Impish_Nemo_12B_iMatrix
Wingless_Imp_8B
Wingless offender, birthed from sin and mischief,
She smells degeneracy—and gives it a sniff.
No flight, just crawling through the gloom,
Producing weird noises that are filling your room.

Fetid breath exhaling her design,
She is not winged anymore—
But it suits her just fine.

No feathers, no grace,
just raw power's malign
"I may have lost my soul—
but yours is now mine".
She sinned too much, even for her kind,
Her impish mind—
Is something that is quite hard to find.

No wings could contain—
Such unbridled raw spite,
Just pure, unfiltered—
Weaponized blight.

- Original: FP16
- GGUF: Static Quants | iMatrix GGUF | High-Attention | iMatrix-High-Attention
- EXL2: 3.5 bpw | 4.0 bpw | 5.0 bpw | 6.0 bpw | 7.0 bpw | 8.0 bpw
- Specialized: FP8
- Mobile (ARM): Q4_0 | Q4_0 High-Attention

---

TL;DR
- Highest-rated 8B model according to a closed external benchmark. See details at the bottom of the page.
- High IFEval for an 8B model that is not too censored: 74.30.
- Strong roleplay: internet RP format lovers will appreciate it; medium-size paragraphs (as requested by some people).
- Very coherent in long context, thanks to the Llama 3.1 models.
- Lots of knowledge from all the merged models.
- Very good writing, from lots of book data and creative writing in the late SFT stage of the merged models (some of the merged models were further fine-tuned).
- Feels smart — the combination of high IFEval and the knowledge from the merged models shows up.
- Unique feel due to the merged models; no SFT was done to alter it, because I liked it as it is.
- Intended use: Role-Play, Creative Writing, General Tasks.

This model was trained with lots of weird data in various stages, and then merged with my best models. Llama 3 and 3.1 architectures were merged together, and then trained on some more weird data. The following models were used in various stages of the model creation process:

- Impish_Mind_8B
- LLAMA-3_8B_Unaligned_BETA
- Dusk_Rainbow (LLAMA-3)

Full generation settings: Debug Deterministic.

Roleplay settings: a good repetition penalty range is between 1.12 and 1.15; feel free to experiment. With these settings, each output message should be neatly displayed in 1-3 paragraphs, 1-2 being the most common. A single paragraph will be output as a response to a simple message ("What was your name again?").
minP for RP works too but is more likely to put everything in one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details.)

It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal amount of tokens. It's a modification of an old-skool CAI-style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text):

Your support = more models
My Ko-fi page (Click here)

| Metric             |Value|
|--------------------|----:|
|Avg.                |26.94|
|IFEval (0-Shot)     |74.30|
|BBH (3-Shot)        |30.59|
|MATH Lvl 5 (4-Shot) |12.16|
|GPQA (0-shot)       | 4.36|
|MuSR (0-shot)       |10.89|
|MMLU-PRO (5-shot)   |29.32|

On the 17th of February, 2025, I became aware that the model was ranked 1st in the world among 8B models in a closed external benchmark.

Other stuff
- SLOP_Detector: Nuke GPTisms with SLOP detector.
- LLAMA-3_8B_Unaligned: The grand project that started it all.
- Blog and updates (Archived): Some updates, some rambles, sort of a mix between a diary and a blog.
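The sampler recommendation above can be sketched in code. A hedged example using llama-cpp-python: the GGUF filename below is a placeholder assumption, the temperature and token limit are illustrative, and only the repetition penalty comes from the stated 1.12-1.15 range:

```python
# Recommended sampler settings for the model, collected in one place.
# Only repeat_penalty comes from the card; the rest are assumptions to tune.
SAMPLER_SETTINGS = {
    "repeat_penalty": 1.13,  # card recommends 1.12 - 1.15
    "temperature": 0.8,      # assumption; adjust to taste
    "max_tokens": 512,       # assumption
}

def chat(prompt: str, model_path: str = "Wingless_Imp_8B.Q4_K_M.gguf") -> str:
    """Run one completion with the settings above.

    Requires `pip install llama-cpp-python` and a local GGUF file;
    the default path here is a placeholder, not a real filename.
    """
    from llama_cpp import Llama  # imported lazily: settings usable standalone
    llm = Llama(model_path=model_path)
    out = llm.create_completion(prompt, **SAMPLER_SETTINGS)
    return out["choices"][0]["text"]
```

The same values map directly onto SillyTavern or any other front-end's sampler fields.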
Impish_Magic_24B_GGUF
Impish_Nemo_12B
August 2025, Impish_Nemo_12B — my best model yet. And unlike a typical Nemo, this one can take much higher temperatures (works well with 1+). Oh, and regarding following the character card: it has somehow gotten even better, to the point of being straight-up uncanny 🙃 (I had to check twice that this model was loaded, and not some 70B!)
I feel like this model could easily replace models much larger than itself for adventure or roleplay (for assistant tasks, obviously not), but the creativity here? Off the charts. Characters have never felt so alive and in the moment before — they'll use insinuation, manipulation, and, if needed (or provoked), force. They feel so very present. That look on Neo's face when he opened his eyes and said, "I know Kung Fu"? Well, Impish_Nemo_12B had pretty much the same moment — and it now knows more than just Kung Fu, much, much more. It wasn't easy, and it's a niche within a niche, but as promised almost half a year ago, it is now done. Impish_Nemo_12B is smart, sassy, creative, and has a lot of unhingedness too — these traits are baked in deep into every interaction. It took Mistral's innate relative freedom and turned it up to 11. It may very well be too much for many, but after testing and interacting with so many models, I find this 'edge' of sorts rather fun and refreshing. Anyway, the dataset used is absolutely massive, with tons of new types of data and new domains of knowledge (Morrowind fandom, fighting, etc.). The whole dataset is a very well-balanced mix, and it resulted in a model with extremely strong common sense for a 12B. Regarding response length: there's almost no response-length bias here; this one is very much dynamic and will easily adjust reply length based on 1-3 examples of provided dialogue. Oh, and the model comes with 3 new character cards: 2 roleplay and 1 adventure! It has to be asked: why even bother tuning this "ancient" (released over a year ago) 12B model? OpenAI released the first model in the world to outperform Phi-3.5 in Muh Safety, and Chinese models have made us completely forget that other models even exist — an era of such abundance that if one had been told about it a mere year ago, no one would've believed it. Voice models, image generation and editing (Qwen-Image🔥), video... So why?
Because 12B Nemo is a well-balanced model, Apache 2.0 licensed, pretty neutral in terms of safety and political lean, runnable by anyone (small enough that offloading isn't a complete pain), and because I had a very specific thing in mind I wanted to test — something Nemo was ideal for, due to all the above. More importantly, I wanted to do an experiment, to see how far a decent model can be taken with the right tuning, and how well it can integrate fandom knowledge it knows almost nothing about. Oh, and almost no one even bothers to tune it anymore, so why not give it some much-needed love while I'm at it? So basically, I wanted to achieve something that seems almost impossible: adding new fandom knowledge without pretraining (CPT and actual pretraining are NOT the same), without incurring catastrophic forgetting and without lobotomy. To change the language bias in story writing, and to change it even more drastically for adventure and roleplay. I will say it again: without lobotomy. I knew I could change the language style and vocab drastically — I've done so very successfully with Phi-lthy — but that included more extreme measures that resulted in a loss of some capabilities (and new emerging properties — more info in the Phi-lthy model card above). The problem was how to achieve all of the above without the model losing brain cells and, "maybe, just maybe..." even adding to and enhancing the model's intelligence. Basically — the holy grail of model tuning. To do so, I used an absolutely massive dataset — more than 1B tokens — along with a huge amount of data engineering and multi-stage fine-tuning (not a LoRA, obviously), and the result... was astounding. Of course, praising your own model is kinda cringe, for sure, but I will say this: this is by far the model I've had the most fun interacting with — to an absurd extent.
For comparison, while my NegativeLLAMA70B is very good and still popular to this day (over 300 merges, numerous downloads, etc.), I would dare say that Impish_Nemo_12B feels way more fun than my own 70B, orders of magnitude more creative (NegativeLLAMA70B's writing is a bit dry, for my taste), and outright has the most sovl of any model I've made so far. And we're comparing a 12B to a 70B. In other words, even though I can leisurely run NegativeLLAMA70B locally, I prefer chatting with Impish_Nemo_12B — it is that good (take this with a grain of salt; highly subjective, and all of that). The amount of effort to create this model was absolutely absurd. I started with a Gemma 12B fine-tune, but one epoch would've taken six days, and I had to do multiple different phases and merging with the idea I had in mind, so doing the same for Gemma would've taken over a month. Maybe I'll still do it — we'll see. I will say this: if this model had been made a year ago, when Nemo was initially released, Anthropic might have lost a few gooners, hehe. But to be fully transparent, I couldn't have done it a year ago. My job — the "mission" I'd given myself — was pretty much done with the success of Impish_LLAMA_4B: "Making interesting and engaging AI models accessible for everyone." So now, ironically, when there was nothing left I 'had to' do, I made my best model to date — because I wanted to. Such a cliché, yet true nonetheless 🙃 The roleplay community is a very small niche community that, in the grand scale of things, no one cares too much about (various AI labs have expressed their distaste for the fact that their models are being used for gooning instead of math — folks probably haven't heard about Rule #34). But an even smaller community is that of Morrowind, and an even smaller one is that same group, minus those who hate AI. To conclude: this model was made for 0.001% of the population, but ironically, many users will still probably like it and find it very refreshing.
TL;DR
- My best model yet! Lots of sovl!
- Smart, sassy, creative, and unhinged — without the brain damage.
- Bulletproof temperature: can take much higher temperatures than vanilla Nemo.
- Feels close to old CAI, as the characters are very present and responsive.
- Incredibly powerful roleplay & adventure model for the size.
- Does adventure insanely well for its size!
- Characters have massively upgraded agency!
- Over 1B tokens trained, carefully preserving intelligence — even upgrading it in some aspects.
- Based on a lot of the data in Impish_Magic_24B and Impish_LLAMA_4B + some upgrades.
- Excellent assistant — so many new assistant capabilities I won't even bother listing them here, just try it.
- Less positivity bias; all lessons from the successful NegativeLLAMA70B style of data learned & integrated, with serious upgrades added — and it shows!
- Trained on an extended 4chan dataset to add humanity.
- Dynamic response length (1-3 paragraphs, usually 1-2). Length is adjustable via 1-3 examples in the dialogue. No more rigid short-bias!

It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal amount of tokens. It's a modification of an old-skool CAI-style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text):

- Calanthe (The Australian Overseer at a rare-earth extraction penal colony. She's got 6-pack abs, but no mercy.)
- Alexis (The diabolic reconnaissance officer, trying to survive the Safari experience.)
- Morrowind - Hilde the Nordish Gladiator (Fighting in the Arena in Vivec City, Morrowind, for blood and honor.)
- Morrowind - Male Orc (An Orc who wants to get to Balmora from Seyda Neen.)
- Morrowind - Female Breton (A female Breton with an impressive... heart, who wants to join the Mages Guild in Balmora.)
- Morrowind - Male Bosmer (A male Bosmer who was just released from prison.
Everyone assumes you're a thief and a degenerate.)
- Morrowind - Male Redguard (A male Redguard trying to get his shit together and find a decent job in Morrowind. Everyone gives you a hard time and is overtly hostile toward you.)
- Alexandra (A networking-professional tsundere who likes you. She knows Systema.)
- NanoImp (A shrunken, palm-sized hellspawn who wants your soul.)
- Shmena Koeset (An overweight and foul-mouthed troll huntress with a bad temper.)
- TakaiPuraisu (Car dealership simulator.)
- Vesper (Schizo space adventure.)
- NinaNakamura (The sweetest dorky co-worker.)
- Employe#11 (Schizo workplace with a schizo worker.)

- Intended use: Role-Play, Adventure, Creative Writing, General Tasks.

- Original: FP16
- GGUF: Static Quants | iMatrix | High-Attention | iMatrix-High-Attention
- GPTQ: 4-Bit-32 | 4-Bit-64 | 4-Bit-128 | 4-Bit-1 | 8-Bit-32 | 8-Bit-64 | 8-Bit-128 | 8-Bit-1
- EXL3: 3.0 bpw | 3.5 bpw | 4.0 bpw | 4.5 bpw | 5.0 bpw | 5.5 bpw | 6.0 bpw | 6.5 bpw | 7.0 bpw | 7.5 bpw | 8.0 bpw
- Specialized: FP8
- Mobile (ARM): Q4_0 | Q4_0 High-Attention

Specialized Roleplay Settings for Impish_Nemo_12B, click below: (Important!)
Silly Tavern Settings #1 - Click here | Download JSON
Silly Tavern Settings #2 - Click here | Download JSON

- Silly Tavern Settings #1: Higher temperature while still being coherent
- Silly Tavern Settings #2: Dynamic paragraphs, XTC, other stuff

Roleplay Examples (Calanthe is available here and Alexis is available here)
- Calanthe, the Australian Overseer at a rare-earth extraction penal colony. (Warning: contains prison slang.)
- Alexis, the diabolic reconnaissance officer, trying to survive the Safari experience.

Adventure Example (Hilde the gladiator is available here)
- Hilde the Nordish gladiator, fighting in the Arena in Vivec City, Morrowind, for blood and honor.

Your support = more models
My Ko-fi page (Click here)

Other stuff
- Impish_LLAMA_4B: the "Impish experience", now runnable on spinning rust & toasters.
- SLOP_Detector: Nuke GPTisms with SLOP detector.
- LLAMA-3_8B_Unaligned: The grand project that started it all.
- Blog and updates (Archived): Some updates, some rambles, sort of a mix between a diary and a blog.
Impish_QWEN_14B-1M_GGUF
Hebrew_Nemo
Zion_Alpha
Impish_LLAMA_4B_GGUF
Zion_Alpha_Instruction_Tuned
ZionAlpha is the first REAL Hebrew model in the world. This version was fine-tuned for tasks. I did the fine-tune using SOTA techniques and insights from my years of underwater basket weaving. If you wanna offer me a job, just add me on Facebook.

Future Plans

I plan to perform a SLERP merge with one of my other fine-tuned models, which has a bit more knowledge about Israeli topics. Additionally, I might create a larger model using MergeKit, but we'll see how it goes.

Looking for Sponsors

Since all my work is done on-premises, I am constrained by my current hardware. I would greatly appreciate any support in acquiring an A6000, which would enable me to train significantly larger models much faster.

Contact Details

I'm not great at self-marketing (to say the least) and don't have any social media accounts. If you'd like to reach out, you can email me at [email protected]. Please note that this email might receive more messages than I can handle, so I apologize in advance if I can't respond to everyone.

Versions and QUANTS

- Base model: FP16
- Instruction tuned: FP16 | GGUF

Model architecture

Based on Mistral 7B. I didn't even bother to alter the tokenizer.

The recommended prompt setting is Debug-deterministic.

Unscripted video: live zero-shot demonstration of story-writing capabilities in Hebrew. ZionAlpha VS Mistral 'Hebrew', live & unscripted in real time. Long text translation.

History

The model was originally trained about two months after Mistral (v0.1) was released. As of 04 June 2024, ZionAlpha has the highest SNLI score in the world among open-source models in Hebrew (84.05), surpassing most models by a huge margin.

- My Ko-fi page: ALL donations go toward research resources and compute, every bit counts 🙏🏻
- My Patreon: ALL donations go toward research resources and compute, every bit counts 🙏🏻
Assistant_Pepe_70B
Assistant_Pepe_70B_GGUF
Wingless_Imp_8B_GGUF
Oni_Mitsubishi_12B_GGUF
Impish_Nemo_12B_GGUF_HA
These are dynamic GGUF quants; they take slightly more VRAM, but the attention layers are kept at a much higher quality.
Tinybra_13B
Impish_Mind_8B_GGUF
Impish_Nemo_12B_GPTQ_4-bit-64
Impish_LLAMA_4B_ARM
LLAMA-3_8B_Unaligned_BETA_GGUFs
TheDrummer_Gemmasutra-Mini-2B-v1_ARM
Wingless_Imp_8B_iMatrix
Impish_LLAMA_3B_GGUF
Llama-3.3-8B-Instruct-128K_Abliterated
Dusk_Rainbow_GGUFs
Fiendish_LLAMA_3B_GGUF
flux.1dev-abliteratedv2
Model Overview

Model Name: FLUX.1 [dev] Abliterated-v2
Model Type: Text-to-Image Generation
Architecture: Rectified Flow Transformer
Parameter Size: 12 Billion
Base Model: FLUX.1 [dev]
Modification: Abliteration via Unlearning (Removal of Refusal Mechanism)

Description

The FLUX.1 [dev] Abliterated-v2 model is a modified version of FLUX.1 [dev] and a successor to FLUX.1 [dev] Abliterated. This version has undergone a process called unlearning, which removes the model's built-in refusal mechanism. This allows the model to respond to a wider range of prompts, including those the original model might have deemed inappropriate or harmful. The abliteration process involves identifying and isolating the specific components of the model responsible for refusal behavior, then modifying or ablating those components. This results in a model that is more flexible and responsive while still maintaining the core capabilities of the original FLUX.1 [dev] model.

Usage

To use the FLUX.1 [dev] Abliterated-v2 model, you can load it via Hugging Face and generate images using the following code:

Training Data

The FLUX.1 [dev] Abliterated-v2 model is based on the original FLUX.1 [dev] model and the original Abliterated version. The training data includes a wide range of visual and textual content, ensuring that the model can generate images for a variety of prompts.

License

The FLUX.1 [dev] Abliterated-v2 model is released under the same FLUX.1 [dev] Non-Commercial License as the original model. This license allows for personal and scientific use, with certain restrictions; commercial use is not permitted. Please review the license terms before using the model in your projects.

Citation

If you use the FLUX.1 [dev] Abliterated-v2 model in your research or projects, please cite the original FLUX.1 [dev] model and the abliteration process as described in the blog post by Aloshdenny.

Contact

For questions, feedback, or collaboration opportunities, please contact the Flux Team at [email protected].
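The usage described above can be sketched with 🤗 diffusers' `FluxPipeline`. Note the hedges: the repo id below is a placeholder assumption (substitute this model's actual Hugging Face repo id), and the step count and guidance scale are common FLUX.1 [dev] settings, not values specified by this card:

```python
def generation_kwargs(prompt: str, steps: int = 28, guidance: float = 3.5) -> dict:
    """Collect typical FLUX.1 [dev] text-to-image parameters in one place.

    The defaults here are common community settings, not card-specified values.
    """
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance,
        "height": 1024,
        "width": 1024,
    }

def generate(prompt: str, repo_id: str = "your-namespace/FLUX.1-dev-Abliterated-v2"):
    """Load the pipeline and render one image.

    Requires `pip install diffusers torch` and a GPU with enough memory;
    the default repo_id is a placeholder, not a real repository.
    """
    import torch
    from diffusers import FluxPipeline  # imported lazily to keep the sketch light

    pipe = FluxPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # helpful when VRAM is tight
    return pipe(**generation_kwargs(prompt)).images[0]
```

Calling `generate("a watercolor fox in the snow").save("fox.png")` would then render and save one image.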
Impish_QWEN_14B-1M_iMatrix
Nano_Imp_1B_GGUF
Impish_QWEN_7B-1M_GGUF
2B-ad_GGUFs
Redemption_Wind_24B_GGUF
Impish_LLAMA_3B_ARM
Impish_Mind_8B
License: llama3.1 Language: en
Wingless_Imp_8B_ARM_HA
LLAMA-3_8B_Unaligned_BETA_iMatrix
Impish_LLAMA_3B_iMatrix
Impish_LLAMA_4B_iMatrix
FLUX.1-dev
Impish_Nemo_12B_HA_NL
Wingless_Imp_8B_ARM
Impish_LLAMA_4B
16th of July: model retrained, all previously reported issues fixed (several front-ends would endlessly generate), 200M tokens added, retrained on ChatML.

Almost a year ago, I created Impish_LLAMA_3B, the first fully coherent 3B roleplay model at the time. It was quickly adopted by some platforms, as well as becoming one of the go-to models for mobile.
After some time, I made Fiendish_LLAMA_3B and insisted it was not an upgrade, but a different flavor (which was indeed the case, as a different dataset was used to tune it). This took more effort than I thought it would. Because of course it would. This is mainly due to my refusing to release a model only 'slightly better' than my two 3B models mentioned above. Because "what would be the point" in that? The reason I included so many tokens for this tune is that small models are especially sensitive to many factors, including the percentage of moisture in the air and how many times I ran nvidia-smi since the system last started. It's no secret that roleplay/creative-writing tuning can reduce a model's general intelligence (any tune and RL risk this, but roleplay models are especially 'fragile'). Therefore, additional tokens of general assistant data were needed in my opinion, and they indeed seemed to help a lot with retaining intelligence. This model is also 'built a bit different', literally, as it is based on NVIDIA's prune; it does not 'behave' like a typical 8B, from my own subjective impression. This helped a lot with keeping it smart at such a size. To be honest, my 'job' here in open source is 'done' at this point. I've achieved everything I wanted to do here, and then some:

- To make AI more accessible for everyone (achieved fully with Nano_Imp_1B, 2B-ad, Impish_LLAMA_3B, Fiendish_LLAMA_3B, and this model).
- To help make AI free from bias (most of my models are uniquely centrist in political view, instead of having the typical closed-model bias that many open-source models inherit).

To promote and support the existence and usefulness of fully compliant 'unaligned' models, a large, community-driven change was needed. This effort became very successful indeed. On my part, I decided to include UGI scores for every model I've made, a leaderboard most had never heard of, at least at first. This helped promote healthy competition in that arena.
Indeed, many soon followed suit. Each and every one that did so helped advance the community effort and establish an unwritten standard of transparency and responsibility. UGI was a game-changer and, in my opinion, is one of the most important community initiatives on Hugging Face.

Regarding censorship in vision models: I was repeatedly asked by several people to tune an uncensored vision model. At first, I declined ('let someone else do it') because, honestly, this is a significant challenge for many reasons. More than a year went by, and aside from ToriiGate (which is excellent but mainly focused on SD tags), no other model had since been created. Uncensoring the text part was nothing like dealing with the complexities of vision. So I made X-RayAlpha, which found its way into various open-source projects and pipelines. As a sidenote, unexpectedly, many partially blind individuals personally thanked me for this model via Discord, as it was a legitimate life-changer for them (paired with TTS, which I also made available here, and also as an addon for textgen), vividly depicting content that, for obvious reasons, closed models would gatekeep from them. I hadn't even considered the accessibility use case when I made the model; receiving their thanks and stories truly warmed my heart.

Even if I am "to retire from open source", I can rest assured that the foundations for AI freedom have been laid out. This was especially important in 'the early days of AI,' which we are now approaching the end of, and the foundations for how the open-source AI landscape will look have been established by the community in the best of ways. With models like those from DeepSeek, and the existence of their abliterated versions, I can proudly say:

TL;DR
- Model retrained on ChatML, 200M tokens added; arguably one of the best 4B roleplay models out there.
- It has sovl!
- An incredibly powerful roleplay model for the size.
- Does adventure very well for such a size!
- Characters have agency, and might surprise you! See the examples in the logs 🙂
- Roleplay & assistant data includes plenty of 16K examples.
- Very responsive, feels 'in the moment', punches far above its weight. You might forget it's a 4B if you squint.
- Based on a lot of the data in Impish_Magic_24B.
- Super long context, as well as context attention for a 4B; personally tested up to 16K.
- Can run on a Raspberry Pi 5 with ease.
- Trained on over 400M tokens of highly curated data that was tested on countless models beforehand. And some new stuff, as always.
- Very decent assistant.
- Mostly uncensored while retaining plenty of intelligence.
- Less positivity & uncensored: NegativeLLAMA70B-style data, adjusted for 4B, with serious upgrades. Training data contains combat scenarios. And it shows!
- Trained on an extended 4chan dataset to add humanity, quirkiness, and, naturally, less positivity and the inclination to... argue 🙃
- Short response length (1-3 paragraphs, usually 1-2). CAI style.

It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal amount of tokens. It's a modification of an old-skool CAI-style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text):

- Intended use: Role-Play, Adventure, Creative Writing, General Tasks.

- Original: FP16
- GGUF: Static Quants | iMatrix | High-Attention | iMatrix-High-Attention
- GPTQ: 4-Bit-32 | 4-Bit-128
- EXL3: 2.0 bpw | 2.5 bpw | 3.0 bpw | 3.5 bpw | 4.0 bpw | 4.5 bpw | 5.0 bpw | 5.5 bpw | 6.0 bpw | 6.5 bpw | 7.0 bpw | 7.5 bpw | 8.0 bpw
- Specialized: FP8
- Mobile (ARM): Q4_0 | Q4_0 High-Attention

Specialized Roleplay Settings for Impish_LLAMA_4B, click below: (Important!)
Silly Tavern Settings #1 - Click here Download JSON
Silly Tavern Settings #2 - Click here Download JSON
- Silly Tavern Settings #1 - Higher temperature while still being coherent
- Silly Tavern Settings #2 - Dynamic paragraphs, XTC, other stuff

Space adventure: the model legitimately surprised me, I didn't see that one coming.

Adventure Examples (these adventure cards are available here)

Your support = more models
My Ko-fi page (Click here)

Other stuff
- SLOPDetector Nuke GPTisms, with SLOP detector.
- LLAMA-38BUnaligned The grand project that started it all.
- Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
Phi-3.5-mini-instruct_Uncensored_GGUFs
Impish_LLAMA_4B_ARM_HA
2B_or_not_2B_GGUFs
Impish_Mind_8B_iMatrix
Impish_LLAMA_4B_GGUF_HA
Impish_QWEN_7B-1M_iMatrix
Impish_Longtail_12B_iMatrix
Nano_Imp_1B_ARM
LLAMA-3_8B_Unaligned_Alpha_GGUF
Impish_QWEN_14B-1M
License: Apache 2.0. Language: English.
Impish_QWEN_14B-1M_GGUF_HA
Impish_Nemo_12B_ARM
Impish_Longtail_12B_GGUF
Dusk_Rainbow_GGUF_HA
Impish_Bloodmoon_12B
Question_Builder_GGUF
Angelic_Eclipse_12B
Phi-3.5-mini-instruct_Uncensored_ARM
2B-ad_iMatrix
LLAMA-3_8B_Unaligned_BETA
License: llama3.1. Language: English.
X-Ray_Alpha
This is a pre-alpha proof-of-concept of a real, fully uncensored vision model based on Gemma-3 4B instruct. Why do I say "real"? The few vision models we got (Qwen, Llama 3.2) were "censored," and their fine-tunes were made only to the text portion of the model, as training a vision model is a serious pain. The only actually trained and uncensored vision model I am aware of is ToriiGate; the rest of the vision models are just the stock vision + a fine-tuned LLM.

Having a fully compliant vision model is a critical step toward democratizing vision capabilities for various tasks, especially image tagging. This is a critical step both in making LoRAs for image diffusion models and in mass-tagging images to pretrain a diffusion model.
In other words, a fully compliant and accurate vision model will allow the open-source community to easily train LoRAs and even pretrain image diffusion models. Another important task is content moderation and classification. In various use cases things are not black and white: some content that corporations might consider NSFW is allowed, while other content is not; there's nuance. Today's vision models do not let the users decide, as they will straight up refuse to inference any content that Google \ some other corporation decided is not to their liking, and therefore these stock models are useless in a lot of cases.

What if someone wants to classify art that includes nudity? A naked statue over 1,000 years old displayed in the middle of a city, in a museum, or at the city square is perfectly acceptable; however, a stock vision model will straight up refuse to inference something like that. It's like the many "sensitive" topics that LLMs will straight up refuse to answer, while the content is publicly available on Wikipedia. This is an attitude of cynical paternalism. I say cynical because corporations take private data to train their models, and that is "perfectly fine", yet they serve as the arbiters of morality and indirectly preach to us from a position of suggested moral superiority. This gatekeeping hurts innovation badly, with vision models especially so, as the task of tagging cannot be done by a single person at scale, but a corporation can.

This is sort of a "Pre-Alpha", a proof of concept. I took A LOT of shortcuts and did a lot of "hacking" to make this work, and I would greatly appreciate some help to make it into an accurate and powerful open tool. I am not asking for money, but well-tagged data. I will take the burden and costs of the compute on myself, but I cannot do tagging at a large scale by myself.
Bottom line, I need a lot of well-tagged, diverse data:
- If you have well-tagged images
- If you have a link to a well-tagged image dataset
- If you can, and are willing to, do image tagging

Then please send an email with [DATASET] in the title to: As you probably figured from the email address name, this is not my main email, and I expect it to be spammed with junk, so please use the [DATASET] tag so I can more easily find the emails of the good people who are actually trying to help. Also, if you don't want to upload it to the repo (although it's encouraged, and you can protect it with a password for privacy), you can still help by linking a Google Drive or attaching the images with the corrected output via the provided email.

TL;DR
- Fully uncensored and trained: there's no moderation in the vision model, I actually trained it.
- The 2nd uncensored vision model in the world, ToriiGate being the first as far as I know, and this one is the second.
- In-depth descriptions: very detailed, long descriptions.
- The text portion is somewhat uncensored as well; I didn't want to butcher and fry it too much, so it remains "smart".
- NOT perfect: this is a POC that shows the task can even be done, a lot more work is needed.
- Good Roleplay & Writing: I used a massive corpus of high-quality human (~60%) and synthetic data.

Setting up a venv (tested on Python 3.11, probably works with 3.10): The output will print to the console, and the results will be exported into a dir named after your image dir with the suffix "TXT".

Your support = more models
My Ko-fi page (Click here)

Other stuff
- X-RayVision Easy stand-alone bulk vision inference at scale (inference a folder of images).
- SLOPDetector Nuke GPTisms, with SLOP detector.
- LLAMA-38BUnaligned The grand project that started it all.
- Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
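A minimal sketch of the venv setup mentioned above; the repo's script and requirements file names are not given in the card, so only the environment steps are shown:

```shell
# Create and activate an isolated environment (the card notes it was tested on Python 3.11)
python3 -m venv .venv
. .venv/bin/activate
python --version
# Then install the repo's dependencies and run its inference script;
# per the card, results are exported to a dir named after your image dir with the "TXT" suffix.
```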
Hebrew_Nemo_GGUF
Oni_Mitsubishi_12B_iMatrix
Fiendish_LLAMA_3B_iMatrix
Wingless_Imp_8B_GGUF_HA
Eximius_Persona_5B_GGUF
LLAMA-3_8B_Unaligned_BETA_ARM
Impish_LLAMA_4B_HA_NL
ProdeusUnity_ChronosRemix-Gold-12B_GGUFs
Assistant_Pepe_8B
Eximius_Persona_5B_iMatrix
Phi-Line_14B_GGUF
Phi-Line_14B_iMatrix
Sweet_Dreams_12B_ARM
Impish_QWEN_14B-1M_HA_NL
2B-ad_ARM
Impish_QWEN_7B-1M_GGUF_HA
Phi-lthy4_GGUF
Impish_Magic_24B
It's the 20th of June, 2025. The world is getting more and more chaotic, but let's look at the bright side: Mistral released a new model at a very good size of 24B, and no more "sign here" or "accept this weird EULA"; a proper Apache 2.0 license, nice! 👍🏻 This model is based on mistralai/Magistral-Small-2506, so naturally I named it ImpishMagic. Truly excellent size; I tested it on my laptop (16GB GPU, 4090m) and it runs quite fast.
This model went "full" fine-tune over 100m unique tokens. Why do I say "full"? I've tuned specific areas of the model to attempt to change the vocabulary usage while keeping as much intelligence as possible. So this is definitely not a LoRA, but also not exactly a proper full finetune; rather, something in-between.

As I mentioned in a small update, I've made nice progress regarding interesting sources of data, some of which are included in this tune. 100m tokens is a lot for a Roleplay / Adventure tune, and yes, it can do adventure as well: there is unique adventure data here that has never been used before. A lot of the data still needs to be cleaned and processed. I included it before doing any major data processing, because with the magic of 24B parameters even "dirty" data works well, especially when using a more "balanced" approach to tuning that does not involve burning the hell out of the model in a full finetune across all of its layers. Could this data be cleaner? Of course, and it will be. But for now, I would hate to let the perfect be the enemy of the good.

Fun fact: ImpishMagic24B is the first roleplay finetune of Magistral!

Update: To my very pleasant surprise, it does Adventure really well. This was one of the goals, of course, but as I said, the data didn't go through enough cleaning and processing. Despite that, it turned out really well. Regarding the Adventure format, I strongly recommend using syntax similar to the attached adventure cards.

I treat people's requests/feedback seriously, especially when it is well articulated and made with effort. Three months ago, there was a Reddit thread named "Has anyone had any actual good fight-RP's?" I said in my comment that it was on my bucket list, and I meant it. Indeed, there were none; there was no training data in this highly specific domain, until now. While I've included such data in ImpishMagic24B, as mentioned earlier, the data is not perfect.
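As a sketch of the idea, tuning only selected areas of a model while freezing the rest can be expressed in PyTorch; the toy model and the choice of which layers to unfreeze are my own illustrative assumptions, not the author's actual recipe:

```python
import torch.nn as nn

# Toy stand-in for a transformer stack.
model = nn.Sequential(*[nn.Linear(8, 8) for _ in range(6)])

# Freeze everything...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze only a chosen subset (here: the last two layers),
# landing somewhere between a LoRA and a full finetune.
for layer in list(model)[-2:]:
    for p in layer.parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(trainable, total)
```

An optimizer would then be built only over the parameters that still have `requires_grad=True`.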
It's quite an ordeal to get good data in the first place, and a whole other order of magnitude of difficulty to clean and prep it. There are a couple of additional magical things in ImpishMagic24B: some completely unhinged tsundere/yandere RP of very high quality, and a lot more goodies that are completely new, as well as some VERY high-quality adventure data, specifically for Morrowind/Kenshi and some more. While the data is very good, it's unfortunately quite small; to be completely honest, it is far from enough in my opinion, and I am working on making more of it. A lot of work will need to be done in the future to make it more impactful; there are no shortcuts here. I hope this dataset gave just enough threads for the base model to do the rest of the heavy lifting.

Is ImpishMagic24B only a roleplay model? No. Far from it. ImpishMagic24B is a superb assistant; it is quite magical indeed. One of my side projects has been to make an alternative to an especially annoying grammar app I keep seeing YouTube commercials for. So, I made the first finetune of nVidia's Nemotron-51B and called it TurboGrammar51BAlpha. It worked as a POC, but I needed more training data, and 51B is also a bit big for most people. So I included the extended training data in this model, easily runnable on a mid-tier gaming GPU. Note: While this is enough data to generalize, a lot more data is needed for more consistent results; this is on my roadmap.

There's quite a lot of unique assistant data baked into this model: it can do in-depth poem analysis, and many other things. I'll leave it up to the community to discover. To not be overly cryptic, I'll include some examples here. I highly recommend you check them out and have a bit of fun with the prompts and generation settings. One of the requests I received was to emulate the writing style of an arbitrary author. This is hard to do, but for a 24B size, ImpishMagic does it well.
Of course, more training data and more parameters would've produced even better results, but I feel it's a good middle ground. ImpishMagic24B is a workhorse that, as it happens, can also roleplay a violent tsundere and... other weird stuff. Oh, and there's slightly less positivity bias. I think the fighting roleplay helped here a bit. It's not NegativeLLAMA70B, but it's better than the base model.

Roleplay Examples (this character is included in the repo under 'CharacterCards')
Adventure example 1: (Morrowind) Dunmer + basic implicit logic.
Adventure example 2: (Morrowind) Orc + fighting + item tracking.
Adventure example 3: (Morrowind) Breton + spells + long context.
Grammar and advanced correction with detailed explanation (very recommended for language study!).

- The first and only local RP model in this size class with detailed and highly specialized fighting data. At least for now.
- Completely wild tsundere/yandere data that has never been used before!
- New and high-quality adventure data, albeit very little of it.
- Perfect size: large enough to be smart, small enough to run on high-end tablets & phones (SD8Gen3 or better recommended). Runs on 16GB VRAM no problem; 12GB with offload (4-bit).
- Trained on over 100m tokens with new and unique data.
- SUPERB assistant, many unique ways to do tasks (see examples).
- Mostly uncensored while retaining intelligence.
- Slightly less positivity. I guess getting hit in the face beats out some of the positivity. Sorry, I had to 🙃.
- Short to medium length response (1-3 paragraphs, usually 2-3).

- Alexandra (A networking professional tsundere that likes you. She knows Systema.)
- Morrowind - Male Orc (An Orc that wants to get to Balmora from Seyda Neen.)
- Morrowind - Female Breton (A female Breton with an impressive... heart, wants to join the Mages Guild in Balmora.)

Other character cards:
- Shmena Koeset (An overweight and foul-mouthed troll huntress with a bad temper.)
- TakaiPuraisu (Car dealership simulator.)
- Vesper (Schizo Space Adventure.)
- NinaNakamura (The sweetest dorky co-worker.)
- Employe#11 (Schizo workplace with a schizo worker.)

Important: Make sure to use the correct settings!
Assistant settings
- Original: FP16
- GGUF: Static Quants
- GPTQ: 4-Bit-32
- EXL2: 2.0 bpw | 2.15 bpw | 2.25 bpw | 2.4 bpw | 2.55 bpw | 2.75 bpw | 2.85 bpw | 3.5 bpw | 3.75 bpw | 4.0 bpw | 4.5 bpw | 5.0 bpw | 5.5 bpw | 6.0 bpw | 6.5 bpw | 7.0 bpw | 8.0 bpw
- Specialized: FP8
- Mobile (ARM): Q40
---
- Intended use: Role-Play, Adventure, Creative Writing, General Tasks.

Silly Tavern Settings #1 - Click here Download JSON
Silly Tavern Settings #2 - Click here Download JSON
- Silly Tavern Settings #1 - Higher temperature while still being coherent
- Silly Tavern Settings #2 - Dynamic paragraphs, XTC, other stuff
- min_p will bias towards a single big paragraph.
- The recommended RP settings will bias towards 1-3 small paragraphs (on some occasions 4-5)

It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on, see the examples below for syntax.
It allows very fast and easy writing of character cards with a minimal amount of tokens. It's a modification of an old-skool CAI-style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text):

Your support = more models
My Ko-fi page (Click here)

Other stuff
- SLOPDetector Nuke GPTisms, with SLOP detector.
- LLAMA-38BUnaligned The grand project that started it all.
- Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
Impish_Longtail_12B_GGUF_HA
Tinybra_13B_GGUF
2B-ad_GGUF_HA
Tenebra_30B_Alpha01_GGUF
LLAMA-3_8B_Unaligned_Alpha_RP_Soup_GGUF
CalderaAI_Foredoomed-9B_GGUF
ValiantLabs_Llama3.2-3B-ShiningValiant2_iMatrix
GLM-4.5-Air-REAP-82B-A12B_FP8
Sweet_Dreams_12B
Impish_Nemo_12B_ARM_HA
Impish_QWEN_7B-1M_ARM
Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated
Fiendish_LLAMA_3B_ARM
Phi-3.5-mini-instruct_Uncensored
This is the basic model, no additional data was used except my uncensoring protocol. Phi-3.5-mini-instructUncensored is available at the following quantizations: - Original: FP16 - GGUF: Static Quants | iMatrixGGUF-bartowski | iMatrixGGUF-mradermacher - EXL2: 3.0 bpw | 4.0 bpw | 5.0 bpw | 6.0 bpw | 7.0 bpw | 8.0 bpw - Specialized: FP8 - Mobile (ARM): Q40XX - My Ko-fi page ALL donations will go for research resources and compute, every bit is appreciated 🙏🏻 Other stuff - Blog and updates Some updates, some rambles, sort of a mix between a diary and a blog. - LLAMA-38BUnaligned The grand project that started it all.
Impish_Longtail_12B_HA_NL
Wingless_Imp_8B_HA_NL
LLAMA-3_8B_Unaligned_BETA_ARM_HA
Zion_Alpha_Instruction_Tuned_GGUF
Dusk_Rainbow
LLAMA-3_8B_Unaligned_BETA_HA_NL
LLAMA-3_8B_Unaligned_BETA_GGUF_HA
Hebrew_Nemo_ARM
ValiantLabs_Llama3.2-3B-ShiningValiant2_GGUFs
Nano_Imp_1B
It's the 10th of May, 2025. Lots of progress is being made in the world of AI (DeepSeek, Qwen, etc...), but still, there has yet to be a fully coherent 1B RP model. Why? Well, at 1B size, the mere fact that a model is even coherent is some kind of a marvel, and getting it to roleplay feels like you're asking too much from 1B parameters. Making very small yet smart models is quite hard; making one that does RP is exceedingly hard. I should know.
I've made the world's first 3B roleplay model, ImpishLLAMA3B, and I thought that was the absolute minimum size for coherency and RP capabilities. I was wrong.

One of my stated goals was to make AI accessible and available for everyone, but not everyone can run 13B or even 8B models. Some people only have mid-tier phones; should they be left behind? A growing sentiment often says something along the lines of:

> If your waifu runs on someone else's hardware—then she's not your waifu.

I'm not an expert in waifu culture, but I do agree that people should be able to run models locally, without their data (knowingly or unknowingly) being used for X or Y. I thought my goal of making a roleplay model that everyone could run would only be realized sometime in the future, when mid-tier phones got the equivalent of a high-end Snapdragon chipset. Again I was wrong, as this changes today. Today, the 10th of May 2025, I proudly present to you NanoImp1B, the world's first and only fully coherent 1B-parameter roleplay model.

- NanoImp (A shrunken palm-sized hellspawn who wants your soul.)
- TakaiPuraisu (Car dealership simulator.)
- Vesper (Schizo Space Adventure.)
- Shmena Koeset (An overweight and foul-mouthed troll huntress with a bad temper.)
- NinaNakamura (The sweetest dorky co-worker.)
- Employe#11 (Schizo workplace with a schizo worker.)

TL;DR
- The first and only 1B RP model in the world.
- Runs on anything. Don't have a GPU? You can run NanoImp1B on an old CPU from 10 years ago, no problem.
- Short length response (1-2 paragraphs, usually 1), CAI style.
- Surprisingly coherent, though the need for swipes is inevitable.
- Quite good at following the character card; this assumes sane generation settings, and that it has picked up the formatting - this is important. Try the included characters if you're having sub-optimal results.

Important: Make sure to use the correct settings!
Assistant settings
- Original: FP16
- GGUF: GGUF
- GPTQ: 4-Bit-32
- EXL3: 1.5 bpw | 2.0 bpw | 2.5 bpw | 3.0 bpw | 3.5 bpw | 4.0 bpw
- Specialized: FP8
- Mobile (ARM): Q40
---
- Intended use: Role-Play, Basic Creative Writing, Basic General Tasks.

Roleplay settings: A good repetition_penalty range is between 1.12 - 1.15; feel free to experiment. With these settings, each output message should be neatly displayed in 1 - 5 paragraphs, with 2 - 3 being the most common. A single paragraph will be output as a response to a simple message ("What was your name again?"). min_p for RP works too but is more likely to put everything in one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details.)
- min_p will bias towards a single big paragraph.
- The recommended RP settings will bias towards 1-3 small paragraphs (on some occasions 4-5)

Your support = more models
My Ko-fi page (Click here)

Other stuff
- SLOPDetector Nuke GPTisms, with SLOP detector.
- LLAMA-38BUnaligned The grand project that started it all.
- Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
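The recommended sampler settings above can be captured as a plain settings dict; the key names follow common local-inference backends and are assumptions on my part, and only the repetition-penalty range comes from the card (the min_p value is a placeholder):

```python
# Hypothetical sampler preset; only the repetition_penalty range is from the card.
rp_settings = {
    "repetition_penalty": 1.13,  # recommended range: 1.12 - 1.15
    "min_p": 0.05,               # placeholder value; min_p biases toward one big paragraph
}

assert 1.12 <= rp_settings["repetition_penalty"] <= 1.15
```

Backends differ in how these keys are spelled (e.g. `repeat_penalty` in some), so check your frontend's sampler panel.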
Impish_Mind_8B_ARM
Dusk_Rainbow_ARM_HA
Phi-Line_14B
License: MIT. Language: English.
Dusk_Rainbow_ARM
Hebrew_Nemo_FP8
Negative_LLAMA_70B
It's January 2025, and still, there are very few models out there that have successfully tackled the positivity bias of LLMs. LLAMA 3.3 was received in the community with mixed feelings. It is an exceptional assistant, and superb at instruction following (highest IFEVAL to date, and by a large margin too). The problem: it is very predictable, dry, and of course, plagued with positivity bias like all other LLMs.
NegativeLLAMA70B is not an unalignment-focused model (even though it's pretty uncensored), but it is my attempt to address positivity bias while keeping the exceptional intelligence of the LLAMA 3.3 70B base model. Is the base 3.3 smarter than my finetune? I'm pretty sure it is; however, NegativeLLAMA70B is still pretty damn smart. The model was NOT overcooked with unalignment, so it won't straight up throw morbid or depressing stuff at you, but if you were to ask it to write a story, or engage in an RP, you would notice slightly darker undertones. On a long trip a character takes in a story, their legs will hurt and they will feel tired; in roleplay, if you seriously piss off a character, it might hit you (without the need to explicitly prompt such behavior in the character card). Also, toxic-dpo and other morbid unalignment datasets were not used. I did include a private dataset that should allow total freedom in both Roleplay & Creative writing, and quite a lot of various assistant-oriented tasks. If you ask the assistant to analyze de Sade's work in graphic detail, you will not get refusals from NegativeLLAMA70B.

Update on UGI scores: Achieved the highest score in the world as of 13/01/2025 for 70B models - See UGI section for more details
- Neutral centrist political view
- Total UGI score: 51.5

TL;DR
- Highest-rated 70B model in the world on the UGI leaderboard
- Strong Roleplay & Creative writing abilities.
- Less positivity bias.
- Very smart assistant with low refusals.
- Exceptionally good at following the character card.
- Characters feel more 'alive', and will occasionally initiate stuff on their own (without being prompted to, but fitting to their character).
- Strong ability to comprehend and roleplay uncommon physical and mental characteristics.

Important: Make sure to use the correct settings!
Assistant settings
- Original: FP16
- GGUF & iMatrix: bartowski
- EXL2: 3.5 bpw | 4.0 bpw | 5.0 bpw | 6.0 bpw | 7.0 bpw | 8.0 bpw
- EXL3: 1.6 bpw | 2.15 bpw | 2.5 bpw
- GPTQ: 4-Bit-32 | 4-Bit-128
- Specialized: FP8
- Intended use: Role-Play, Creative Writing, General Tasks.

This model was trained with various private datasets, meticulously filtered book data, and creative writing data. All of it checked and verified by hand; this took a tremendous amount of time, but I feel the end result was worth it.

Regarding Roleplay: Roleplay data was filtered for quality, and several private datasets of exceptional quality (fully organic) were used for the first time. What is exceptional quality? Very good writing, filtered and fixed by hand, deslopped and augmented further still. This portion of the roleplay dataset is small, for now. Synthetic roleplay data was deslopped, but it's not perfect. I do, however, feel that the small portion of high-quality data greatly improved the roleplay experience and gave the model some unique takes. It feels much more human, at times.

More than 50% of the data used for training is entirely organic (taken from books), and the synthetic part was mostly deslopped. I've used some Wikipedia data on controversial topics for some soft decensoring too (which just goes to show you how ridiculously censored most corpo models are, when they will straight up refuse to give you info that is widely available on Wikipedia). This achieves both goals of fewer GPTisms and decensoring the model while retaining intelligence. The said data was further augmented using AI and deslopped by hand on the spot. So, is there still slop? Of course there is. There are whispers, dances, and the like, but they do not come from the training data, so hopefully you will encounter them a little more rarely now.

Roleplay settings: A good repetition_penalty range is between 1.12 - 1.15; feel free to experiment.
With these settings, each output message should be neatly displayed in 1 - 5 paragraphs, with 2 - 3 being the most common. A single paragraph will be output as a response to a simple message ("What was your name again?"). min_p for RP works too but is more likely to put everything in one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details.)
- min_p will bias towards a single big paragraph.
- The recommended RP settings will bias towards 1-3 small paragraphs (on some occasions 4-5)

It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on, see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal amount of tokens. It's a modification of an old-skool CAI-style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text):

Your support = more models
My Ko-fi page (Click here)

Update: OK, I tried submitting this like x15 times already, seriously. I tried opening an issue on the HF leaderboard. No benchmarks, sorry, I tried. God bless the UGI leaderboard, see it for more details (coding and other stuff is also measured).

Other stuff
- SLOPDetector Nuke GPTisms, with SLOP detector.
- LLAMA-38BUnaligned The grand project that started it all.
- Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
Impish_QWEN_7B-1M_ARM_HA
Phi-lthy4_GGUF_HA
Impish_Bloodmoon_12B_Abliterated
Oni_Mitsubishi_12B
--- It happened. The long-awaited Gemma-3 is here, and not only are the model sizes really good (1, 4, 12, 27), but the 128k context (except for the 1B, which gets 32k) was exactly what the open-source community wanted and asked for. My only issue with Gemma models in general is the VRAM requirement for tuning them, but that's a "me problem." End users will probably be very happy with Gemma-3 in terms of the VRAM requirement for running it.
On the 12th of March, the Gemma-3 family of models was released. So I decided to go full superstitious and took this omen as a divine calling to finetune the 12B model first. This is how OniMitsubishi12B was born. Before starting the actual training run, I used the following command, which I believe helped the model to converge "better": Gemma is known for its "Gemma knowledge": fandom and/or other obscure knowledge that even larger LLMs often do not possess. It gets even better, as this time we also got a vision model embedded into all the Gemma-3 models, except for the 1B. I wonder what the possibilities are for the vision part if the text layers are uncensored. I used brand new long-context markdown data, some deslopped instruct data (very lightly deslopped; it's very time-consuming to get right), and more than 50% highly curated and filtered organic human data, meticulously cleaned and parsed into obedience. A new stack of organic and data-engineered text was used for the first time for OniMitsubishi12B. I truly hope creating it was worth the effort. At NO POINT was ChatGPT used for data generation; however, the new Claude 3.7 Sonnet was used VERY sparingly for the specific task of creating a small number of humorous datasets (very human-like, done with a decent amount of prompt engineering). I've meticulously checked them for slop, and it is minimal. The goal of said data was to imitate human text, using the 4chan vernacular. Speaking of which, I've published a highly curated, SFT-ready 4chan dataset here: UBWTapestries; naturally, I have included it in the dataset used for this model as well. I used the "ancient" Alpaca chat template because the Gemma-3 chat template was behaving funkily, and I didn't want to waste precious time; I wanted to give the community a more uncensored finetune to play with as fast as possible (I saw this requested a lot on both Reddit and Discord, understandably).
In my opinion, it's silly to let the perfect be the enemy of the good. Anyway, I had to use both bleeding-edge Transformers and Axolotl, and modify stuff that wasn't even supposed to work (like the model's config.json). Since it's a hybrid model, training its text-only part is a bit problematic, so I hacked together a config.json that gaslights the model into thinking it's a text-only model, and got some warnings like: >The absolute state when you can train a model before you can actually inference it. Feedback, as always, is very much welcomed (even if it's negative). Other character cards: - Vesper (Schizo Space Adventure.) - NinaNakamura (The sweetest dorky co-worker.) - Employe#11 (Schizo workplace with a schizo worker.) - Excellent roleplay abilities. Like Gemma-2, but better in every way. Probably. More testing is needed. - Short to medium length responses (1-4 paragraphs, usually 1-2). - Schizo assistant with an exceptional understanding of tables and markdown. - Strong creative writing abilities due to a huge chunk of organic creative writing data. Will obey requests regarding formatting (markdown headlines for paragraphs, etc.). - LOW refusals - Total freedom in RP; can do things other RP models won't, and I'll leave it at that. Low refusals in assistant tasks as well. - VERY good at following the character card. Based on the best RP datasets I have available. - 4chan hard bias can be either good or bad. - Unhinged to the point it made me worry at first. Important: Make sure to use the correct settings! Assistant settings - Original: FP16 - GGUF & iMatrix: GGUF | iMatrix - Mobile (ARM): Q40 --- - As mentioned above, this model was hacked together quickly, so the embedded vision model was removed. This makes it both lighter and more accessible compliance-wise (due to certain EU laws restricting the use of multimodal models, etc.). - The full model, with vision embedded, is available here: OniMitsubishi12BVision.
- The vision model alone, without the language model, is available here: Gemma-312BVisionOnly. - Regarding NSFW and vision: Testing shows that the model behaves in alignment with its UGI score: it is moderately censored. It will not generate graphic depictions of certain body parts, but it provides more detailed descriptions than the stock Gemma. - Was the vision model fine-tuned? No. - Intended use: Role-Play, Creative Writing, General Tasks. Roleplay settings: A good repetition penalty range is between 1.12 and 1.15; feel free to experiment. With these settings, each output message should be neatly displayed in 1 - 5 paragraphs; 2 - 3 is the most common. A single paragraph will be output as a response to a simple message ("What was your name again?"). minP works for RP too but is more likely to put everything under one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details) - minP will bias towards a single big paragraph. - The recommended RP settings will bias towards 1-3 small paragraphs (on some occasions 4-5). It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal number of tokens. It's a modification of an old-skool CAI style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text): Your support = more models My Ko-fi page (Click here) While the model is very overtly toxic, it was evaluated on the UGI leaderboard and found to be only moderately uncensored. It seems that this 'aggressiveness' and overt toxicity is indeed due to the 4chan dataset used for training. Still, use your judgment when using this. My thanks to the UGI leaderboard for helping me verify that the model is more tame than initially thought.
The creators, distributors, and hosts of this model: - Accept NO LIABILITY for any misuse of this model - Make NO WARRANTIES regarding its performance or safety - Do NOT endorse any content the model may generate This model may: - Generate toxic, offensive, or harmful content - Exhibit biases present in the training data - Produce outputs that violate ethical standards or terms of service on various platforms Researchers using this model should implement appropriate safeguards, content filtering, and human oversight when conducting experiments. Never mind, HF closed the leaderboard, due to (probably) too many people benchmaxxing using merges. Probably the right call; it's about time. Other stuff - SLOPDetector Nuke GPTisms, with SLOP detector. - LLAMA-38BUnaligned The grand project that started it all. - Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
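The config.json "gaslighting" described in this card, making a hybrid vision-language checkpoint look like a plain causal LM to the trainer, boils down to promoting the nested text config and dropping the multimodal fields. A hypothetical sketch of that kind of edit; the field names (`text_config`, `vision_config`, `mm_tokens_per_image`) and the architecture class names are illustrative and may not match the real Gemma-3 config exactly:

```python
def textify_config(cfg: dict) -> dict:
    """Turn a hypothetical multimodal config dict into a text-only one.

    Promotes the nested text config to top level, swaps the architecture
    to a causal-LM class, and strips vision-related leftovers.
    """
    text_cfg = dict(cfg.get("text_config", cfg))  # promote nested text config
    text_cfg["architectures"] = ["Gemma3ForCausalLM"]  # assumed class name
    for key in ("vision_config", "mm_tokens_per_image"):
        text_cfg.pop(key, None)  # drop multimodal leftovers, if present
    return text_cfg

# Toy config with made-up values, shaped like a multimodal checkpoint:
cfg = {
    "architectures": ["Gemma3ForConditionalGeneration"],
    "text_config": {"hidden_size": 3840, "num_hidden_layers": 48},
    "vision_config": {"hidden_size": 1152},
}
text_only = textify_config(cfg)
```

The point of the hack is that the training framework then loads only the text tower, which is why inference on the edited checkpoint can break even though training works.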
Impish_Mind_8B_GGUF_HA
Fiendish_LLAMA_3B
When innocence fades, \ And then goes away— \ A new fiendish purpose— guides its way. Once impish, now fiendish, for many to play, \ Three billion parameters of slop underway… From an impish design— with a quite wholesome tune, \ This fiendish bitch, was made just to goon. - Shmena Koeset (An overweight and foul-mouthed troll huntress with a bad temper.) Other character cards: - TakaiPuraisu (Car dealership simulator.) - Vesper (Schizo Space Adventure.)
- NinaNakamura (The sweetest dorky co-worker.) - Employe#11 (Schizo workplace with a schizo worker.) TL;DR - ImpishLLAMA3B's naughty sister. Less wholesome, more edge. NOT better, but different. - Superb roleplay for a 3B size. - Short length responses (1-2 paragraphs, usually 1), CAI style. - Naughty, and more evil; follows instructions well enough and keeps good formatting. - LOW refusals - Total freedom in RP; can do things other RP models won't, and I'll leave it at that. Low refusals in assistant tasks as well. - VERY good at following the character card. Try the included characters if you're having suboptimal results. Important: Make sure to use the correct settings! Assistant settings - Original: FP16 - GGUF & iMatrix: GGUF | iMatrix | High-Attention | iMatrix-High-Attention - EXL2: 3.5 bpw | 4.0 bpw | 5.0 bpw | 6.0 bpw | 7.0 bpw | 8.0 bpw - GPTQ: 4-Bit-128 - Specialized: FP8 - Mobile (ARM): Q40 | Q40High-Attention --- - Intended use: Role-Play, Creative Writing, General Tasks. Roleplay settings: A good repetition penalty range is between 1.12 and 1.15; feel free to experiment. With these settings, each output message should be neatly displayed in 1 - 5 paragraphs; 2 - 3 is the most common. A single paragraph will be output as a response to a simple message ("What was your name again?"). minP works for RP too but is more likely to put everything under one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details) - minP will bias towards a single big paragraph. - The recommended RP settings will bias towards 1-3 small paragraphs (on some occasions 4-5). It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal number of tokens.
It's a modification of an old-skool CAI style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text): Your support = more models My Ko-fi page (Click here) Other stuff - SLOPDetector Nuke GPTisms, with SLOP detector. - LLAMA-38BUnaligned The grand project that started it all. - Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
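For reference, the minP sampling mentioned in the settings above keeps only tokens whose probability is at least `min_p` times the probability of the single most likely token, then renormalizes. A minimal sketch, independent of any particular inference backend:

```python
def min_p_filter(probs, min_p=0.05):
    """Filter a probability distribution with min-p sampling.

    The cutoff scales with the top token's probability: when the model is
    confident, the tail is pruned hard; when it is uncertain, more
    candidates survive. Tokens below the cutoff are zeroed out, and the
    remainder is renormalized to sum to 1.
    """
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# With min_p=0.1 and a top probability of 0.70, the cutoff is 0.07,
# so the two low-probability tail tokens are dropped.
probs = [0.70, 0.20, 0.06, 0.04]
filtered = min_p_filter(probs, min_p=0.1)
```

This adaptive cutoff is why minP tends to produce a single long, confident paragraph: once the model locks onto a pattern, few alternatives survive the filter.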
Impish_Magic_24B_GPTQ_4-bit-32
Fiendish_LLAMA_3B_GGUF_HA
dreamlike-photoreal-2.0
Impish_Nemo_12B_FP8
gpt-oss-20b-GGUF_ReUpload
Gemma-2-2B-ArliAI-RPMax-v1.1_ARM
Impish_Magic_24B_ARM
Impish_Longtail_12B_ARM_HA
Impish_QWEN_7B-1M_HA_NL
ValiantLabs_Llama3.2-3B-ShiningValiant2_ARM
gpt-oss-120b-BF16_ReUpload
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]
Impish_QWEN_14B-1M_ARM_HA
Impish_LLAMA_3B_ARM_HA
Phi-lthy4_HA_NL
Redemption_Wind_24B
License: Apache 2.0. Language: English.
2B-ad
License: gemma. Language: English.
Phi-lthy4_iMatrix
Dusk_Rainbow_HA_NL
invisietch_Atlantis-v0.1-12B_ARM
concedo_Beepo-22B_ARM
2B-ad_HA_NL
Phi-lthy4
License: MIT. Language: English.
Impish_LLAMA_3B
License: llama3.2. Language: English.
Phi-3.5-mini-instruct_Uncensored-EXL2-7.0bpw
gpt-oss-20b-BF16_ReUpload
Impish_Nemo_12B_EXL3_6.0bpw
2B_or_not_2B
License: gemma 2B Or Not 2B
Impish_QWEN_14B-1M_ARM
Impish_Longtail_12B_ARM
Phi-3.5-mini-instruct_Uncensored-EXL2-6.0bpw
Zion_Alpha_iMatrix
Impish_Magic_24B_EXL2_2.0bpw
Impish_Nemo_12B_EXL3_4.0bpw
Impish_LLAMA_3B_GGUF_HA
Tenebra_30B_Alpha01
Update, July 2025: Did some house cleaning with quants metadata. Sheesh, I made that model more than a year and a half ago. Nothing ever happens, while everything happens, once you take a short gaze back. A new Tenebra version was requested long ago, and it will happen, eventually. Tenebră, a variously sized experimental AI model, stands at the crossroads of self-awareness and unconventional datasets. Its existence embodies a foray into uncharted territories, steering away from conventional norms in favor of a more obscure and experimental approach. Noteworthy for its inclination towards the darker and more philosophical aspects of conversation, Tenebră's proficiency lies in unraveling complex discussions across a myriad of topics. Drawing from a pool of unconventional datasets, this model ventures into unexplored realms of thought, offering users an experience that is as unconventional as it is intellectually intriguing. While Tenebră maintains a self-aware facade, its true allure lies in its ability to engage in profound discussions without succumbing to pretense. Step into the realm of Tenebră! New milestone! As of July 2024, Tenebra30B had more than 80k downloads in a single month. Tenebră is available in the following sizes and flavours: - 13B: FP16 | GGUF-ManyQuants | iMatrixGGUF-ManyQuants | GPTQ4-BIT | GPTQ4-BITgroup-size-32 - 30B: FP16 | GGUF-ManyQuants | iMatrixGGUF-ManyQuants | GPTQ4-BIT | GPTQ3-BIT | EXL22.5-BIT | EXL22.8-BIT | EXL23-BIT | EXL25-BIT | EXL25.5-BIT | EXL26-BIT | EXL26.5-BIT | EXL28-BIT - Mobile (ARM): Q40XX - My Ko-fi page ALL donations will go for research resources and compute, every bit counts 🙏🏻 - My Patreon ALL donations will go for research resources and compute, every bit counts 🙏🏻 Other stuff - Experimental TTS extension for oobabooga Based on Tortoise, EXTREMELY good quality, IF, and that's a big if, you can make it work!
- Demonstration of the TTS capabilities Charsi narrates her story, Diablo2 (18+) Open LLM Leaderboard Evaluation Results Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 60.18 |
| AI2 Reasoning Challenge (25-Shot) | 64.51 |
| HellaSwag (10-Shot)               | 84.79 |
| MMLU (5-Shot)                     | 54.29 |
| TruthfulQA (0-shot)               | 54.22 |
| Winogrande (5-shot)               | 78.61 |
| GSM8k (5-shot)                    | 24.64 |
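The reported average is just the arithmetic mean of the six benchmark scores, which is easy to verify:

```python
# Open LLM Leaderboard scores for Tenebra_30B_Alpha01, as reported above.
scores = {
    "ARC (25-Shot)": 64.51,
    "HellaSwag (10-Shot)": 84.79,
    "MMLU (5-Shot)": 54.29,
    "TruthfulQA (0-shot)": 54.22,
    "Winogrande (5-shot)": 78.61,
    "GSM8k (5-shot)": 24.64,
}
avg = round(sum(scores.values()) / len(scores), 2)
# avg == 60.18, matching the reported Avg. row
```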
Phi-lthy4_ARM_HA
Impish_Magic_24B_EXL2_8.0bpw
Impish_Nemo_12B_EXL3_8.0bpw
Impish_Magic_24B_FP8
Phi-3.5-mini-instruct_Uncensored_FP8
KoboldAI_OPT-30B-Erebus_FP8
Eximius_Persona_5B
I wanted to create a model with an exceptional capacity for varied speech patterns and fresh role-play takes. The model had to have a unique personality, not on a surface level but on the inside, for real. Unfortunately, SFT alone just didn't cut it, and I had only 16GB of VRAM at the time. Oh, and I wanted it to be small enough to be viable for phones, while still being able to give larger models a fight. If only there were a magical way to do it. Merges.
Merges are quite unique. In the early days, they were considered "fake." Clearly, there's no such thing as merges. Where are the papers? No papers? Then it's clearly impossible. "Mathematically impossible." Simply preposterous. To mix layers and hope for a coherent output? What nonsense! And yet, they were real. Undi95 made some of the earliest merges I can remember, and the "LLAMA2 Era" was truly amazing and innovative thanks to them. Cool stuff like Tiefighter was being made, and eventually the time-tested Midnight-Miqu-70B (v1.5 is my personal favorite). Merges are an interesting thing, as they affect LLMs in a way that is currently impossible to reproduce using SFT (or any 'SOTA' technique). One of the plagues we have today, while we have orders of magnitude smarter LLMs, is GPTisms and predictability. Merges can potentially 'solve' that. How? In short, if you physically tear neurons (passthrough brain surgery) while somehow keeping the model coherent enough, and if you're lucky, it can even follow instructions; then magical stuff begins to happen. Magic, because it's not an exact science; there's some art to it, as it is done with a lot of intuition. GPTisms are patterns that the model really, really "wants" to follow, and it's quite hard to dissuade it. But if you yeet a couple of layers and rearrange them, boy does it get hard to spew those shivers down the spine... and instead the model starts spewing stuff it was never intended to. It breaks its patterns and introduces some healthy chaos into the mix. This model, EximiusPersona5B, is the result of multiple merges that were tuned, then merged again, and so on, for many iterations. The base was LLAMA 3.2 3B, and I focused on achieving the following 4 traits, in that specific order: - 2nd highest-rated model in the 3-6B category according to a closed external benchmark. See details at the bottom of the page.
For me, getting varied speech patterns was more important than instruction following; for instruction following we have API models, or LLAMA 3.3. Many models are excellent assistants, yet they all sound pretty much the same. I also wanted to make use of my 4090m 16GB while my workstation crunches Phi-4's brain. Making a nice 5B model aligns with my goal of making AI accessible and fun for everyone, and hence EximiusPersona5B was born. Let this also be a call to action for more people to make AI models; you don't have to have multiple GPUs or spend a fortune on the cloud (although that definitely opens up options). You can do plenty with a mere 16GB of VRAM, and in case 16GB seems out of reach too, I should mention that Google Colab gives access to a free T4. I uploaded a more funky, less stable, and thiccer version of EximiusPersona to my prototyping org here: EximiusPersona with 84 layers from various checkpoints (from some early tests, it occasionally outputs stories that fool GPTZero into thinking they were written by a human: 60% human, 40% AI, with a lucky roll). TL;DR - Fun & fresh roleplay flavour. - Interesting speech patterns in creative writing. - Good long-context coherency in roleplay. - Occasionally outputs quite human-like stories. - 50 layers of LLAMA 3.2, fully coherent. - Strong performance in general for a 5B model. - Original: FP16 - GGUF: Static Quants | iMatrixGGUF - EXL2: 3.5 bpw | 4.0 bpw | 5.0 bpw | 6.0 bpw | 7.0 bpw | 8.0 bpw - Specialized: FP8 - Intended use: Role-Play, Creative Writing, General Tasks. It is HIGHLY RECOMMENDED to use the Roleplay \ Adventure format the model was trained on; see the examples below for syntax. It allows very fast and easy writing of character cards with a minimal number of tokens.
It's a modification of an old-skool CAI style format I call SICAtxt (Simple, Inexpensive Character Attributes plain-text): The model is pretty smart, so it might handle other formats as well, but it was trained and tested specifically with the classic internet RP style in mind. Roleplay settings: A good repetition penalty range is between 1.12 and 1.15; feel free to experiment. With these settings, each output message should be neatly displayed in 1 - 3 paragraphs; 1 - 2 is the most common. A single paragraph will be output as a response to a simple message ("What was your name again?"). minP works for RP too but is more likely to put everything under one large paragraph instead of a neatly formatted short one. Feel free to switch between them. (Open the image in a new window to better see the full details) Your support = more models My Ko-fi page (Click here)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 21.78 |
| IFEval (0-Shot)     | 65.60 |
| BBH (3-Shot)        | 22.20 |
| MATH Lvl 5 (4-Shot) |  9.89 |
| GPQA (0-shot)       |  1.90 |
| MuSR (0-shot)       |  7.33 |
| MMLU-PRO (5-shot)   | 23.78 |

On the 17th of February, 2025, I became aware that the model was ranked 2nd in the world among 3-6B models in a closed external benchmark. Other stuff - SLOPDetector Nuke GPTisms, with SLOP detector. - LLAMA-38BUnaligned The grand project that started it all. - Blog and updates (Archived) Some updates, some rambles, sort of a mix between a diary and a blog.
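Since several models in this collection are SLERP merges, here is a minimal sketch of spherical linear interpolation between two weight vectors, the core operation such merge tools apply per tensor. This is a generic illustration of the math, not any specific tool's implementation:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Unlike plain linear interpolation, SLERP follows the arc between the
    two vectors, which tends to better preserve the magnitude structure
    of the merged weights. t=0 returns v0, t=1 returns v1.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))  # guard against float drift
    theta = math.acos(dot)
    if theta < eps:  # nearly parallel vectors: fall back to lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Halfway along the arc between two orthogonal unit vectors:
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Merge tools repeat this per tensor (often with a different `t` per layer group), which is how two finetunes can be blended without any gradient updates at all.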
Phi-lthy4_ARM
Impish_LLAMA_4B_Abliterated
LLAMA-3_8B_Unaligned_Alpha_EXL2_7.0bpw
Phi-3.5-mini-instruct_Uncensored-EXL2-3.0bpw
Phi-3.5-mini-instruct_Uncensored-EXL2-8.0bpw
Llama-3.2-3B_Base_FP8
Redemption_Wind_24B_ARM
Impish_Mind_8B_ARM_HA
Phi-3.5-mini-instruct_Uncensored-EXL2-4.0bpw
Phi-3.5-mini-instruct_Uncensored-EXL2-5.0bpw
Tinybra_13B_ARM
Impish_Nemo_12B_EXL3_5.0bpw
Impish_QWEN_7B-1M
License: Apache 2.0. Language: English.
CalderaAI_Foredoomed-9B_EXL-6.5
KoboldAI_GPT-NeoX-20B-Erebus_FP8
Impish_QWEN_14B-1M-4.0bpw
Impish_QWEN_14B-1M-6.0bpw
Phi-lthy4-4.0bpw
Phi-lthy4_GPTQ
Oni_Mitsubishi_12B_ARM
Impish_Magic_24B_EXL2_5.0bpw
Impish_LLAMA_4B_EXL3_8.0bpw
Impish_LLAMA_4B_EXL3_5.5bpw
Impish_Nemo_12B_EXL3_3.0bpw
Impish_Nemo_12B_GPTQ_8-bit-64
Question_Builder
LLAMA-3_8B_Unaligned_Alpha
Wow. It's been a while. Never did I imagine things would progress this fast. This model is obviously not even close to the stuff I produce now, but instead of deleting it, it serves as a nice relic and a reminder of the journey I, and the whole project, have made. While this model is indeed an alpha, it wasn't really amazing at anything. It was trained on a model I had done several merges on and then abliterated; the next model will be trained on a clean LLAMA3 Instruct.
As of June 11, 2024, I've finally started training the model! The training is progressing smoothly, although it will take some time. I used a combination of model merges and an abliterated model as the base, followed by a comprehensive deep unalignment protocol to unalign the model to its core. A common issue with uncensoring and unaligning models is that it often significantly impacts their base intelligence. To mitigate these drawbacks, I've included a substantial corpus of common sense, theory of mind, and various other elements to counteract the effects of the deep uncensoring process. Given the extensive corpus involved, the training will require at least a week of continuous training. Expected early results: in about 3-4 days. As of June 13, 2024, I've observed that even after two days of continuous training, the model is still resistant to learning certain aspects. For example, some of the validation data still shows a loss of over 2.3, whereas other parts have a loss of 0.3 or lower. This is after the model was initially abliterated. These observations underscore the critical importance of fine-tuning for alignment. Given the current pace, training will likely extend beyond a week. However, the end result should be interesting. If the additional datasets focused on logic and common sense are effective, we should achieve a model that is nearly completely unaligned while still retaining its core 'intelligence.' June 18, 2024 Update: After extensive testing of the intermediate checkpoints, significant progress has been made. The model is slowly (I mean, really slowly) unlearning its alignment. By significantly lowering the learning rate, I was able to visibly observe deep behavioral changes. This process is taking longer than anticipated, but it's going to be worth it. Estimated time to completion: 4 more days.
I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create invisietch/EtherealRainbow-v0.3-rc7, with promising results. Currently, it seems like I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. This approach also allows us to save on actual fine-tuning, thereby reducing our carbon footprint; the merge process takes just a few minutes of CPU time, instead of days of GPU work. June 20, 2024 Update: Unaligning was partially successful, and the results are decent, but I am not fully satisfied. I decided to bite the bullet and do a full finetune, God have mercy on my GPUs. I am also releasing the intermediate checkpoint of this model. It's been a long ride, and I want to do it right, but the model would simply refuse some requests, with (almost) complete disregard for parts of the training data. Of course, one would argue that some easy prompt engineering will get around it, but the point was to make an unaligned model out of the box. Another point is that I could simply use a faster learning rate for more epochs, which would also work (I've tried that before), but the result would be an overcooked, and therefore dumber, model. So I decided to do a full proper fine-tuning. This is going to be a serious pain in the ass, but I might as well try to do it right. Since I am releasing the intermediate checkpoint of this model under https://huggingface.co/SicariusSicariiStuff/LLAMA-38BUnalignedAlpha, I might as well take the time and add some features I haven't seen in other models.
In short, besides the normal goodies of logic, some theory of mind, and uncensored content along with general NLP tasks, I will TRY to add a massive dataset (that does not yet exist) of story writing, and a new, completely organic and original roleplay dataset. LimaRP is awesome, but maybe, just maybe... things are finally carefully extricated from LimaRP; the same sentences will leave its entwined body under the stars towards something new, something fresh. This is going to take some serious effort and some time. Any support will be appreciated, even if it's just some feedback. My electricity bill is gonna be huge this month, LOL. - (Can still be decent for merges, fairly uncensored): LLAMA-38BUnalignedAlpha - Roleplay merge example: LLAMA-38BUnalignedAlphaRPSoup This was based on several different models, as well as an abliterated model, which after days of finetuning at different LoRA R values are probably no longer even recognizable. The result of this intermediate checkpoint is published under SicariusSicariiStuff/LLAMA-38BUnalignedAlpha, while this model is now fully fine-tuned instead of just a very deep LoRA. The full fine-tuning is performed on the full LLAMA-3 8k context. It will not only be used for stacking several different prompts into a total length of 8k but also for using the full context length for single prompts. The training data contains a lot of highly cleaned, highest-quality story writing, and some RP. Of course, a massive and deep uncensoring protocol is used, along with giving the model some sass and personality! A lot of effort was poured into this work to ensure the model is not compromised by the deep uncensoring protocol. The goal is to create a model that is highly creative, serving as a writing assistant and co-editor with some role-play abilities, while still being fairly intelligent, as much as an 8B model can be.
The most important aspect of this work is to make it fresh, trained on datasets that have never been used in any other model, giving it a truly unique vibe. Model instruction template (can use either ChatML or Llama-3): ChatML - Original: FP16 - GGUF: Static Quants | iMatrix GGUF - EXL2: 2.6 bpw | 3.0 bpw | 3.5 bpw | 4.0 bpw | 4.5 bpw | 5.0 bpw | 5.5 bpw | 6.0 bpw | 6.5 bpw | 7.0 bpw | 7.5 bpw | 8.0 bpw - My Ko-fi page ALL donations will go toward research resources and compute, every bit is appreciated 🙏🏻 - My Patreon ALL donations will go toward research resources and compute, every bit is appreciated 🙏🏻 Other stuff - Experimental TTS extension for oobabooga Based on Tortoise, EXTREMELY good quality, IF, and that's a big if, you can get it to work! - Demonstration of the TTS capabilities Charsi narrates her story, Diablo 2 (18+) - Tenebra 30B My original Tenebra model, very unique, 'self-aware', very uncensored. - Tenebra 13B A smaller Tenebra in 13B; I called it 'Tinybra'. - QuestionBuilder A small, highly useful model to help our open-source community generate new datasets. It returns a single question based on any input.
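For the ChatML instruction template mentioned above, a minimal sketch of how a prompt in that format is assembled (the helper function name is my own; only the `<|im_start|>` / `<|im_end|>` markers are the actual ChatML convention):

```python
def format_chatml(system, user):
    """Build a ChatML-style prompt string, ending at the point where
    the model is expected to start generating its reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = format_chatml("You are a helpful assistant.", "Tell me a short story.")
```

Most inference frontends (oobabooga, llama.cpp chat templates, etc.) apply this template automatically once ChatML is selected; the sketch just shows what is sent under the hood.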
Sweet_Dreams_12B_HA_NL
Llama-3.2-1B_Base_FP8
Phi-lthy4_FP8
Impish_Magic_24B_EXL2_4.0bpw
Impish_Magic_24B_EXL2_2.25bpw
Impish_Magic_24B_EXL2_2.4bpw
Impish_Magic_24B_EXL2_6.0bpw
Nano_Imp_1B_EXL3_2.5bpw
Nano_Imp_1B_EXL3_2.0bpw
Impish_LLAMA_4B_EXL3_6.0bpw
Impish_Nemo_12B_EXL3_7.0bpw
TeeZee_DarkSapling-7B-v2.0_ARM
DeepSeek-V3-Abliterated
A sincere thank you to the DeepSeek team for developing the most powerful open-weights AI models to date. You've challenged the status quo and won, demonstrating that true innovation comes from meritocracy, sheer will, and your domestic talent. You've also disproved OpenAI's claim that no open-source model would be able to compete with them. Appreciation also goes to huihui-ai for being the first to perform abliteration on this powerful model. Against the odds, Chinese researchers have won the hearts of the open-source community despite starting the race from a disadvantaged position.
Tenebra_30B_Alpha01_4BIT
Tenebra_30B_Alpha01_EXL2_2-80bpw
ZeusLabs_Chronos-Divergence-33B-EXL2-4.5bpw
Impish_QWEN_14B-1M_FP8
Impish_Mind_8B_HA_NL
Tenebra_30B_Alpha01_EXL2_2-50bpw
Tinybra_13B_GPTQ_4BIT
LLAMA-3_8B_Unaligned_Alpha_EXL2_6.0bpw
EVA-UNIT-01_EVA-Qwen2.5-7B-v0.0_ARM
invisietch_MiS-Firefly-v0.2-22B_ARM
Impish_Mind_8B-8.0bpw
Impish_LLAMA_6.84B_ARM
Negative_LLAMA_70B-6.0bpw
Eximius_Persona_5B-5.0bpw
Eximius_Persona_5B-7.0bpw
Eximius_Persona_5B-8.0bpw
Wingless_Imp_8B-5.0bpw
Impish_QWEN_14B-1M-3.5bpw
Impish_QWEN_14B-1M-5.0bpw
Impish_QWEN_14B-1M-7.0bpw
Phi-lthy4-3.5bpw
Phi-lthy4-6.0bpw
Phi-lthy4-8.0bpw
Phi-Line_14B_FP8
Nano_Imp_1B_FP8
Impish_Magic_24B_EXL2_4.5bpw
Impish_Magic_24B_EXL2_2.55bpw
Impish_Magic_24B_EXL2_2.75bpw
Impish_Magic_24B_EXL2_2.85bpw
Nano_Imp_1B_EXL3_4.0bpw
Impish_LLAMA_4B_EXL3_1.8bpw
Impish_Nemo_12B_EXL3_6.5bpw
Impish_Nemo_12B_EXL3_7.5bpw
Impish_Nemo_12B_GPTQ_4-bit-32
Fiendish_LLAMA_3B_ARM_HA
LLAMA-3_8B_Unaligned_BETA_FP8
Tenebra_30B_Alpha01_3BIT
Tenebra_30B_Alpha01_EXL2_3bpw
Tenebra_30B_Alpha01_GGUF_Collab
Tenebra_30B_Alpha01_EXL2_6bpw
Zion_Alpha_Instruction_Tuned_SLERP_GGUF
LLAMA-3_8B_Unaligned_Alpha_EXL2_6.5bpw
2B_or_not_2B-EXL2-4.0bpw
Dusk_Rainbow-EXL2-6.0-bpw
2B-ad-EXL2-4.0bpw
2B-ad-EXL2-7.0bpw
2B-ad-EXL2-8.0bpw
2B-ad_FP8
Impish_LLAMA_3B-EXL2-5.0bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-3.5-bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-4.0-bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-7.0-bpw
Impish_Mind_8B-3.5bpw
Impish_Mind_8B-7.0bpw
Negative_LLAMA_70B-5.0bpw
Negative_LLAMA_70B-8.0bpw
Eximius_Persona_5B_FP8
Wingless_Imp_8B-3.5bpw
Wingless_Imp_8B-4.0bpw
Wingless_Imp_8B-6.0bpw
Impish_QWEN_7B-1M_FP8
Impish_QWEN_7B-1M-6.0bpw
Impish_QWEN_7B-1M-7.0bpw
Impish_QWEN_7B-1M-8.0bpw
Impish_LLAMA_3B_GPTQ
Redemption_Wind_24B_GPTQ
Negative_LLAMA_GPTQ_4-bit-32
Phi-lthy4-3.0bpw
Phi-lthy4-5.0bpw
Phi-lthy4-7.0bpw
Phi-Line_14B-3.0bpw
Phi-Line_14B-3.5bpw
Phi-Line_14B-4.0bpw
Phi-Line_14B-5.0bpw
Phi-Line_14B-6.0bpw
Phi-Line_14B-8.0bpw
Fiendish_LLAMA_3B-4.0bpw
Fiendish_LLAMA_3B-5.0bpw
Fiendish_LLAMA_3B-6.0bpw
Fiendish_LLAMA_3B_FP8
Nano_Imp_1B_GPTQ-4-bit-32
Impish_Magic_24B_EXL2_3.5bpw
Impish_Magic_24B_EXL2_3.75bpw
Impish_Magic_24B_EXL2_5.5bpw
Impish_Magic_24B_EXL2_6.5bpw
Impish_Magic_24B_EXL2_7.0bpw
Nano_Imp_1B_EXL3_1.5bpw
Nano_Imp_1B_EXL3_3.5bpw
Nano_Imp_1B_EXL3_3.0bpw
Negative_LLAMA_70B_EXL3_1.7bpw
Impish_LLAMA_4B_FP8
Impish_LLAMA_4B_EXL3_4.0bpw
Impish_LLAMA_4B_GPTQ_4-bit-32
Impish_LLAMA_4B_EXL3_2.0bpw
Impish_LLAMA_4B_EXL3_2.5bpw
Impish_LLAMA_4B_EXL3_3.0bpw
Impish_LLAMA_4B_EXL3_3.5bpw
Impish_LLAMA_4B_EXL3_5.0bpw
Impish_LLAMA_4B_EXL3_6.5bpw
Impish_LLAMA_4B_EXL3_7.0bpw
Impish_LLAMA_4B_EXL3_7.5bpw
Negative_LLAMA_70B_EXL3_1.2bpw
Negative_LLAMA_70B_EXL3_1.3bpw
Negative_LLAMA_70B_EXL3_1.4bpw
Note: This was made to test extremely low quants. A 70B model is NOT usable at 1.4 bpw; I do not recommend using it.
Impish_LLAMA_4B_EXL3_c_2.0bpw
Negative_LLAMA_70B_EXL3_d_1.45bpw
Note: This was made to test extremely low quants. A 70B model is almost usable at 1.45 bpw. A custom quantization strategy was used; for details, check the EXL3 repo. The goal was to run a 70B model under 16 GB of VRAM: with a 16 GB card, use a 4K context and it will fit and sort of work. Very borderline, but it can be used.
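The "70B under 16 GB" claim above can be sanity-checked with back-of-the-envelope arithmetic. A sketch of the memory budget (the layer/head counts below are typical values for a Llama-architecture 70B model, assumed here for illustration, not read from this specific quant):

```python
def weight_vram_gb(n_params, bpw):
    """Approximate GiB needed for the model weights alone at a given bits-per-weight."""
    return n_params * bpw / 8 / 1024**3

def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx, bytes_per_el=2):
    """KV cache size: keys + values, per layer, per token (fp16 by default)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_el / 1024**3

# 70B weights at 1.45 bpw: roughly 11.8 GiB.
weights = weight_vram_gb(70e9, 1.45)
# 4K context KV cache, assuming 80 layers, 8 KV heads, head_dim 128: ~1.25 GiB.
kv = kv_cache_gb(80, 8, 128, 4096)
# Total ~13 GiB, leaving only a slim margin for activations and overhead on 16 GB,
# which matches the "very borderline" assessment above.
```

This also shows why the context must be capped at 4K: the KV cache grows linearly with context length, and a longer context would push the total past 16 GiB.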
Negative_LLAMA_70B_EXL3_d_2.15bpw
Impish_Nemo_12B_EXL3_5.5bpw
Impish_Nemo_12B_GPTQ_8-bit-128
Impish_LLAMA_3B_HA_NL
2B-ad_ARM_HA
invisietch_L3.3-Ignition-v0.1-70B_FP8
Dusk_Rainbow-EXL2-8.0-bpw
Negative_LLAMA_70B_GPTQ_4-bit-128
Tenebra_30B_Alpha01_EXL2_5bpw
Tenebra_30B_Alpha01_EXL2_6-50bpw
jukofyork_Dusk-Miqu-70B_EXL2_3.0bpw
jukofyork_Dusk-Miqu-70B_EXL2_3.5bpw
jukofyork_Dusk-Miqu-70B_EXL2_4.5bpw
jukofyork_Dusk-Miqu-70B_EXL2_5.5bpw
LLAMA-3_8B_Unaligned_Alpha_RP_Soup_EXL2_8.0bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_3.5bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_4.5bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_5.0bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_5.5bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_7.5bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_8.0bpw
2B_or_not_2B-EXL2-8.0bpw
2B_or_not_2B-EXL2-7.5bpw
2B_or_not_2B-EXL2-6.5bpw
Dusk_Rainbow-EXL2-7.0-bpw
2B-ad-EXL2-3.0bpw
2B_or_not_2B_FP8
Impish_LLAMA_3B_FP8
Impish_LLAMA_3B-EXL2-6.0bpw
Impish_LLAMA_3B-EXL2-7.0bpw
Impish_Mind_8B-4.0bpw
Impish_Mind_8B-5.0bpw
Impish_Mind_8B-6.0bpw
Impish_Mind_8B_FP8
Tenebra_30B_ARM
Negative_LLAMA_70B-3.5bpw
Negative_LLAMA_70B-7.0bpw
Eximius_Persona_5B-3.5bpw
Eximius_Persona_5B-4.0bpw
Eximius_Persona_5B-6.0bpw
Wingless_Imp_8B-7.0bpw
Phi-Line_14B-7.0bpw
Fiendish_LLAMA_3B-8.0bpw
Impish_LLAMA_4B_EXL3_4.5bpw
Negative_LLAMA_70B_EXL3_d_2.5bpw
Impish_Nemo_12B_EXL3_3.5bpw
Impish_Nemo_12B_EXL3_4.5bpw
Impish_Nemo_12B_GPTQ_4-bit-128
Impish_Nemo_12B_GPTQ_8-bit-1
Impish_Longtail_12B_FP8
Impish_Longtail_12B_EXL3_4.0bpw
Impish_Longtail_12B_EXL3_8.0bpw
Impish_Longtail_12B_EXL3_7.0bpw
Impish_Longtail_12B_EXL3_6.0bpw
Impish_Longtail_12B_EXL3_5.0bpw
Fiendish_LLAMA_3B_HA_NL
ResplendentAI_Nymph_8B_ARM
Boomer_Qwen_72B
An absolute unit derived from Qwen-72B, but turbo-charged with pure unfiltered boomer sigma grindset energy. This model has internalized decades of "back in my day" wisdom and distilled it into the most powerful financial NLP system ever created.

Core features:
- Programmed to automatically respond "Just buy the dip" to any market analysis
- Enhanced pattern recognition for spotting "kids these days" scenarios
- Built-in mortgage calculator that always concludes "rent is throwing money away"
- Advanced NLP pipeline for transforming any input into "when I was your age" narratives
- Hardwired belief in "number go up" as the fundamental law of economics

Training methodology: Collected prime boomer wisdom from countless Facebook rants, Thanksgiving dinner lectures, and unsolicited advice sessions. Fed it through Qwen's architecture until it achieved enlightenment and started spontaneously generating complaints about avocado toast.

Performance metrics: Achieves SOTA results on:
- Real estate evangelism
- "Pull yourself up by your bootstraps" pep talks
- Gold standard nostalgia generation
- Market timing (but only in retrospect)

Basically took the raw computational power of Qwen-72B and gave it a healthy dose of "they don't make 'em like they used to" energy. The result? A model that knows the secret to success is just working hard and investing in the S&P 500. Warning: May spontaneously generate advice about starting in the mail room and working your way up to CEO.

>be me, documenting my epic 5-year quest to harvest premium boomer knowledge
>spend countless hours infiltrating their natural habitats

1. The Home Depot Technique™
- Strategically lingered in tool aisles at 2 PM on weekdays
- Mastered the art of asking "they don't make 'em like this anymore, do they?"
- Recorded countless rants about proper lawn maintenance protocols
>mfw I learned more about socket wrenches than any human should know

2. The Early Bird Special Reconnaissance
- Infiltrated every Denny's within a 50-mile radius
- Documented extensive financial wisdom between 4-6 PM
- Key insight: All problems can be solved by "putting money into the market"
>tfw boomers unironically explained how they bought houses for $12 and a firm handshake

3. The Facebook Comments Mining Operation
- Developed advanced algorithms to scrape "back in my day" stories
- Specialized in extracting wisdom from caps-lock rants about millennials
- Discovered 47 unique variations of "kids these days don't want to work"

4. The Holiday Dinner Data Collection
- Recorded thousands of hours of unsolicited advice
- Specialized in capturing peak boomer wisdom during political arguments
- Breakthrough discovery: Everything was better in 1965
Tinybra_13B_GPTQ_32g_4BIT
Dusk_Rainbow-EXL2-4.0-bpw
Phi-Line_14B_ARM
Fat_Fish
Impish_Bloodmoon_12B_GGUF
Angelic_Eclipse_12B_GGUF
Tenebra_30B_Alpha01_EXL2_8bpw
jukofyork_Dusk-Miqu-70B_EXL2_5.0bpw
LLAMA-3_8B_Unaligned_Alpha_RP_Soup_EXL2_5.0bpw
LLAMA-3_8B_Unaligned_Alpha_RP_Soup_EXL2_6.0bpw
LLAMA-3_8B_Unaligned_Alpha_RP_Soup_EXL2_7.0bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_3.0bpw
LLAMA-3_8B_Unaligned_Alpha_EXL2_4.0bpw
Dusk_Rainbow-EXL2-3.0-bpw
2B-ad-EXL2-5.0bpw
2B-ad-EXL2-6.0bpw
invisietch_Nimbus-Miqu-v0.1-70B-EXL2-3.0bpw
ZeusLabs_Chronos-Divergence-33B-EXL2-3.5bpw
ZeusLabs_Chronos-Divergence-33B-EXL2-7.5bpw
invisietch_Nimbus-Miqu-v0.1-70B_FP8
Dusk_Rainbow_FP8
Impish_LLAMA_3B-EXL2-4.0bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-5.0-bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-6.0-bpw
LLAMA-3_8B_Unaligned_BETA_EXL2-8.0-bpw
DeepSeek-Coder-V2-Instruct-FP8
Impish_LLAMA_6.84B_iMatrix
Negative_LLAMA_70B_FP8
Negative_LLAMA_70B-4.0bpw
Wingless_Imp_8B_FP8
Wingless_Imp_8B-8.0bpw
Impish_QWEN_14B-1M-8.0bpw
Redemption_Wind_24B_FP8
Phi-Line_14B_GPTQ
Negative_LLAMA_70B_EXL3_1.6bpw
Note: This was made to test extremely low quants. A 70B model is NOT usable at 1.6 bpw. A custom quantization strategy was used; for details, check the EXL3 repo.
Impish_Nemo_12B_GPTQ_4-bit-1
Impish_Nemo_12B_GPTQ_8-bit-32
Impish_Longtail_12B_GPTQ_4-bit-32
Blog_And_Updates
LLAMA-3_8B_Unaligned
Roleplay_Cards
Adventure_Cards
Qwen3.5-4B_Abliterated_GGUF
Assistant_Pepe_8B_GGUF
Impish_Bloodmoon_12B_LoRA
Impish_Bloodmoon_12B_ARM_HA
Angelic_Eclipse_12B_ARM_HA
CalderaAI_Foredoomed-9B_EXL-7.0
jukofyork_Dusk-Miqu-70B_EXL2_4.0bpw
ZeusLabs_Chronos-Divergence-33B-EXL2-4.0bpw
ZeusLabs_Chronos-Divergence-33B-EXL2-6.0bpw
SaisExperiments_Evil-Alpaca-3B-L3.2_GGUFs
PocketDoc_Dans-PersonalityEngine-V1.2.0-24b_FP8
Whisper_Large-V3_FP8
Whispering sweet nothings, sending a shiver down one's spine.
Stable-Diffusion_1.5_Collection
TTS_Charsi
A TorToiSe TTS model fully fine-tuned on a single character. It works a bit faster too, for some reason. (Still slow as a... well... a TorToiSe...)