Lewdiculous

164 models

L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix

My GGUF-IQ-Imatrix quants for Sao10K/L3-8B-Stheno-v3.2. Sao10K with Stheno again, another banger! I recommend checking his page for feedback and support.

> [!IMPORTANT]
> Quantization process:
> For future reference, these quants were done after the fixes from #6920 were merged.
> Imatrix data was generated from the FP16-GGUF and conversions were done directly from the BF16-GGUF.
> This was a bit more disk and compute intensive but hopefully avoided any losses during conversion.
> If you notice any issues, let me know in the discussions.

> [!NOTE]
> General usage:
> Use the latest version of KoboldCpp.
> For 8GB VRAM GPUs, I recommend the Q4KM-imat (4.89 BPW) quant for context sizes up to 12288.
>
> Presets:
> Some compatible SillyTavern presets can be found here (Virt's Roleplay Presets).
> Check discussions such as this one for other recommendations and samplers.

> [!TIP]
> Personal support:
> I apologize for disrupting your experience.
> Currently I'm working on moving to a better internet provider.
> If you want and you are able to...
> You can spare some change over here (Ko-fi).
>
> Author support:
> You can support the author at their own page.

Click here for the original model card information.

Support me here if you're interested: Ko-fi: https://ko-fi.com/sao10k

wink Euryale v2?

I have done a test run with multiple variations of the models, merged back to their base at various weights, with different training runs too, and this sixth iteration is the one I like most.

Changes compared to v3.1:
- Included a mix of SFW and NSFW storywriting data, thanks to Gryphe.
- Included more instruct/assistant-style data.
- Further cleaned up roleplaying samples from c2 logs -> a few terrible, really bad samples escaped heavy filtering; a manual pass fixed it.
- Hyperparameter tinkering for training, resulting in lower loss levels.

Testing notes - compared to v3.1:
- Handles SFW / NSFW separately better. Not as overly excessive with NSFW now. Kinda balanced.
- Better at storywriting/narration.
- Better at assistant-type tasks.
- Better multi-turn coherency -> reduced issues?
- Slightly less creative? A worthy tradeoff. Still creative.
- Better prompt/instruction adherence.
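As a rule of thumb, the "4.89 BPW for 8GB VRAM" recommendation above can be sanity-checked with simple arithmetic: a quant's file size is roughly parameters times bits-per-weight divided by eight. This is only a rough sketch; actual VRAM use also depends on context length, KV cache, and runtime overhead.

```python
def gguf_file_size_gb(n_params_billion: float, bpw: float) -> float:
    """Rough GGUF file size: parameters * bits-per-weight / 8 bits-per-byte, in GB."""
    return n_params_billion * 1e9 * bpw / 8 / 1e9

# An 8B model at Q4_K_M's ~4.89 bits per weight:
size = gguf_file_size_gb(8.0, 4.89)
print(f"{size:.2f} GB")  # 4.89 GB, leaving headroom for context on an 8GB card
```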

NaNK
llama3
18,490
202

MN-12B-Lyra-v4-GGUF-IQ-Imatrix

NaNK
license:cc-by-nc-4.0
8,678
38

Captain-Eris_Violet-V0.420-12B-GGUF-ARM-Imatrix

NaNK
8,216
31

Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix

> [!IMPORTANT]
> Updated!
> Version (v2) files added! With imatrix data generated from the FP16 and conversions done directly from the BF16.
> This is more disk and compute intensive, so let's hope we get GPU inference support for BF16 models in llama.cpp.
> Hopefully this avoids any losses in the model conversion, as has been the recently discussed topic for Llama-3 and GGUF lately.
> If you are able to test them and notice any issues, let me know in the discussions.

> [!IMPORTANT]
> Relevant:
> These quants have been done after the fixes from llama.cpp/pull/6920 were merged.
> Use KoboldCpp version 1.64 or higher; make sure you're up to date.

> [!TIP]
> I apologize for disrupting your experience.
> My upload speeds have been cooked and unstable lately.
> If you want and you are able to...
> You can support my various endeavors here (Ko-fi).

GGUF-IQ-Imatrix quants for NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS.

Author: "This model received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request."

> [!WARNING]
> Compatible SillyTavern presets here (simple) or here (Virt's Roleplay Presets - recommended).
> Use the latest version of KoboldCpp. Use the provided presets for testing.
> Feedback and support for the authors is always welcome.
> If there are any issues or questions, let me know.

> [!NOTE]
> For 8GB VRAM GPUs, I recommend the Q4KM-imat (4.89 BPW) quant for context sizes up to 12288.

Llama3 trained on our RP datasets; we tried to have a balance between the ERP and the RP, not too horny, but just enough. We also added some non-RP data, making the model less dumb overall. It should look like a 40%/60% ratio for Non-RP/RP+ERP data. This model includes the new Luminae dataset from Ikari. This model has received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request. If you consider trying this model, please give us some feedback, either on the Community tab on HF or on our Discord server.

This repo contains FP16 files of Lumimaid-8B-v0.1-OAS.

Training data used:
- Aesir datasets
- NoRobots
- limarp - 8k ctx
- toxic-dpo-v0.1-sharegpt
- ToxicQAFinal
- Luminae-i1 (70B/70B-alt) (i2 did not exist when the 70B started training) | Luminae-i2 (8B) (this one gave better results on the 8B)
- Ikari's Dataset
- Squish42/bluemoon-fandom-1-1-rp-cleaned - 50% (randomly)
- NobodyExistsOnTheInternet/PIPPAsharegptv2test - 5% (randomly)
- cgato/SlimOrcaDedupCleaned - 5% (randomly)
- Airoboros (reduced)
- Capybara (reduced)

Initial LumiMaid 8B finetune:
- Undi95/Llama-3-Unholy-8B-e4
- Undi95/Llama-3-LewdPlay-8B

IkariDev: Visit my retro/neocities-style website please kek
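The Orthogonal Activation Steering (ablation) treatment mentioned above works, at its core, by removing the component of activations or weights along a learned "refusal direction". A minimal toy sketch of just the projection step follows; the vectors here are random stand-ins, whereas real refusal directions are estimated from contrastive prompt pairs.

```python
import numpy as np

def ablate_direction(w: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Remove the component of w along direction d: w' = w - (w·d̂)d̂."""
    d_hat = d / np.linalg.norm(d)
    return w - np.dot(w, d_hat) * d_hat

rng = np.random.default_rng(0)
w = rng.normal(size=8)  # a stand-in activation vector
d = rng.normal(size=8)  # a stand-in "refusal direction"
w_ablated = ablate_direction(w, d)

# After ablation, the vector has (numerically) zero projection onto d.
print(abs(np.dot(w_ablated, d / np.linalg.norm(d))))  # ~0.0, within float precision
```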

NaNK
llama3
5,329
76

CaptainErisNebula-12B-Chimera-v1.1-GGUF-IQ-Imatrix

> [!TIP]
> # GGUF quants for Nitral-AI/CaptainErisNebula-12B-Chimera-v1.1's recipe.

> [!IMPORTANT]
> Author-recommended initial SillyTavern presets:
> - Chimera: v1.1
> - (backup)(633788596ef1810cc9ea9abd375bbda3e568bd37)

> [!NOTE]
> ## This is an improvement on the previous experimental version.
> - Not "chaotic", and at a usable size for most people seeking to perform inference locally at good speeds.
> - The model does not show excessive alignment, so it should be good for most scenarios/writing situations.
> - Feel free to use some light system prompting to nudge it out of a blocker if needed.
> - It does well in adhering to characters and instructions.

Thank you so much, "crazy chef" and "mad scientist", Nitral!

NaNK
4,593
12

Kunoichi-DPO-v2-7B-GGUF-Imatrix

> [!TIP]
> Support:
> My upload speeds have been cooked and unstable lately.
> Realistically I'd need to move to get a better provider.
> If you want and you are able to...
> You can support my various endeavors here (Ko-fi).
> I apologize for disrupting your experience.

GGUF-Imatrix quantizations for SanjiWatsuki/Kunoichi-DPO-v2-7B.

"Imatrix" stands for Importance Matrix, a technique used to improve the quality of quantized models. The Imatrix is calculated from calibration data, and it helps determine the importance of different model activations during the quantization process. The idea is to preserve the most important information during quantization, which can help reduce the loss of model performance. One of the benefits of using an Imatrix is that it can lead to better model performance, especially when the calibration data is diverse. More information: [[1]](https://github.com/ggerganov/llama.cpp/discussions/5006) [[2]](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)

If you want any specific quantization to be added, feel free to ask.

`Base ⇢ GGUF(F16) ⇢ Imatrix-Data(F16) ⇢ GGUF(Imatrix-Quants)`

For the imatrix data, `imatrix-Kunoichi-DPO-v2-7B-F16.dat` was used.
| Model | MT Bench | EQ Bench | MMLU | Logic Test |
|---|---|---|---|---|
| GPT-4-Turbo | 9.32 | - | - | - |
| GPT-4 | 8.99 | 62.52 | 86.4 | 0.86 |
| Kunoichi-DPO-v2-7B | 8.51 | 42.18 | 64.94 | 0.58 |
| Mixtral-8x7B-Instruct | 8.30 | 44.81 | 70.6 | 0.75 |
| Kunoichi-DPO-7B | 8.29 | 41.60 | 64.83 | 0.59 |
| Kunoichi-7B | 8.14 | 44.32 | 64.9 | 0.58 |
| Starling-7B | 8.09 | - | 63.9 | 0.51 |
| Claude-2 | 8.06 | 52.14 | 78.5 | - |
| Silicon-Maid-7B | 7.96 | 40.44 | 64.7 | 0.54 |
| Loyal-Macaroni-Maid-7B | 7.95 | 38.66 | 64.9 | 0.57 |
| GPT-3.5-Turbo | 7.94 | 50.28 | 70 | 0.57 |
| Claude-1 | 7.9 | - | 77 | - |
| Openchat-3.5 | 7.81 | 37.08 | 64.3 | 0.39 |
| Dolphin-2.6-DPO | 7.74 | 42.88 | 61.9 | 0.53 |
| Zephyr-7B-beta | 7.34 | 38.71 | 61.4 | 0.30 |
| Llama-2-70b-chat-hf | 6.86 | 51.56 | 63 | - |
| Neural-chat-7b-v3-1 | 6.84 | 43.61 | 62.4 | 0.30 |

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| Kunoichi-DPO-7B | 58.4 | 45.08 | 74 | 66.99 | 47.52 |
| Kunoichi-DPO-v2-7B | 58.31 | 44.85 | 75.05 | 65.69 | 47.65 |
| Kunoichi-7B | 57.54 | 44.99 | 74.86 | 63.72 | 46.58 |
| OpenPipe/mistral-ft-optimized-1218 | 56.85 | 44.74 | 75.6 | 59.89 | 47.17 |
| Silicon-Maid-7B | 56.45 | 44.74 | 74.26 | 61.5 | 45.32 |
| mlabonne/NeuralHermes-2.5-Mistral-7B | 53.51 | 43.67 | 73.24 | 55.37 | 41.76 |
| teknium/OpenHermes-2.5-Mistral-7B | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
| openchat/openchat3.5 | 51.34 | 42.67 | 72.92 | 47.27 | 42.51 |
| berkeley-nest/Starling-LM-7B-alpha | 51.16 | 42.06 | 72.72 | 47.33 | 42.53 |
| HuggingFaceH4/zephyr-7b-beta | 50.99 | 37.33 | 71.83 | 55.1 | 39.7 |

| Model | AlpacaEval2 | Length |
|---|---|---|
| GPT-4 | 23.58% | 1365 |
| GPT-4 0314 | 22.07% | 1371 |
| Mistral Medium | 21.86% | 1500 |
| Mixtral 8x7B v0.1 | 18.26% | 1465 |
| Kunoichi-DPO-v2 | 17.19% | 1785 |
| Claude 2 | 17.19% | 1069 |
| Claude | 16.99% | 1082 |
| Gemini Pro | 16.85% | 1315 |
| GPT-4 0613 | 15.76% | 1140 |
| Claude 2.1 | 15.73% | 1096 |
| Mistral 7B v0.2 | 14.72% | 1676 |
| GPT 3.5 Turbo 0613 | 14.13% | 1328 |
| LLaMA2 Chat 70B | 13.87% | 1790 |
| LMCocktail-10.7B-v1 | 13.15% | 1203 |
| WizardLM 13B V1.1 | 11.23% | 1525 |
| Zephyr 7B Beta | 10.99% | 1444 |
| OpenHermes-2.5-Mistral (7B) | 10.34% | 1107 |
| GPT 3.5 Turbo 0301 | 9.62% | 827 |
| Kunoichi-7B | 9.38% | 1492 |
| GPT 3.5 Turbo 1106 | 9.18% | 796 |
| GPT-3.5 | 8.56% | 1018 |
| Phi-2 DPO | 7.76% | 1687 |
| LLaMA2 Chat 13B | 7.70% | 1513 |
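The importance-matrix idea described above can be illustrated with a toy scalar quantizer: when fitting a quantization grid, weighting the squared error by per-weight importance preserves the weights that matter most to the model's outputs. This is only a conceptual sketch of the principle, not llama.cpp's actual algorithm.

```python
import numpy as np

def best_scale_with_importance(w, importance, n_levels=16):
    """Pick the grid scale that minimizes the importance-weighted squared error."""
    best_scale, best_err = None, np.inf
    for scale in np.linspace(0.01, 1.0, 200):
        q = np.round(w / scale).clip(-n_levels // 2, n_levels // 2 - 1) * scale
        err = np.sum(importance * (w - q) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

rng = np.random.default_rng(1)
w = rng.normal(size=64)
uniform = np.ones_like(w)
# Pretend calibration data found the first 8 weights far more important.
imp = uniform.copy()
imp[:8] = 100.0

s_uniform = best_scale_with_importance(w, uniform)
s_imat = best_scale_with_importance(w, imp)
print(s_uniform, s_imat)  # the weighted fit shifts to protect the key weights
```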

NaNK
license:cc-by-nc-4.0
4,000
47

Nyanade_Stunna-Maid-7B-v0.2-GGUF-IQ-Imatrix

> [!TIP]
> Support:
> My upload speeds have been cooked and unstable lately.
> Realistically I'd need to move to get a better provider.
> If you want and you are able to...
> You can support my various endeavors here (Ko-fi).
> I apologize for disrupting your experience.

#Roleplay #Multimodal #Vision #Based #Unhinged #Unaligned

In this repository you can find GGUF-IQ-Imatrix quants for ChaoticNeutrals/Nyanade_Stunna-Maid-7B-v0.2. If needed, you can get some basic SillyTavern presets here. If you have issues with repetitiveness or a lack of variety in responses, I recommend changing the Temperature to 1.15, MinP to 0.075, RepPen to 1.15, and RepPenRange to 1024.

> [!TIP]
> Vision:
> This is a #multimodal model that also has optional #vision capabilities. Expand the relevant sections below and read the full card information if you also want to make use of that functionality.
>
> Quant options:
> Reading below you can also find quant option recommendations for some common GPU VRAM capacities.

"Unhinged RP with the spice of the previous 0.420 remixes, 32k context and vision capabilities."

⇲ Click here to expand/hide general common recommendations.

Assuming a context size of 8192 for simplicity and 1GB of operating-system VRAM overhead, with some safety margin to avoid overflowing buffers...

For 11-12GB VRAM: A GPU with 11-12GB of VRAM can comfortably use the Q6K-imat quant option and run it at good speeds. This is the same with or without using #vision capabilities.

For 8GB VRAM: If not using #vision, the Q5KM-imat quant option will fit comfortably and should run at good speeds. If you are also using #vision from this model, opt for the Q4KM-imat quant option to avoid filling the buffers and potential slowdowns.

For 6GB VRAM: If not using #vision, the IQ3M-imat quant option should fit comfortably and run at good speeds. If you are also using #vision from this model, opt for the IQ3XXS-imat quant option.

⇲ Click here to expand/hide more information about this topic.

The latest llama.cpp available at the time was used, with imatrix-with-rp-ex.txt as calibration data.

⇲ Click here to expand/hide more information about this topic.

"Imatrix" stands for Importance Matrix, a technique used to improve the quality of quantized models. The Imatrix is calculated from calibration data, and it helps determine the importance of different model activations during the quantization process. The idea is to preserve the most important information during quantization, which can help reduce the loss of model performance, especially when the calibration data is diverse. [[1]](https://github.com/ggerganov/llama.cpp/discussions/5006) [[2]](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)

> [!NOTE]
> For imatrix data generation, kalomaze's `groups_merged.txt` with additional roleplay chats was used; you can find it here for reference. This was just to add a bit more diversity to the data with the intended use case in mind.

⇲ Click here to expand/hide how this would work in practice in a roleplay chat.

⇲ Click here to expand/hide how your SillyTavern Image Captions extension settings should look.

> [!WARNING]
> To use the multimodal capabilities of this model, such as vision, you also need to load the specified mmproj file. You can get it here, or as uploaded in the mmproj folder in the repository.

1: Make sure you are using the latest version of KoboldCpp.
2: Load the mmproj file by using the corresponding section in the interface:
2.1: For CLI users, you can load the mmproj file by adding the respective flag to your usual command:
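For reference, the sampler fixes suggested above (Temperature 1.15, MinP 0.075, RepPen 1.15, RepPenRange 1024) map onto fields of KoboldCpp's generate API. A minimal sketch follows; the endpoint path and field names reflect KoboldCpp's API as I understand it, so double-check them against your version.

```python
# Sampler settings from the card, expressed as a KoboldCpp generate payload.
payload = {
    "prompt": "### Instruction:\nContinue the scene.\n\n### Response:\n",
    "max_length": 512,
    "temperature": 1.15,    # more variety in responses
    "min_p": 0.075,         # MinP sampling floor
    "rep_pen": 1.15,        # repetition penalty strength
    "rep_pen_range": 1024,  # how many recent tokens the penalty considers
}

# With a local KoboldCpp server running (default port 5001), this would be
# POSTed to http://localhost:5001/api/v1/generate, e.g. with
# requests.post(url, json=payload).json()
print(payload["temperature"])  # 1.15
```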

NaNK
2,450
58

Lumimaid-v0.2-8B-GGUF-IQ-Imatrix

My GGUF-IQ-Imatrix quants for NeverSleep/Lumimaid-v0.2-8B. I recommend checking their page for feedback and support.

> [!IMPORTANT]
> Quantization process:
> Imatrix data was generated from the FP16-GGUF and conversions were done directly from the BF16-GGUF.
> This is a bit more disk and compute intensive but hopefully avoids any losses during conversion.
> To run this model, please use the latest version of KoboldCpp.
> If you notice any issues, let me know in the discussions.

> [!NOTE]
> Presets:
> Llama-3.
>
> Some compatible SillyTavern presets can be found here (Virt's Roleplay Presets - v1.9).
> Check discussions such as this one and this one for other preset and sampler recommendations.
> Lower temperatures are recommended by the authors, so make sure to experiment.
>
> General usage with KoboldCpp:
> For 8GB VRAM GPUs, I recommend the Q4KM-imat (4.89 BPW) quant for context sizes up to 12288 without the use of `--quantkv`.
> Using `--quantkv 1` (≈Q8) or even `--quantkv 2` (≈Q4) can get you to 32K context sizes, with the caveat of not being compatible with Context Shifting; this is only relevant if you can manage to fill up that much context.
> Read more about it in the release here.

⇲ Click here to expand/hide information – General chart with relative quant performances.

> [!NOTE]
> Recommended read:
>
> "Which GGUF is right for me? (Opinionated)" by Artefact2
>
> Click the image to view full size.

> [!TIP]
> Personal support:
> I apologize for disrupting your experience.
> Eventually I may be able to use a dedicated server for this, but for now hopefully these quants are helpful.
> If you want and you are able to...
> You can spare some change over here (Ko-fi).
>
> Author support:
> You can support the authors at their pages/here.

This model is based on: Meta-Llama-3.1-8B-Instruct

Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95

Lumimaid 0.1 -> 0.2 is a HUGE step up dataset-wise. As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats with the most slop. Our dataset has stayed the same since day one; we added data over time, cleaned it, and repeated. After not releasing a model for a while because we were never satisfied, we think it's time to come back!

- Epiculous/Gnosis
- ChaoticNeutrals/LuminousOpus
- ChaoticNeutrals/Synthetic-Dark-RP
- ChaoticNeutrals/Synthetic-RP
- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
- Gryphe/Opus-WritingPrompts
- meseca/writing-opus-6k
- meseca/opus-instruct-9k
- PJMixers/grimulkantheory-of-mind-ShareGPT
- NobodyExistsOnTheInternet/ToxicQAFinal
- Undi95/toxic-dpo-v0.1-sharegpt
- cgato/SlimOrcaDedupCleaned
- kalomaze/OpusInstruct25k
- Doctor-Shotgun/no-robots-sharegpt
- Norquinal/claudemultiroundchat30k
- nothingiisreal/Claude-3-Opus-Instruct-15K
- All the Aesir datasets, cleaned, unslopped
- All le luminae dataset, cleaned, unslopped
- Small part of Airoboros, reduced

We sadly didn't find the sources of the following; DM us if you recognize your set!
- OpusInstruct-v2-6.5K-Filtered-v2-sharegpt
- claudesharegpttrimmed
- CapybaraPureDecontaminated-ShareGPTreduced

Datasets credits: Epiculous, ChaoticNeutrals, Gryphe, meseca, PJMixers, NobodyExistsOnTheInternet, cgato, kalomaze, Doctor-Shotgun, Norquinal, nothingiisreal

IkariDev: Visit my retro/neocities-style website please kek
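As background for the `--quantkv` context-size note above: the trade-off comes down to KV-cache size, which grows linearly with context length and with bytes per cached element. A back-of-the-envelope sketch using typical Llama-3-8B shape parameters (32 layers, 8 KV heads, 128 head dim; verify these against your model's metadata):

```python
def kv_cache_gb(ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """KV cache size: 2 (K and V) * layers * kv_heads * head_dim * ctx * bytes."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

print(kv_cache_gb(12288))                    # FP16 cache at 12K context: ~1.61 GB
print(kv_cache_gb(32768, bytes_per_elem=1))  # ~Q8 cache at 32K context: ~2.15 GB
```

So quantizing the cache roughly halves (Q8) or quarters (Q4) the per-token cost, which is what lets 32K context fit alongside the model weights on an 8GB card.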

NaNK
llama3
2,351
42

Llama-3.1-8B-ArliAI-RPMax-v1.2-GGUF-IQ-ARM-Imatrix

NaNK
llama
1,403
12

L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix

NaNK
llama3
999
93

DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
979
15

Fimbulvetr-11B-v2-GGUF-IQ-Imatrix

NaNK
license:cc-by-nc-4.0
951
26

L3-8B-Stheno-v3.3-32K-GGUF-IQ-Imatrix

NaNK
llama3
852
39

Lumimaid-v0.2-12B-GGUF-IQ-Imatrix

NaNK
license:cc-by-nc-4.0
843
21

Erosumika 7B V3 0.2 GGUF IQ Imatrix

This repo contains GGUF-IQ-Imatrix quantized model files for Erosumika-7B-v3-0.2.

Quants: "Q4KM", "Q4KS", "IQ4XS", "Q5KM", "Q5KS", "Q6K", "Q80", "IQ3M", "IQ3S", "IQ3XXS"

"Imatrix" stands for Importance Matrix, a technique used to improve the quality of quantized models. The Imatrix is calculated from calibration data, and it helps determine the importance of different model activations during the quantization process. The idea is to preserve the most important information during quantization, which can help reduce the loss of model performance, especially when the calibration data is diverse. [[1]](https://github.com/ggerganov/llama.cpp/discussions/5006) [[2]](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)

For imatrix data generation, kalomaze's `groups_merged.txt` with added roleplay chats was used; you can find it here. This was just to add a bit more diversity to the data.

Model Details

The Mistral 0.2 version of Erosumika-7B-v3, a DARE TIES merge between Nitral's Kunocchini-7b, Endevor's InfinityRP-v1-7B and my FlatErosAlpha, a flattened (in order to keep the vocab size at 32000) version of tavtav's eros-7B-ALPHA. Alpaca and ChatML work best. Slightly smarter, with better prompt comprehension than the Mistral 0.1 Erosumika-7B-v3. 32k context should work.

Limitations and biases

The intended use-case for this model is fictional writing for entertainment purposes. Any other sort of usage is out of scope. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive. Outputs might often be factually wrong or misleading.
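The card above mentions a DARE TIES merge. DARE's core trick is to randomly Drop a fraction p of each fine-tune's delta weights And REscale the survivors by 1/(1-p), which keeps the expected delta unchanged while sparsifying it before merging. A toy sketch of just that step (not the exact mergekit implementation):

```python
import numpy as np

def dare_sparsify(delta: np.ndarray, p: float, rng) -> np.ndarray:
    """Drop each delta element with probability p; rescale the rest by 1/(1-p)."""
    mask = rng.random(delta.shape) >= p
    return delta * mask / (1.0 - p)

rng = np.random.default_rng(0)
delta = rng.normal(size=10_000)  # fine-tuned weights minus base weights
sparse = dare_sparsify(delta, p=0.9, rng=rng)

print((sparse != 0).mean())  # ~0.1 of the entries survive, rescaled 10x
```

TIES then resolves sign conflicts between the sparsified deltas of the different source models before adding them back onto the base.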

NaNK
license:cc-by-4.0
687
28

Kunocchini-7b-128k-test-GGUF-Imatrix

NaNK
671
29

Violet_Magcap-12B-GGUF-IQ-Imatrix

NaNK
664
20

Captain-Eris_Twighlight-V0.420-12B-GGUF-ARM-Imatrix

> [!TIP]
> Updated!
> Please grab the "v2" quants, remade with the new tokenizer settings to fix the endless-generation issues.

> [!NOTE]
> SillyTavern
> The complete AIO recommended preset:
> v2-SillyTavern-Presets-AIO-2024-12-28.json

My GGUF-ARM-Imatrix quants of Captain-Eris_Twighlight-V0.420-12B.

NaNK
632
7

Aura_L3_8B-GGUF-IQ-Imatrix

NaNK
llama3
627
11

Llama-3.1-8B-Stheno-v3.4-GGUF-IQ-Imatrix

NaNK
llama3
607
21

Aura_Uncensored_l3_8B-GGUF-IQ-Imatrix

NaNK
llama3
587
25

MN-BackyardAI-Party-12B-v1-GGUF-IQ-ARM-Imatrix

NaNK
license:cc-by-nc-4.0
523
16

Eris_PrimeV3-Vision-7B-GGUF-IQ-Imatrix

NaNK
471
14

Mistral-Nemo-12B-ArliAI-RPMax-v1.2-GGUF-IQ-ARM-Imatrix

NaNK
468
11

Llama 3.1 8B ArliAI Formax V1.0 GGUF IQ ARM Imatrix

My quants for ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0.

"Formax is a model that specializes in following response-format instructions. Tell it the format of its response and it will follow it perfectly. Great for data processing and dataset-creation tasks."

"It is also a highly uncensored model that will follow your instructions very well."

NaNK
base_model:ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0
421
5

Poppy Porpoise 0.72 L3 8B GGUF IQ Imatrix

> [!TIP]
> Support:
> My upload speeds have been cooked and unstable lately.
> Realistically I'd need to move to get a better provider.
> If you want and you are able to, you can support that endeavor and others here (Ko-fi). I apologize for disrupting your experience.

GGUF-IQ-Imatrix quants for ChaoticNeutrals/Poppy_Porpoise-0.72-L3-8B.

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in interactive and engaging adventures, tailoring each one to their individual preferences.

Recommended ST presets (updated for 0.72): Porpoise Presets

# To use the multimodal capabilities of this model and use vision, you need to load the specified mmproj file, which can be found inside this model repo: Llava MMProj. You can load the mmproj by using the corresponding section in the interface:

NaNK
417
40

CaptainErisNebula-12B-AOE-v1-GGUF-IQ-Imatrix

NaNK
411
5

MN-12B-Mag-Mell-R1-GGUF-IQ-ARM-Imatrix

NaNK
394
12

Llama-3-Soliloquy-8B-v2-GGUF-IQ-Imatrix

NaNK
license:cc-by-nc-sa-4.0
343
19

L3-TheSpice-8b-v0.8.3-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
318
17

llama-3-Stheno-Mahou-8B-GGUF-IQ-Imatrix

NaNK
llama3
316
22

Eris_PrimeV3.05-Vision-7B-GGUF-IQ-Imatrix

NaNK
314
5

Halu-8B-Llama3-Blackroot-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
301
7

L3-TheSpice-8b-v0.1.3-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
295
12

Azure_Dusk-v0.2-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
290
6

Visual-LaylelemonMaidRP-7B-GGUF-IQ-Imatrix

NaNK
284
12

Aura_v2_7B-GGUF-IQ-Imatrix

NaNK
267
16

BuRP_7B-GGUF-IQ-Imatrix

NaNK
256
26

Average_Normie_v3.69_8B-GGUF-IQ-Imatrix

NaNK
255
15

llama3-8B-DarkIdol-1.0-GGUF-IQ-Imatrix-Request

NaNK
license:apache-2.0
255
9

Infinitely-Laydiculous-9B-GGUF-IQ-Imatrix

NaNK
254
16

Qwen2-7B-Instruct-abliterated-GGUF-IQ-Imatrix-Request

NaNK
license:apache-2.0
252
6

Eris_PrimeV4-Vision-32k-7B-GGUF-IQ-Imatrix

NaNK
245
14

Irix-12B-Reasoner-v.0.2-GGUF-IQ-Imatrix

NaNK
244
3

Poppy_Porpoise-v0.7-L3-8B-GGUF-IQ-Imatrix

NaNK
llama3
208
30

Captain-Eris_Violet-GRPO-v0.420-GGUF-IQ-Imatrix

Hello, travelers! These are my GGUF-IQ-Imatrix quants of Captain-Eris_Violet-GRPO-v0.420.

> [!TIP]
> Discussions
> - General discussion and author feedback.
> Feedback is always welcome for potential issues with quants and as a way to guide the author in future iterations. Your comments are appreciated!

> [!NOTE]
> SillyTavern
> - [[SillyTavern Presets]](https://huggingface.co/Lewdiculous/Captain-ErisViolet-GRPO-v0.420-GGUF-IQ-Imatrix/tree/main/SillyTavern)
> Initially recommended master-import presets.

NaNK
205
13

InfinityRP-v2-8B-GGUF-IQ-Imatrix

NaNK
llama
199
11

llama3-8B-aifeifei-1.1-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
198
2

SOVL_Llama3_8B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
197
27

opus-v1.2-7b-GGUF-IQ-Imatrix

NaNK
193
6

firefly-gemma-7b-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
193
2

Kukul-Stanta-0.420-32k-7B-0.2-GGUF-IQ-Imatrix

NaNK
191
5

Orthocopter_8B-GGUF-Imatrix

NaNK
llama
190
12

InfinityRP-v1-7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
184
44

L3-Umbral-Mind-RP-v1.0-8B-GGUF-IQ-Imatrix

NaNK
183
18

RP_Vision_7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
180
6

Violet_Twilight-v0.2-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
177
17

Chaos_RP_l3_8B-GGUF-IQ-Imatrix

NaNK
llama3
162
21

Test1_SLIDE-GGUF-IQ-Imatrix

license:apache-2.0
162
2

llama3-8B-aifeifei-1.0-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
159
1

lwd-Mirau-7b-RP-Merged-GGUF-IQ-Imatrix

NaNK
155
7

Bungo-L3-8B-GGUF-IQ-Imatrix-Request

NaNK
154
15

Nyanade_Stunna-Maid-7B-GGUF-IQ-Imatrix

NaNK
147
11

Nina-v2-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
147
5

llama-3-cat-8b-instruct-v1-GGUF-IQ-Imatrix

NaNK
llama3
145
18

mini-magnum-12b-v1.1-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
144
16

Loyal-Toppy-Bruins-Maid-7B-DARE-GGUF-Imatrix

NaNK
license:cc-by-nc-4.0
142
16

L3.1-8B-Niitama-v1.1-GGUF-IQ-Imatrix

NaNK
142
14

Poppy_Porpoise-1.0-L3-8B-GGUF-IQ-Imatrix

NaNK
llama3
140
15

Eris-Lelanacles-7b-GGUF-IQ-Imatrix

NaNK
140
4

Llama-3-8B-Irene-v0.1-GGUF-IQ-Imatrix

NaNK
140
3

mistral-7b-v0.1-layla-v4-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
139
8

Mahou-1.2-llama3-8B-GGUF-IQ-Imatrix

NaNK
llama
137
6

Eris_Remix_7B-GGUF-IQ-Imatrix

NaNK
134
11

flammen13-mistral-7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
132
3

LLaMa-3-CursedStock-v1.8-8B-GGUF-IQ-Imatrix-Request

NaNK
license:apache-2.0
131
8

Prima-LelantaclesV5-7b-GGUF

NaNK
131
4

Puppy_Purpose_0.69-GGUF-IQ-Imatrix

llama3
130
4

BuRPInfinity_9B-GGUF-IQ-Imatrix

NaNK
128
5

Captain-Eris-Diogenes_Twilight-V0.420-12B-GGUF-ARM-Imatrix

NaNK
125
18

llama3-8B-aifeifei-1.3-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
125
4

Eris_Floramix_DPO_7B-GGUF-Imatrix

NaNK
122
4

Hathor-L3-8B-v.01-GGUF-IQ-Imatrix

NaNK
121
8

DaturaCookie_7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
118
5

ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-GGUF-IQ-Imatrix

NaNK
115
5

Eris-Daturamix-7b-v2-GGUF-IQ-Imatrix

NaNK
110
6

Rawr_Llama3_8B-GGUF-IQ-Imatrix

NaNK
106
5

Paradigm_Shift_7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-sa-4.0
105
6

Poppy_Porpoise-v0.4-L3-8B-GGUF-IQ-Imatrix

NaNK
llama3
104
9

Eris_PrimeV4.69-Vision-32k-7B-GGUF-Imatrix

NaNK
103
6

Erosumika-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
101
9

Eris_PrimeV3.075-Vision-7B-GGUF-IQ-Imatrix-Test

NaNK
99
3

Layris_9B-GGUF-IQ-Imatrix

NaNK
97
20

Nera_Noctis-12B-GGUF-ARM-Imatrix

NaNK
97
6

FuseChat-Kunoichi-10.7B-GGUF-IQ-Imatrix

NaNK
95
8

Average_Normie_l3_v1_8B-GGUF-IQ-Imatrix

NaNK
llama3
94
13

Prima-LelantaclesV6-7b-GGUF-IQ-Imatrix

NaNK
94
5

Poppy_Porpoise-v0.2-L3-8B-GGUF-IQ-Imatrix

NaNK
llama3
93
18

experimental-lwd-Mirau-RP-14B-GGUF-IQ-Imatrix

NaNK
90
10

Neural-SOVLish-Devil-8B-L3-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
87
6

llama3-8B-aifeifei-1.2-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
87
2

Eris_7B-GGUF-IQ-Imatrix

NaNK
86
5

phencyclidine-8b-v1-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
85
2

llama3-8B-feifei-1.0-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
85
1

KukulStanta-7B-GGUF-IQ-Imatrix

NaNK
84
9

opus-v1.2-llama-3-8b-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
84
5

Eris-Beach_Day-7b-GGUF-IQ-Imatrix

NaNK
83
4

Eris-Daturamix-7b-GGUF-IQ-Imatrix

NaNK
81
4

Infinitely-Laydiculous-7b-longtext-GGUF-IQ-Imatrix

NaNK
80
7

RoleBeagle-11B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
78
8

Eris_PrimeV4.20-Vision-32k-7B-GGUF-IQ-Imatrix

NaNK
78
7

Infinitely-Laydiculous-7B-GGUF-IQ-Imatrix

NaNK
74
7

Kool-Aid_7B-GGUF-IQ-Imatrix

NaNK
74
6

WestLake-10.7B-v2-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
71
9

Aurora_l3_8B-GGUF-IQ-Imatrix

NaNK
llama3
69
8

InfinityNexus_9B-GGUF-IQ-Imatrix

NaNK
69
6

Paradigm_7B-GGUF-IQ-Imatrix

NaNK
68
4

DarkSapling-7B-v2.0-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
68
4

Kunocchini-1.2-7b-longtext-GGUF-Imatrix

NaNK
67
4

Persephone_7B-GGUF-IQ-Imatrix

NaNK
63
4

Datura_7B-GGUF-Imatrix

NaNK
61
6

Eris_PrimeV3.075-Vision-7B-Longtext-test-GGUF-IQ-Imatrix-Test

NaNK
60
3

LemonadeRP-4.5.3-GGUF-IQ-Imatrix

license:cc-by-4.0
59
14

Copium-Cola-9B-GGUF-IQ-Imatrix

NaNK
59
7

Test0_SLIDE-GGUF-IQ-Imatrix

license:cc-by-nc-4.0
59
2

flammen10-mistral-7B-GGUF-IQ-Imatrix-Testing

NaNK
license:apache-2.0
57
2

Prima-LelantaclesV6.69-7b-GGUF-IQ-Imatrix

NaNK
54
5

Aura_7B-GGUF-IQ-Imatrix

NaNK
54
5

Bepis_9B-GGUF-IQ-Imatrix

NaNK
53
6

Elly_7B-GGUF-IQ-Imatrix

NaNK
51
4

Test2_SLIDE-GGUF-IQ-Imatrix

license:apache-2.0
51
2

Poppy_Porpoise-v0.6-L3-8B-GGUF-IQ-Imatrix

NaNK
llama3
50
9

Erosumika-7B-v2-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
50
4

Asherah_7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
49
7

Multi-Verse-RP-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
49
3

InfiniteBuRP_7B-GGUF-IQ-Imatrix

NaNK
47
4

kuno-kunoichi-v1-DPO-v2-SLERP-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
45
10

duloxetine-4b-v1-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
45
6

Moistral-11B-v2-GGUF-IQ-Imatrix-Testing

NaNK
44
2

Sinerva_7B-GGUF-IQ-Imatrix

NaNK
44
1

TheSpice-7b-v0.1.1-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
44
1

EndlessRP-v3-7B-GGUF-Imatrix

NaNK
license:apache-2.0
43
12

Prodigy_7B-GGUF-Imatrix

NaNK
43
7

Eris_PrimeV4-Vision-7B-GGUF-IQ-Imatrix

NaNK
43
5

Mika-Longtext-7b-GGUF-IQ-Imatrix

NaNK
43
4

Prima-LelantaclesV7-experimental-7b-GGUF-IQ-Imatrix

NaNK
42
2

kukulemon-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
41
8

Prima-LelantaclesV4-7b-16k-GGUF

NaNK
40
6

Sonya-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
40
4

flammen18X-mistral-7B-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
38
5

T.E-8.1-GGUF-IQ-Imatrix-Request

NaNK
license:cc-by-nc-4.0
38
5

Prima-LelantaclesV6.3-7b-GGUF-IQ-Imatrix

NaNK
36
4

mistral-7b-v0.2-layla-v4-GGUF-IQ-Imatrix

NaNK
license:apache-2.0
35
3

Eris-Prime-Punch-9B-GGUF-IQ-Imatrix

NaNK
34
8

rogue-enchantress-32k-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-nc-4.0
32
5

Pasta-Lake-7b-GGUF

NaNK
30
5

Mika-Lelantacles-7b-Longtext-GGUF-IQ-Imatrix

NaNK
30
3

InfinityNoodleRP-7b-GGUF-IQ-Imatrix

NaNK
29
3

Irene-RP-v2-7B-GGUF-IQ-Imatrix

NaNK
24
5

lwd-Mirau-7b-RP-Merged

> [!NOTE]
> LoRA by mouseEliauk:
> https://modelscope.cn/models/mouseEliauk/mirau-7b-RP-base

> [!TIP]
> Experimental quants for testing:
> lwd-Mirau-7b-RP-Merged-GGUF-IQ-Imatrix

Introduction

mirau-7b-RP-base is a first-person narrative language model that transforms simple user actions into vivid storytelling, complete with environmental descriptions, psychological activities, and plot progression. I call this concept "Action-to-Narrative Render" - a way to render actions into immersive narratives. To ensure coherent storytelling, I developed a unique training method called "story flow chain of thought". In essence, it enables the model to weave each user input with previous context, creating a continuous narrative flow. This makes it perfect for text-based adventures, mystery stories, or simply exploring your imagination. You can have a try at modelscope: mirau-RP-7b-base-demo

⚠️ Important Notes

This is a base version model - note that "base" here doesn't refer to a traditional pretrained base model, but rather indicates that this version:
- Only supports first-person narrative perspective
- Is not suitable for dialogue interactions (outputs may be unstable)
- Is best used for single-character narrative experiences

Input Types

The model accepts various input commands, marked with parentheses ():

1. Basic Actions: Simple, everyday behaviors - Examples: `(I put on my clothes)`, `(I take a sip of water)`, `(I sit down)`
2. Exploration Actions: Interactions with the environment - Examples: `(I look around)`, `(I approach the wooden box)`, `(I push open the door)`
3. Inner Monologue: Character's thoughts and feelings - Examples: `(What's wrong here?)`, `(This feels strange)`
4. Observation: Focused attention on specific objects or scenes - Examples: `(examine the wooden box closely)`, `(listen to the surrounding sounds)`

```bash
pip install ms-swift[llm] -U
```

```bash
# "mirau" maps to the LoRA adapter you downloaded;
# --merge_lora true merges the LoRA into the model (~14GB of weights).
RAY_memory_monitor_refresh_ms=0 CUDA_VISIBLE_DEVICES=0 swift deploy \
  --model_type qwen2_5 \
  --model qwen/Qwen2.5-7B-Instruct \
  --adapters mirau=mirau-7b-RP-base \
  --infer_backend vllm \
  --max_batch_size 1 \
  --max_length 8192 \
  --max_model_len 8192 \
  --port 8886 \
  --host 0.0.0.0 \
  --vllm_max_lora_rank 128 \
  --merge_lora true
```
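Once the server from the deploy command is running, it exposes an OpenAI-compatible chat endpoint on the configured port. A minimal sketch of building a request with one of the action inputs listed above; the host, port, and model name follow the deploy command, so adjust them for your setup:

```python
import json

# Chat request for the deployed mirau adapter, using an Exploration Action.
request = {
    "model": "mirau",  # the adapter name given to --adapters
    "messages": [
        {"role": "user", "content": "(I look around)"},
    ],
    "max_tokens": 512,
}

# With the server running, POST this to:
#   http://0.0.0.0:8886/v1/chat/completions
body = json.dumps(request)
print(body)
```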

NaNK
21
12

llama.cpp-11077-test-01

19
1

SLIDE-v2-GGUF-IQ-Imatrix

18
2

fireblossom-32K-7B-GGUF-IQ-Imatrix

NaNK
license:cc-by-4.0
15
2

Model-Requests

> [!IMPORTANT]
> # Status:
> Quant-Requests are PAUSED momentarily due to external circumstances.
> I sincerely apologize for disrupting your experience!
> Only if you want to and you are able...
> You can support my personal endeavours here (Ko-fi).
> Eventually I want to be able to set aside resources for a dedicated infrastructure.
> In the meantime, I'll be working to provide whenever possible with the resources available at the time.

Welcome to my GGUF-IQ-Imatrix Model Quantization Requests card!

This card is meant only to request GGUF-IQ-Imatrix quants for models that meet the requirements below.

Requirements to request GGUF-Imatrix model quantizations:

For the model:
- Maximum model parameter size of ~~11B~~ 12B. A small note is that models larger than 8B parameters may take longer to process and upload than smaller ones. At the moment I am unable to accept requests for larger models due to hardware/time limitations.
- Preferably Mistral- and Llama-3-based models in the creative/roleplay niche.

If you need quants for a bigger model, you can try requesting at mradermacher's. He's doing amazing work.

Important:
- Fill the request template as outlined in the next section.

1. Open a New Discussion titled "`Request: Model-Author/Model-Name`", for example, "`Request: Nitral-AI/Infinitely-Laydiculous-7B`", without the quotation marks.
2. Include the following template in your new discussion post; you can just copy and paste it as is, and fill in the required information by replacing the {{placeholders}} (example request here):

license:apache-2.0
0
44