MarsupialAI

89 models

Cydonia-22B-v1_iMat_GGUF
4,651 downloads • 10 likes

Monstral-123B-v2_GGUF
iMatrix GGUFs for https://huggingface.co/MarsupialAI/Monstral-123B-v2
1,120 downloads • 3 likes

Llama3_GGUF_Quant_Testing
567 downloads • 1 like

Gemmasutra-Mini-2B-v1_iMatrix_GGUF
516 downloads • 5 likes

Big-Tiger-Gemma-27B-v1_iMatrix_GGUF
224 downloads • 7 likes

Gemmasutra-Pro-27B-v1_iMatrix_GGUF
license: cc-by-nc-4.0 • 218 downloads • 7 likes

Buddy-2B-v1_iMatrix_GGUF
license: cc-by-nc-4.0 • 213 downloads • 1 like

Hercules-Qwen1.5-14B_iMatrix_GGUF
184 downloads • 0 likes

Rocinante-12B-v1_iMatrix_GGUF
171 downloads • 0 likes

Cat-Llama-3-70B-instruct_iMatrix_GGUF
license: llama3 • 168 downloads • 0 likes

Coomand-R-35B-v1_iMatrix_GGUF
license: cc-by-nc-4.0 • 164 downloads • 5 likes

Celeste-12B-V1.6_iMatrix_GGUF
license: apache-2.0 • 158 downloads • 2 likes

MG-FinalMix-72B_iMatrix_GGUF
153 downloads • 1 like

Fireplace-34b_iMatrix_GGUF
tag: llama • 145 downloads • 0 likes

Moist-Miqu-70B-v1_iMatrix_GGUF
138 downloads • 2 likes

llama-3-70B-Instruct-abliterated_iMatrix_GGUF
license: llama3 • 136 downloads • 0 likes

Monstral-123B_iMat_GGUF
117 downloads • 2 likes

IxChel-L3-12B_iMatrix_GGUF
116 downloads • 0 likes

Llama-3SOME-8B-v1-BETA_iMatrix_GGUF
113 downloads • 2 likes

Mistral-Dory-12B_iMatrix_GGUF
license: apache-2.0 • 112 downloads • 0 likes

Mini-Magnum-Unboxed-12B_iMatrix_GGUF
license: apache-2.0 • 107 downloads • 0 likes

KunoichiVerse-7B_iMatrix_GGUF
license: apache-2.0 • 105 downloads • 2 likes

Magnum-12b-v2_iMatrix_GGUF
license: apache-2.0 • 105 downloads • 0 likes

Lumimaid-v0.2-12B_iMatrix_GGUF
license: cc-by-nc-4.0 • 102 downloads • 1 like

Yi-34B-200k-v2_GGUF
100 downloads • 21 likes

L3.1-8B-Celeste-V1.5_iMatrix_GGUF
tag: llama-factory • 98 downloads • 1 like

Lusca-33B_iMat_GGUF
96 downloads • 3 likes

Foredoomed-9B_iMatrix_GGUF
license: apache-2.0 • 96 downloads • 1 like

Yi-34B-200K-RPMerge_GGUF
91 downloads • 1 like

Phi-3-mini-128k-instruct_iMatrix_GGUF
license: mit • 76 downloads • 0 likes

Captain-Adventure-32B_iMat_GGUF
71 downloads • 0 likes

Moistral-11B-v3_iMatrix_GGUF
69 downloads • 9 likes

Nautilus-70B-v0.1_iMat_GGUF
52 downloads • 0 likes

Llama-3.1-Nemotron-70B-Instruct_iMat_GGUF
tag: llama3.1 • 51 downloads • 2 likes

Monstral-123B-v2
50 downloads • 38 likes

Qwen1.5-32B-Chat_iMatrix_GGUF
48 downloads • 2 likes

Yi-9B-200K_iMatrix_GGUF
47 downloads • 0 likes

Blossom-v5-32b_iMatrix_GGUF
license: apache-2.0 • 42 downloads • 0 likes

Young-Children-Storyteller-Mistral-7B_iMatrix_GGUF
license: apache-2.0 • 41 downloads • 2 likes

Buttocks-7B-v1.1_GGUF
license: cc-by-nc-4.0 • 39 downloads • 2 likes

Pygmalion-2-13b_iMatrix_GGUF
license: llama2 • 39 downloads • 0 likes

Garbage_9B_iMatrix_GGUF
license: apache-2.0 • 38 downloads • 0 likes

Merged-RP-Stew-V2-34B_iMatrix_GGUF
37 downloads • 11 likes

Yi-6B-200k-v2_GGUF
36 downloads • 2 likes

Psyonic-Cetacean-20b-v2_iMatrix_GGUF
33 downloads • 1 like

Moistral-11B-v2.1b-SOGGY_iMatrix_GGUF
32 downloads • 1 like

Aqueducts-18B_iMatrix_GGUF
license: cc-by-nc-4.0 • 30 downloads • 3 likes

Moistral-11B-v4_iMatrix_GGUF
license: cc-by-nc-4.0 • 28 downloads • 3 likes

Qwen1.5-32B_iMatrix_GGUF
28 downloads • 1 like

Cydonia-22B-v1.3_iMat_GGUF
GGUF quants of https://huggingface.co/TheDrummer/Cydonia-22B-v1.3. The iMatrix was generated using Kalomaze's groups_merged.txt calibration file; a sketch of the workflow follows.
25 downloads • 3 likes
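
For reference, imatrix quantization in llama.cpp is a two-step process: compute an importance matrix from a calibration text file, then quantize with that matrix applied. Below is a minimal sketch assuming the current llama.cpp tool names (llama-imatrix, llama-quantize); the file names, paths, and the Q4_K_S target are illustrative assumptions, not taken from this repo.

```python
import subprocess

# Step 1: compute an importance matrix from a calibration corpus.
# File names here are hypothetical; groups_merged.txt is Kalomaze's
# calibration file referenced in the description above.
subprocess.run([
    "./llama-imatrix",
    "-m", "Cydonia-22B-v1.3-F16.gguf",  # assumed FP16 GGUF conversion of the source model
    "-f", "groups_merged.txt",          # calibration text
    "-o", "cydonia-imatrix.dat",        # output importance matrix
], check=True)

# Step 2: quantize with the importance matrix applied.
subprocess.run([
    "./llama-quantize",
    "--imatrix", "cydonia-imatrix.dat",
    "Cydonia-22B-v1.3-F16.gguf",        # input
    "Cydonia-22B-v1.3-Q4_K_S.gguf",     # output
    "Q4_K_S",                           # quant type (one of many this repo likely ships)
], check=True)
```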

SkunkApe-16b_iMatrix_GGUF
22 downloads • 0 likes

SkunkApe-14b_iMatrix_GGUF
19 downloads • 0 likes

Faro-Yi-34B-200K_iMatrix_GGUF
license: mit • 17 downloads • 0 likes

pippafeet-11B-0.2_iMatrix_GGUF
16 downloads • 0 likes

KobbleTiny-1.1B_iMatrix_GGUF
license: apache-2.0 • 16 downloads • 0 likes

aanaphi2-v0.1_GGUF
15 downloads • 0 likes

HelloNurse-11b_GGUF
14 downloads • 2 likes

Melusine_103b_GGUF
13 downloads • 0 likes

NorLlama-3B_GGUF
license: cc-by-nc-sa-4.0 • 11 downloads • 1 like

JerseyDevil-14b_iMatrix_GGUF
10 downloads • 0 likes

KitchenSink_103b_iMatrix_GGUF
license: llama2 • 9 downloads • 1 like

Moistral-11B-v1_iMatrix_GGUF
9 downloads • 0 likes

LaDameBlanche-v2-95b_iMatrix_GGUF
5 downloads • 3 likes

Dumbstral-169B_GGUF
Q4KS GGUF for https://huggingface.co/MarsupialAI/Dumbstral-169B. No imatrix, no other quant schemes. This is all I'm willing to do for a model that nobody can reasonably run. FSM help Bartowski and Mradermacher if they choose to run full quant sets for this bastard.
5 downloads • 1 like

KitchenSink_103b
tag: llama • 4 downloads • 9 likes

Yeet 51b 200k

This model is a rotating-stack merge of three Yi 34b 200k models in a 51b (90 layer) configuration. My reasoning behind this merge was twofold: I'd never seen a stacked merge made from 34b models, and I thought that maybe this could give near-70b performance, but with a much larger context window while still fitting within 48GB of VRAM.

I think the results are quite good. The model performs on par with many 70b models at RP, chat, and storywriting. At Q4KS it will fit into a pair of 24GB GPUs with 32k context. Coherency at 32k is excellent, and will probably remain very good well beyond that thanks to the 200k base training.

The gotcha here is speed. While it inferences as you'd expect for the model size, it's much slower than a similarly-sized 8x7b MoE. And while I personally find the output of this model to outperform any Mixtral finetune I've seen so far, those finetunes are getting better all the time, and this really is achingly slow with a lot of context. I'm getting less than half a token per second on a pair of P40s with a full 32k prompt. But that's not to say this model (or even the 51b stack concept) is useless. If you're patient, you can get extremely good output with very deep context on attainable hardware. There are undoubtedly niche scenarios where this model or similarly-constructed models might be ideal.

Component models for the rotating stack are:
- adamo1139/Yi-34B-200K-AEZAKMI-v2
- brucethemoose/Yi-34B-200K-DARE-megamerge-v8
- taozi555/RpBird-Yi-34B-200k

This model is uncensored and capable of generating objectionable material. However, it is not an explicitly-NSFW model, and it has never "gone rogue" and tried to insert NSFW content into SFW prompts in my experience. As with any LLM, no factual claims made by the model should be taken at face value. You know that boilerplate safety disclaimer that most professional models have? Assume this has it too. This model is for entertainment purposes only.

FP16 and Q4KS GGUFs are located here: https://huggingface.co/MarsupialAI/Yeet51b200kGGUFQ4KSFP16

Prompt format
Seems to work fine with Alpaca prompts. Considering the variety of components, other formats are likely to work to some extent.

WTF is a rotating-stack merge?
Inspired by Undi's experiments with stacked merges, Jeb Carter found that output quality and model initiative could be significantly improved by reversing the model order in the stack, and then doing a linear merge between the original and reversed stacks. That is what I did here: I created three passthrough stacked merges from the three source models (rotating the model order in each stack), then did a linear merge of all three stacks. An illustrative sketch follows; the exact merge configs can be found in the recipe.txt file.

tag: llama • 4 downloads • 2 likes
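
To make the rotating-stack recipe concrete, here is a minimal sketch of how such a merge might be expressed with mergekit. Only one of the three rotated passthrough stacks is shown; the layer ranges, stack directory names, and equal linear weights are illustrative assumptions, not the values from the model's actual recipe.txt.

```python
import subprocess
import textwrap

# One of three rotated passthrough stacks; stacks b and c use the same three
# source models with the order rotated. Layer ranges are assumptions chosen
# only to illustrate a 90-layer stack, NOT the values in recipe.txt.
STACK_A = textwrap.dedent("""\
    merge_method: passthrough
    dtype: float16
    slices:
      - sources:
          - model: adamo1139/Yi-34B-200K-AEZAKMI-v2
            layer_range: [0, 30]
      - sources:
          - model: brucethemoose/Yi-34B-200K-DARE-megamerge-v8
            layer_range: [15, 45]
      - sources:
          - model: taozi555/RpBird-Yi-34B-200k
            layer_range: [30, 60]
    """)

# Final pass: a linear merge of the three rotated stacks. Equal weights are
# an assumption for illustration.
FINAL = textwrap.dedent("""\
    merge_method: linear
    dtype: float16
    models:
      - model: ./stack-a
        parameters:
          weight: 1.0
      - model: ./stack-b
        parameters:
          weight: 1.0
      - model: ./stack-c
        parameters:
          weight: 1.0
    """)

def run_merge(config_text: str, config_path: str, out_dir: str) -> None:
    """Write a mergekit YAML config and run it via the mergekit-yaml CLI."""
    with open(config_path, "w") as f:
        f.write(config_text)
    subprocess.run(["mergekit-yaml", config_path, out_dir], check=True)

run_merge(STACK_A, "stack-a.yml", "./stack-a")
# ...build ./stack-b and ./stack-c the same way, with rotated model order...
run_merge(FINAL, "final.yml", "./yeet-51b-200k")
```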

LaDameBlanche-103b_iMatrix_GGUF
4 downloads • 0 likes

Monstral-123B
3 downloads • 28 likes

Yeet_51b_200k_GGUF_Q4KS_FP16
2 downloads • 0 likes

SkunkApe-16b
tag: llama • 2 downloads • 0 likes

Cydonia-22B-v1.3_EXL2_4.5bpw
2 downloads • 0 likes

Lusca-33B
1 download • 10 likes

Dumbstral-169B
1 download • 3 likes

IxChel-L3-12B
tag: llama • 1 download • 2 likes

Monstral-123B_4.0bpw_EXL2
4.0bpw EXL2 quant of https://huggingface.co/MarsupialAI/Monstral-123B. Default settings and dataset were used for measurements; a sketch of the command follows.
1 download • 1 like
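
For reference, an EXL2 quant like this is typically produced with ExLlamaV2's convert.py, which runs a measurement pass and then quantizes to a target bits-per-weight. Below is a minimal sketch; the directory names are hypothetical, and the flags reflect convert.py as I understand it (omitting a calibration flag uses the script's default dataset, matching "default settings and dataset" above).

```python
import subprocess

# Sketch of an ExLlamaV2 quantization run. Paths are illustrative
# assumptions; -b sets the target bits per weight.
subprocess.run([
    "python", "convert.py",
    "-i", "Monstral-123B",          # input: FP16 HF model directory
    "-o", "work",                   # scratch dir for the measurement pass
    "-cf", "Monstral-123B-4.0bpw",  # compiled output directory
    "-b", "4.0",                    # target bitrate, per the repo name
], check=True)
```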

Cydonia-22B-v1.3_EXL2_5.5bpw_H8
1 download • 0 likes

Monstral-123B_3.5bpw_EXL2
1 download • 0 likes

LaDameBlanche-v2-95b
tag: llama • 0 downloads • 12 likes

SkunkApe-14b
tag: llama • 0 downloads • 6 likes

Aqueducts-18B
license: cc-by-nc-4.0 • 0 downloads • 5 likes

HelloNurse-11b
license: apache-2.0 • 0 downloads • 3 likes

LaDameBlanche-103b
tag: llama • 0 downloads • 3 likes

JerseyDevil-14b
tag: llama • 0 downloads • 3 likes

Moistral-11B-v4_EXL2
license: cc-by-nc-4.0 • 0 downloads • 3 likes

Moistral-11B-v3_exl2
0 downloads • 2 likes

Llama-3SOME-8B-v1-BETA_6.9bpw_exl2
tag: llama • 0 downloads • 1 like

Smegmma-9B-v1_elx2
license: cc-by-nc-4.0 • 0 downloads • 1 like

Cydonia-22B-v1_EXL2
0 downloads • 1 like

UnslopNemo-12B-v3_EXL2_6bpw_H8
6.0bpw EXL2 quant of https://huggingface.co/TheDrummer/UnslopNemo-12B-v3
0 downloads • 1 like