RossAscends
24B-XortronCriminalComputingConfig-3bpw-EXL3
EXL3 3bpw quantization of https://huggingface.co/darkc0de/XortronCriminalComputingConfig, designed to run on 12GB cards with Q8 cache and 8k context.

This model turned out really well: intelligent, knowledgeable, and of course state-of-the-art uncensored performance. It will help you do anything and everything you probably shouldn't be doing. As of this writing, it tops the UGI Leaderboard for models under 70 billion parameters in both the UGI and W10 categories.

This is a merge of pre-trained language models created with mergekit, using the TIES merge method with darkc0de/XortronCriminalComputing as the base. The following models were included in the merge:
- TroyDoesAI/BlackSheep-24B
- darkc0de/XortronCriminalComputing

The merge was defined by a mergekit YAML configuration.
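The actual mergekit YAML is not reproduced in this listing. Purely as an illustration of what a TIES merge config typically looks like (the real density/weight values for this model are unknown; the numbers below are placeholders):

```yaml
# Illustrative mergekit TIES config; NOT the actual settings used for this model.
merge_method: ties
base_model: darkc0de/XortronCriminalComputing
models:
  - model: TroyDoesAI/BlackSheep-24B
    parameters:
      density: 0.5   # placeholder value, assumed for illustration
      weight: 0.5    # placeholder value, assumed for illustration
dtype: bfloat16
```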
Magnum-Picaro-0.7-v2-12b-5.0bpw-EXL2
12B-Irix-Model-Stock-EXL3-3.5bpw
EXL3 3.5bpw quant of https://huggingface.co/DreadPoor/Irix-12B-ModelStock. Will fit on an 8GB card with 16k context and Q8 K/V cache.
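As a sanity check on the 8GB claim, here is a back-of-envelope VRAM estimate. It assumes Mistral-Nemo-style architecture numbers for a 12B model (40 layers, 8 KV heads, head dim 128); these figures are assumptions, and loader/activation overhead is ignored.

```python
# Rough VRAM estimate: quantized weights plus Q8 K/V cache at 16k context.
# Architecture numbers (40 layers, 8 KV heads, head_dim 128) are assumed
# Mistral-Nemo-style values, not taken from the actual model config.

def weight_bytes(n_params: float, bpw: float) -> float:
    """Bytes needed for weights quantized to `bpw` bits per weight."""
    return n_params * bpw / 8

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem):
    """K and V each store layers * kv_heads * head_dim values per token."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

weights = weight_bytes(12e9, 3.5)          # ~5.25 GB of weights at 3.5bpw
kv = kv_cache_bytes(40, 8, 128, 16384, 1)  # Q8 cache ~1 byte/element
total_gb = (weights + kv) / 1e9
print(f"weights ≈ {weights/1e9:.2f} GB, KV ≈ {kv/1e9:.2f} GB, total ≈ {total_gb:.2f} GB")
```

Under these assumptions the total lands around 6.6 GB, which leaves headroom on an 8GB card and is consistent with the claim above.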
Llama_3_8B_8bpw_exl2
Paradigm_7B_6bpw_exl2
12B-Trix-TEST-iQ4KS-GGUF
Original: https://huggingface.co/DreadPoor/Trix-TEST

I saw it had a very interesting merge recipe, so I was eager to try it out even though it's not in a finished state. Can confirm it's a huge yapper. It can be contained somewhat by:
- giving it a minimal system prompt of `Reply to the User.`
- adding a Lorebook entry at depth 0 instructing it to respond concisely.

I don't see any slop in the responses at all. Lots of potential here.
Mistral7B Dolphin2.1 LIMARP0.5 4bpw Exl2
ehartford's merge of Mistral 7B v0.1 with his Dolphin 2.1 dataset.

The purpose of the model is to be RP-focused, smart, fast, and lightweight for users with low VRAM. I've already built the exl2 4bpw quant (linked below); it will run 8k ctx in around 6GB VRAM and respond to a full context at roughly 30 tps (tested on my 3060) when the exl2hf loader is used with FA2 enabled.

The model has been tested by several users on the SillyTavern Discord server and run on Horde for a full day, with good results.

Full weights: https://huggingface.co/RossAscends/Mistral7BDolphin2.1LIMA0.5fp16