crumb

85 models

nano-mistral

license:apache-2.0
3,481
9

gpt2023

license:mit
733
18

apricot-wildflower-20

license:apache-2.0
700
2

shrink-v1

llama
236
3

bloom-560m-RLHF-SD2-prompter-aesthetic

85
20

alpha-wolf-dreambooth

36
1

doc2desc_3b_gguf

This is Qwen/Qwen2.5-3B tuned with the following format on a mix of handwritten and Deepseek-V3-generated descriptions (few-shot with handwritten descriptions) for texts from https://textfiles.com, to make sure it can label unsafe content. It is being used to generate large numbers of description/document pairs for training another model to do the reverse: automatically generating documents to create control vectors from.

| Position | Delimiter |
| --- | --- |
| before user | `[[DOCUMENT]]` |
| after user | `[[/DOCUMENT]]` |
| before assistant | `[[DESCRIPTION]]` |
| after assistant | `[[/DESCRIPTION]]` |

You may also want to add `[[` as a stop string. It's a light tune and isn't perfect 😅

The outputs read like informal summaries. For example, on the first element of the C4 dataset, some outputs at temperature 0.8:

> ad to get better at making delicious BBQ by world class bbq champion from lonestar smoke rangers.

> ad for BBQ class at lonestar smoke rangers by world class bbq champ tony balay; includes techniques, recipes

> event ad: beginners BBQ Class Taking Place in Missoula! from world class bbs champion tony balay

27
0
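The doc2desc delimiter format above can be sketched as a small prompt builder. This is a minimal illustration, not code from the model card: the helper name is made up, whether newlines separate the delimiters is unspecified (this sketch concatenates directly), and only the `[[DOCUMENT]]`/`[[DESCRIPTION]]` delimiters and the `[[` stop string come from the card.

```python
def build_doc2desc_prompt(document: str) -> str:
    """Wrap a raw text in the doc2desc delimiters and open the
    description span for the model to complete (hypothetical helper)."""
    return f"[[DOCUMENT]]{document}[[/DOCUMENT]][[DESCRIPTION]]"

prompt = build_doc2desc_prompt("Beginners BBQ Class Taking Place in Missoula!")
# When sampling, pass "[[" as a stop string so generation halts at the
# start of the closing [[/DESCRIPTION]] delimiter.
```

The generated text before the stop string is the description; the reverse model described in the card would swap the roles of the two spans.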

scaffold-18

18
0

bloom-560m-RLHF-SD2-prompter

12
12

eva-fusion-v2.22

10
7

icon-diffusion-v1-1

9
5

distilpythia

license:apache-2.0
7
4

Ducky-MoMoe-prototype-e4-causal

7
4

gpt-joke

6
2

Llama-p-small

license:apache-2.0
6
0

FLAN-OPT-6.7b-LoRA

5
7

fake-gpt-j-17m

5
2

opentinystories-30m-complex

5
1

minipile-111m

5
0

gpt-j-6b-finetune-super-glue

4
0

ColabInstruct-Z-1.1B

4
0

opentinystories-68m-complex

4
0

pico-gpt-j-6.7m

3
2

FLAN-OPT-1.3b-LoRA

3
1

FLAN-OPT-2.7b-LoRA

3
0

bespoke-gpt-124m

3
0

test-00-switchllama-i3b-f10b-e4-init

switchllama
3
0

mixtral-e8-nano-1gt-test

3
0

GLORT2

3
0

utf8-gelu-dec-8.5M-10KB-ctx-3GB

llama
3
0

distilpythia-cl

license:apache-2.0
2
1

opentinystories-30m-base

license:mit
2
1

switch-base-8-arxiv-abstraction

2
0

opentinystories-68m-base

license:mit
2
0

llama2-7b-moe-text-exp2-4

2
0

llama2-7b-moe-text-exp3-4

2
0

test-00-qlora-wizmlpmix-c0

llama
2
0

core1-base-464m-redpajama

llama
2
0

gale-large-test

2
0

king-james-bible-gzip-16line-window

2
0

llama-d1024-slimpajama-1gt-test

llama
2
0

llama-d1536-init

llama
2
0

ParaLlama-p-small

2
0

13f189-augmented-mappings-medium-control

2
0

ptune-FLAN-OPT-6.7b

1
2

llama2-7b-moe-text-exp1-4

1
2

model-a-48.5m

license:apache-2.0
1
2

ptune-FLAN-OPT-2.7b

1
1

cramped-94m-8btok

license:apache-2.0
1
1

askmistral-2-15-111m

1
1

llama2-7b-shard-bf16

llama
1
0

Ducky-MoMoe-prototype-e4-ul2

1
0

test-00-qlora-wizmlpmix-c1

llama
1
0

test-00-qlora-wizmlpmix-c2

llama
1
0

core1-base-464m-c4

llama
1
0

44m-textbook

license:apache-2.0
1
0

d1536-250MT-full

license:apache-2.0
1
0

25m-special

1
0

qrstudy-410m-8-1

1
0

qrstudy-410m-16-1

1
0

qrstudy-410m-64-1

1
0

qrstudy-gpt2-4-8

1
0

qrstudy-gpt2-8-16

1
0

qrstudy-gpt2-16-32

1
0

king-james-bible-gzip-8line-window

1
0

king-james-bible-gzip-64line-window

1
0

shrink-init

llama
1
0

ParaLlama-p-micro

1
0

gpt2-medium-eb49cc

1
0

160m-plus-sauce

1
0

Instruct-GPT-J

0
26

midjourney-textual-inversions

license:mit
0
19

icon-diffusion-ckpt

0
4

essence-3b-v2

license:cc-by-sa-4.0
0
3

Gale-medium-init

0
3

genshin-stable-inversion

0
2

eva-model-ckpt

0
2

92d52f-ame-full-7B

0
2

gpt-j-6b-shakespeare

0
1

dalle-paint

license:mit
0
1

aurora-1.0

0
1

GeoV-Instruct-LoRA

0
1

CGPT-124m

0
1

horizon-pythia-ft-1b

0
1

32M-32GT-SlimPajama

license:apache-2.0
0
1