BlinkDL

25 models

rwkv-4-raven

license:apache-2.0
0
502

rwkv-5-world

license:apache-2.0
0
269

rwkv-4-world

RWKV-4 trained on 100+ world languages (70% English, 15% multilang, 15% code). World = SomePile + SomeRedPajama + SomeOSCAR + AllWikipedia + AllChatGPTDataIcanfind. XXXtuned = a finetune of World on MC4, OSCAR, wiki, etc.

How to use:
- use https://github.com/josStorer/RWKV-Runner for a GUI
- use the latest rwkv pip package (0.8.0+)
- use https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark_world.py and https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_WORLD.py to test it

The differences between World & Raven:
- set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of 20B_tokenizer.json (EXACTLY AS WRITTEN HERE; "rwkv_vocab_v20230424" is included in rwkv 0.7.4+)
- use Question/Answer, User/AI, or Human/Bot for chat. DO NOT USE Bob/Alice or Q/A
- for the 0.1/0.4/1.5B models, use fp32 for the first layer (it will overflow in fp16 at this moment - fixable in the future), or bf16 if you have a 30xx/40xx GPU. Example strategy: cuda fp32 *1 -> cuda fp16

NOTE: the new greedy tokenizer (https://github.com/BlinkDL/ChatRWKV/blob/main/tokenizer/rwkv_tokenizer.py) will tokenize '\n\n' as one single token instead of ['\n','\n'].
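The greedy-tokenizer note above can be illustrated with a minimal longest-prefix-match sketch. The toy vocabulary here is hypothetical (the real rwkv_vocab_v20230424 is far larger), but it shows why '\n\n' comes out as one token rather than ['\n','\n']:

```python
# Minimal greedy (longest-prefix-match) tokenizer sketch.
# TOY_VOCAB is for illustration only -- NOT the real RWKV world vocab.
TOY_VOCAB = {"\n": 1, "\n\n": 2, "Hello": 3, "H": 4, "e": 5, "l": 6, "o": 7, " ": 8}

def greedy_tokenize(text, vocab):
    max_len = max(map(len, vocab))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocab entry that matches at position i.
        for length in range(min(len(text) - i, max_len), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(vocab[piece])
                i += length
                break
        else:
            raise ValueError(f"no vocab entry matches at position {i}")
    return tokens

print(greedy_tokenize("Hello\n\n", TOY_VOCAB))  # [3, 2] -- '\n\n' is a single token
```

Because the matcher always prefers the longest entry, '\n\n' wins over two separate '\n' tokens whenever it is in the vocabulary.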

license:apache-2.0
0
215

rwkv7-g1

These are BASE models (pretrained with web/code/synthetic + instruction/chat/reasoning data), suitable for post-training and fine-tuning (check https://huggingface.co/spaces/Jellyfish042/Uncheatabl...

license:apache-2.0
0
175

rwkv-4-pile-14b

[UPDATE: Try RWKV-4-World (https://huggingface.co/BlinkDL/rwkv-4-world) for generation, chat & code in 100+ world languages, with great English zero-shot & in-context learning ability too.]

RWKV-4 14B is an L40-D5120 causal language model trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.

- RWKV-4-Pile-14B-2023xxxx-ctx8192-testxxx.pth : fine-tuned to ctxlen 8192. The best general model.
- "Raven": RWKV alpaca+vicuna-style model: https://huggingface.co/BlinkDL/rwkv-4-raven (highly recommended). It is a strong chat model too; you can use +i for "Alpaca Instruct" in the latest ChatRWKV v2.

Examples:
- RWKV-4-Pile-14B-20230213-8019.pth : trained on the Pile for 331B tokens. Pile loss 1.7579 (ctxlen 1024); LAMBADA ppl 3.81, acc 71.05%; PIQA acc 77.42%; SC2016 acc 75.57%; Hellaswag acc_norm 70.24%; WinoGrande acc 62.98%
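As a side note on the numbers above: a per-token cross-entropy loss (in nats) converts to perplexity as ppl = exp(loss), so the quoted Pile loss of 1.7579 corresponds to a per-token perplexity of roughly 5.8:

```python
import math

pile_loss = 1.7579  # RWKV-4-Pile-14B at ctxlen 1024, as quoted above
ppl = math.exp(pile_loss)  # perplexity = exp(cross-entropy loss in nats)
print(f"Pile perplexity ~ {ppl:.2f}")  # ~ 5.80
```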

license:apache-2.0
0
173

rwkv-4-pile-7b

[UPDATE: Try RWKV-4-World (https://huggingface.co/BlinkDL/rwkv-4-world) for generation, chat & code in 100+ world languages, with great English zero-shot & in-context learning ability too.]

RWKV-4 7B is an L32-D4096 causal language model trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.

- RWKV-4-Pile-7B-20230109-ctx4096.pth : fine-tuned to ctxlen 4096. Likely the best. Please test.
- "Raven": RWKV alpaca+vicuna-style model: https://huggingface.co/BlinkDL/rwkv-4-raven (highly recommended). It is a strong chat model too; you can use +i for "Alpaca Instruct" in the latest ChatRWKV v2.

Examples:
- RWKV-4-Pile-7B-20230xxx-ctx8192-testxxx : fine-tuned to ctxlen 8192. Slightly weaker than the ctx4096 model when ctxlen < 3k.
- RWKV-4-Pile-7B-20221115-8047.pth : trained on the Pile for 332B tokens. Pile loss 1.8415; LAMBADA ppl 4.38, acc 67.18%; PIQA acc 76.06%; SC2016 acc 73.44%; Hellaswag acc_norm 65.51%

Instruct-test models (OLD): only useful if you construct your prompt following the dataset templates. Note I am using a "Q: instruct\n\nA: result" prompt for all instructs.
- RWKV-4-Pile-7B-Instruct-test1 : instruct-tuned on https://huggingface.co/datasets/bigscience/xP3all/viewer/en/train
- RWKV-4-Pile-7B-Instruct-test2 : instruct-tuned on https://huggingface.co/datasets/Muennighoff/flan & NIv2
- RWKV-4-Pile-7B-EngChn-testNovel-xxx : for writing Chinese novels (trained on 200G of Chinese novels)

license:apache-2.0
0
158

Rwkv 6 World

Use rwkv pip package 0.8.24+ for RWKV-6 inference: https://pypi.org/project/rwkv/ (pipeline = PIPELINE(model, "rwkv_vocab_v20230424") for rwkv-world models)

Online Demo 1: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2
Online Demo 2: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1
GUI: https://github.com/josStorer/RWKV-Runner (see Releases)
For developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py

RWKV-6 7B v3 MMLU = 54.2% (using the same "47.9%" code)
RWKV-6 7B v2.1 MMLU = 47.9%: https://github.com/Jellyfish042/rwkv_mmlu
RWKV-6 0.1B (using the pythia-160m tokenizer): https://huggingface.co/BlinkDL/temp-latest-training-models/blob/main/temp/rwkv-x060-173m-pile-20240515-ctx4k.pth

RWKV-6 trained on 100+ world languages (70% English, 15% multilang, 15% code). World = SomePile + SomeSlimPajama + SomeStarCoder + SomeOSCAR + AllWikipedia + AllChatGPTDataIcanfind

Recommended fine-tuning format (use \n for newlines). For both chat prompts and QA prompts, better replace \n\n inside xxx with \n, so that there are no blank lines within a message.

!!! There must not be any space after your final ":" or you will upset the tokenizer and see a non-English response !!!
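The two formatting rules above (no blank lines inside a message, no space after the final ":") can be sketched as a small prompt builder. The User/Assistant role names here are illustrative assumptions, not taken from the card; the Raven/World cards suggest pairs such as Question/Answer, User/AI, or Human/Bot:

```python
def make_chat_prompt(history, user_msg, user_role="User", bot_role="Assistant"):
    """Hypothetical World-style chat prompt builder enforcing:
    1. no blank lines inside a message (collapse \n\n to \n),
    2. no space after the final ':' where generation starts."""
    def clean(text):
        text = text.strip()
        while "\n\n" in text:  # collapse blank lines inside a message
            text = text.replace("\n\n", "\n")
        return text

    parts = []
    for user_turn, bot_turn in history:
        parts.append(f"{user_role}: {clean(user_turn)}\n\n{bot_role}: {clean(bot_turn)}")
    # Final turn ends with the bare role marker -- no trailing space!
    parts.append(f"{user_role}: {clean(user_msg)}\n\n{bot_role}:")
    return "\n\n".join(parts)

prompt = make_chat_prompt([], "What is RWKV?\n\nExplain briefly.")
print(repr(prompt))  # ends with 'Assistant:' -- no trailing space
```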

license:apache-2.0
0
148

rwkv-7-world

license:apache-2.0
0
107

rwkv-4-novel

Currently I am doing this for Chinese novels; more languages to come. Use https://github.com/BlinkDL/ChatRWKV to run them. See https://github.com/BlinkDL/RWKV-LM for details on the RWKV Language Model (100% RNN).

- RWKV-4-Novel-ChnEng : 50% Chinese + 50% Pile
- RWKV-4-Novel-ChnEng-ChnPro : RWKV-4-Novel-ChnEng finetuned on high-quality professional Chinese novels
- RWKV-4-Novel-Chn : 100% Chinese

license:apache-2.0
0
74

temp-latest-training-models

0
66

rwkv-4-music

license:apache-2.0
0
51

rwkv-4-pile-3b

license:apache-2.0
0
42

rwkv-4-pileplus

license:apache-2.0
0
40

clip-guided-binary-autoencoder

license:apache-2.0
0
28

rwkv-4-pile-1b5

license:apache-2.0
0
27

rwkv-6-misc

license:apache-2.0
0
23

rwkv-5-music

license:apache-2.0
0
20

Rwkv 8 Pile

RWKV-8 trained on the Pile w/ "20b tokenizer" (332115325534 tokens)

license:apache-2.0
0
18

rwkv-7-pile

license:apache-2.0
0
16

rwkv-4-pile-430m

license:apache-2.0
0
14

rwkv-4-pile-169m

license:apache-2.0
0
12

rwkv-3-pile-1b5

license:apache-2.0
0
7

rwkv-2-pile-430m

license:apache-2.0
0
4

rwkv-3-pile-169m

license:apache-2.0
0
4

rwkv-3-pile-430m

license:apache-2.0
0
4