beomi

79 models

KcELECTRA-base-v2022

🚨 Important Note: This repo is DEPRECATED since KcELECTRA-base v2023 was released 🚨 Use `https://huggingface.co/beomi/KcELECTRA-base` with the `v2022` revision if needed.

- KcELECTRA-base-v2022 (formerly v2022-dev): the model name has changed.
- Detailed scores for this model have been added.
- Compared to the previous KcELECTRA-base (v2021), performance improves by roughly 1%p on most downstream tasks.

Most publicly released Korean Transformer-family models are trained on well-curated data such as Korean Wikipedia, news articles, and books. In contrast, user-generated noisy-text datasets such as NSMC are not curated: they are colloquial, full of neologisms, and contain typos and other expressions that rarely appear in formal writing.

KcELECTRA is a pretrained ELECTRA model built for exactly this kind of data: comments and replies were collected from Naver News, and both the tokenizer and the ELECTRA model were trained from scratch on them.

Compared to the earlier KcBERT, performance improved substantially thanks to a larger dataset and an expanded vocabulary.

KcELECTRA can be loaded directly through Huggingface's Transformers library; no separate file download is required.

- Finetuning code is available at https://github.com/Beomi/KcBERT-finetune
- Detailed per-step scores are available in each checkpoint folder of that repo.

| | Size | NSMC (acc) | Naver NER (F1) | PAWS (acc) | KorNLI (acc) | KorSTS (spearman) | Question Pair (acc) | KorQuaD (Dev) (EM/F1) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| KcELECTRA-base-v2022 | 475M | 91.97 | 87.35 | 76.50 | 82.12 | 83.67 | 95.12 | 69.00 / 90.40 |
| KcELECTRA-base | 475M | 91.71 | 86.90 | 74.80 | 81.65 | 82.65 | 95.78 | 70.60 / 90.11 |
| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |
| KcBERT-Large | 1.2G | 90.68 | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |
| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |
| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |
| HanBERT | 614M | 90.16 | 87.31 | 82.40 | 80.89 | 83.33 | 94.19 | 78.74 / 92.02 |
| KoELECTRA-Base | 423M | 90.21 | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |
| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | 83.90 | 80.61 | 84.30 | 94.72 | 84.34 / 92.58 |
| KoELECTRA-Base-v3 | 423M | 90.63 | 88.11 | 84.45 | 82.24 | 85.53 | 95.25 | 84.83 / 93.45 |
| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |

*The results above were obtained by running with the repo's config settings as-is; additional hyperparameter tuning may yield better scores.

Requirements:

- `pytorch ~= 1.8.0`
- `transformers ~= 4.11.3`
- `emoji ~= 0.6.0`
- `soynlp ~= 0.0.493`

> 💡 If your existing KcBERT code uses `AutoTokenizer` and `AutoModel`, simply change `.from_pretrained("beomi/kcbert-base")` to `.from_pretrained("beomi/KcELECTRA-base")` and it will work immediately.
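For illustration, a minimal loading sketch following the deprecation note above (not part of the original card; the example sentence is arbitrary):

```python
from transformers import AutoTokenizer, AutoModel

# Load directly from the Hub; per the note above, the v2022 weights live in the
# beomi/KcELECTRA-base repo under the "v2022" revision.
tokenizer = AutoTokenizer.from_pretrained("beomi/KcELECTRA-base", revision="v2022")
model = AutoModel.from_pretrained("beomi/KcELECTRA-base", revision="v2022")

inputs = tokenizer("KcELECTRA는 네이버 뉴스 댓글로 학습되었습니다.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```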
Pretraining data:

- The data used to train KcBERT, plus comments collected up to early March 2021
- About 17GB
- Documents are built by grouping comment/reply threads together
- Pretraining via the https://github.com/KLUE-benchmark/KLUE-ELECTRA repo
- Finetuning and score comparison via the https://github.com/Beomi/KcBERT-finetune repo

The training data consists of comments and replies collected from high-comment news articles (or all news articles) written between 2019.01.01 and 2021.03.09. Extracting text only, the dataset is about 17.3GB and contains more than 180 million sentences.

> KcBERT was trained on text from 2019.01-2020.06, about 90 million sentences after cleaning.

In Naver comments, profanity is masked as `OOO` by the platform's own filter; these markers were removed (replaced with whitespace).

After installing the packages above with pip, cleaning the text with a `clean` function (a sketch is given at the end of this model card) reduces `[UNK]` tokens and improves downstream performance.

The tokenizer was trained with Huggingface's Tokenizers library, using `BertWordPieceTokenizer` with a vocab size of `30000`. The tokenizer was trained on the full dataset, and to better support general downstream tasks, the non-overlapping part of the KoELECTRA vocab was added on top (the two models overlapped by only about 5,000 tokens).

Training ran for about 10 days on a TPU `v3-8`; the model currently published on Huggingface is the weights at 848k steps. (Performance was evaluated at every 100k-step checkpoint; see the `KcBERT-finetune` repo for details.)

The training loss drops sharply between roughly 100k-200k steps and then keeps decreasing steadily until the end of training.

- As shown above, KcELECTRA-base outperforms KcBERT-base and KcBERT-large on every dataset.
- KcELECTRA's pretraining also shows gradual performance gains as the number of training steps grows.

The GCP/TPU environment used to train the KcELECTRA model was supported by the TFRC program.

Acknowledgements and references:

- KcBERT by Beomi
- BERT by Google
- KoBERT by SKT
- KoELECTRA by Monologg
- Transformers by Huggingface
- Tokenizers by Huggingface
- ELECTRA train code by KLUE
- Monologg's notes on training KoELECTRA
- Training BERT from scratch on TPU in Colab - Tensorflow/Google ver.
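A hedged sketch of such a `clean` function, assuming the `emoji` 0.6.x and `soynlp` APIs pinned in the requirements above; the exact regular expressions in the original repo may differ:

```python
import re

import emoji
from soynlp.normalizer import repeat_normalize

# Keep spaces, basic punctuation, ASCII, Korean jamo/syllables and emoji;
# replace everything else (stray unicode symbols, control chars, ...) with a space.
emojis = "".join(emoji.UNICODE_EMOJI.keys())  # emoji==0.6.x API; newer versions expose emoji.EMOJI_DATA
pattern = re.compile(f"[^ .,?!/@$%~％·∼()\x00-\x7Fㄱ-ㅣ가-힣{emojis}]+")
url_pattern = re.compile(r"https?://\S+")  # simplified URL pattern (assumption)

def clean(text: str) -> str:
    text = pattern.sub(" ", text)                 # drop characters the tokenizer never saw
    text = url_pattern.sub("", text)              # strip URLs
    text = text.strip()
    return repeat_normalize(text, num_repeats=2)  # "ㅋㅋㅋㅋㅋ" -> "ㅋㅋ"

print(clean("이 모델 진짜 좋네요ㅋㅋㅋㅋㅋ https://example.com 👍"))
```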

license:mit
47,704
12

kcbert-base

- KcELECTRA has been released! 🤗
- KcELECTRA is trained on a larger dataset with a larger general vocab, and outperforms KcBERT on every task.
- Try it yourself via the GitHub link below!
- https://github.com/Beomi/KcELECTRA
- A citation entry (bibtex) for the KcBERT paper has been added.
- KcBERT-finetune performance scores have been added to this document.

With Huggingface Transformers updated to v4.0.0, parts of the tutorial code have changed.

A tutorial for training KcBERT on a TPU in Google Colab is provided.

If you want the data as a single file, or want to explore it on Kaggle, use the Kaggle dataset below:

- GitHub release: https://github.com/Beomi/KcBERT/releases/tag/TrainDatav1
- Kaggle: https://www.kaggle.com/junbumlee/kcbert-pretraining-corpus-korean-news-comments (available as a single file)

The dataset cleaned for training (processed with the `clean` function described below) has been published on Kaggle.

Most publicly released Korean BERT models are trained on well-curated data such as Korean Wikipedia, news articles, and books. In contrast, comment-style datasets such as NSMC are not curated: they are colloquial, full of neologisms, and contain typos and other expressions that rarely appear in formal writing.

KcBERT is a pretrained BERT model built for exactly this kind of data: comments and replies were collected from Naver News, and both the tokenizer and the BERT model were trained from scratch on them.

KcBERT can be loaded directly through Huggingface's Transformers library; no separate file download is required.

- Finetuning code is available at https://github.com/Beomi/KcBERT-finetune

| | Size | NSMC (acc) | Naver NER (F1) | PAWS (acc) | KorNLI (acc) | KorSTS (spearman) | Question Pair (acc) | KorQuaD (Dev) (EM/F1) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |
| KcBERT-Large | 1.2G | 90.68 | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |
| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |
| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |
| HanBERT | 614M | 90.16 | 87.31 | 82.40 | 80.89 | 83.33 | 94.19 | 78.74 / 92.02 |
| KoELECTRA-Base | 423M | 90.21 | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |
| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | 83.90 | 80.61 | 84.30 | 94.72 | 84.34 / 92.58 |
| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |

*The results above were obtained with the config settings as-is; additional hyperparameter tuning may yield better scores.

- KcBERT-Base NSMC finetuning with PyTorch-Lightning (Colab)
- KcBERT-Large NSMC finetuning with PyTorch-Lightning (Colab)

> The two notebooks above differ only in the pretrained model (base vs. large) and batch size; the rest of the code is identical.
ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” 2019.01.01 ~ 2020.06.15 ์‚ฌ์ด์— ์ž‘์„ฑ๋œ ๋Œ“๊ธ€ ๋งŽ์€ ๋‰ด์Šค ๊ธฐ์‚ฌ๋“ค์˜ ๋Œ“๊ธ€๊ณผ ๋Œ€๋Œ“๊ธ€์„ ๋ชจ๋‘ ์ˆ˜์ง‘ํ•œ ๋ฐ์ดํ„ฐ์ž…๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ ์‚ฌ์ด์ฆˆ๋Š” ํ…์ŠคํŠธ๋งŒ ์ถ”์ถœ์‹œ ์•ฝ 15.4GB์ด๋ฉฐ, 1์–ต1์ฒœ๋งŒ๊ฐœ ์ด์ƒ์˜ ๋ฌธ์žฅ์œผ๋กœ ์ด๋ค„์ ธ ์žˆ์Šต๋‹ˆ๋‹ค. ์•„๋ž˜ ๋ช…๋ น์–ด๋กœ pip๋กœ ์„ค์น˜ํ•œ ๋’ค, ์•„๋ž˜ cleanํ•จ์ˆ˜๋กœ ํด๋ฆฌ๋‹์„ ํ•˜๋ฉด Downstream task์—์„œ ๋ณด๋‹ค ์„ฑ๋Šฅ์ด ์ข‹์•„์ง‘๋‹ˆ๋‹ค. (`[UNK]` ๊ฐ์†Œ) ์›๋ณธ ๋ฐ์ดํ„ฐ๋ฅผ ์œ„ `clean`ํ•จ์ˆ˜๋กœ ์ •์ œํ•œ 12GB๋ถ„๋Ÿ‰์˜ txt ํŒŒ์ผ์„ ์•„๋ž˜ Kaggle Dataset์—์„œ ๋‹ค์šด๋ฐ›์œผ์‹ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค :) Tokenizer๋Š” Huggingface์˜ Tokenizers ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ํ†ตํ•ด ํ•™์Šต์„ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ์ค‘ `BertWordPieceTokenizer` ๋ฅผ ์ด์šฉํ•ด ํ•™์Šต์„ ์ง„ํ–‰ํ–ˆ๊ณ , Vocab Size๋Š” `30000`์œผ๋กœ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. Tokenizer๋ฅผ ํ•™์Šตํ•˜๋Š” ๊ฒƒ์—๋Š” `1/10`๋กœ ์ƒ˜ํ”Œ๋งํ•œ ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šต์„ ์ง„ํ–‰ํ–ˆ๊ณ , ๋ณด๋‹ค ๊ณจ๊ณ ๋ฃจ ์ƒ˜ํ”Œ๋งํ•˜๊ธฐ ์œ„ํ•ด ์ผ์ž๋ณ„๋กœ stratify๋ฅผ ์ง€์ •ํ•œ ๋’ค ํ–‘์Šต์„ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. BERT Model Config๋Š” Base, Large ๊ธฐ๋ณธ ์„ธํŒ…๊ฐ’์„ ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉํ–ˆ์Šต๋‹ˆ๋‹ค. (MLM 15% ๋“ฑ) TPU `v3-8` ์„ ์ด์šฉํ•ด ๊ฐ๊ฐ 3์ผ, N์ผ(Large๋Š” ํ•™์Šต ์ง„ํ–‰ ์ค‘)์„ ์ง„ํ–‰ํ–ˆ๊ณ , ํ˜„์žฌ Huggingface์— ๊ณต๊ฐœ๋œ ๋ชจ๋ธ์€ 1m(100๋งŒ) step์„ ํ•™์Šตํ•œ ckpt๊ฐ€ ์—…๋กœ๋“œ ๋˜์–ด์žˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ ํ•™์Šต Loss๋Š” Step์— ๋”ฐ๋ผ ์ดˆ๊ธฐ 200k์— ๊ฐ€์žฅ ๋น ๋ฅด๊ฒŒ Loss๊ฐ€ ์ค„์–ด๋“ค๋‹ค 400k์ดํ›„๋กœ๋Š” ์กฐ๊ธˆ์”ฉ ๊ฐ์†Œํ•˜๋Š” ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ•™์Šต์€ GCP์˜ TPU v3-8์„ ์ด์šฉํ•ด ํ•™์Šต์„ ์ง„ํ–‰ํ–ˆ๊ณ , ํ•™์Šต ์‹œ๊ฐ„์€ Base Model ๊ธฐ์ค€ 2.5์ผ์ •๋„ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. Large Model์€ ์•ฝ 5์ผ์ •๋„ ์ง„ํ–‰ํ•œ ๋’ค ๊ฐ€์žฅ ๋‚ฎ์€ loss๋ฅผ ๊ฐ€์ง„ ์ฒดํฌํฌ์ธํŠธ๋กœ ์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. HuggingFace kcbert-base ๋ชจ๋ธ ์—์„œ ์•„๋ž˜์™€ ๊ฐ™์ด ํ…Œ์ŠคํŠธ ํ•ด ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋„ค์ด๋ฒ„ ์˜ํ™”ํ‰ ์ฝ”ํผ์Šค ๋ฐ์ดํ„ฐ์…‹์„ ๋Œ€์ƒ์œผ๋กœ Fine Tuning์„ ์ง„ํ–‰ํ•ด ์„ฑ๋Šฅ์„ ๊ฐ„๋‹จํžˆ ํ…Œ์ŠคํŠธํ•ด๋ณด์•˜์Šต๋‹ˆ๋‹ค. - GPU๋Š” P100 x1๋Œ€ ๊ธฐ์ค€ 1epoch์— 2-3์‹œ๊ฐ„, TPU๋Š” 1epoch์— 1์‹œ๊ฐ„ ๋‚ด๋กœ ์†Œ์š”๋ฉ๋‹ˆ๋‹ค. - GPU RTX Titan x4๋Œ€ ๊ธฐ์ค€ 30๋ถ„/epoch ์†Œ์š”๋ฉ๋‹ˆ๋‹ค. - ์˜ˆ์‹œ ์ฝ”๋“œ๋Š” pytorch-lightning์œผ๋กœ ๊ฐœ๋ฐœํ–ˆ์Šต๋‹ˆ๋‹ค. - ๋…ผ๋ฌธ์ง‘ ๋‹ค์šด๋กœ๋“œ ๋งํฌ: http://hclt.kr/dwn/?v=bG5iOmNvbmZlcmVuY2U7aWR4OjMy (ํ˜น์€ http://hclt.kr/symp/?lnb=conference ) KcBERT Model์„ ํ•™์Šตํ•˜๋Š” GCP/TPU ํ™˜๊ฒฝ์€ TFRC ํ”„๋กœ๊ทธ๋žจ์˜ ์ง€์›์„ ๋ฐ›์•˜์Šต๋‹ˆ๋‹ค. - BERT by Google - KoBERT by SKT - KoELECTRA by Monologg - Transformers by Huggingface - Tokenizers by Hugginface - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - Monologg๋‹˜์˜ KoELECTRA ํ•™์Šต๊ธฐ - Colab์—์„œ TPU๋กœ BERT ์ฒ˜์Œ๋ถ€ํ„ฐ ํ•™์Šต์‹œํ‚ค๊ธฐ - Tensorflow/Google ver.

license:apache-2.0
41,629
29

Llama-3-Open-Ko-8B

llama
4,715
157

KcELECTRA-base

license:mit
3,677
41

llama-2-ko-7b

llama
2,577
175

Llama-3-Open-Ko-8B-Instruct-preview

> Update @ 2024.05.01: Pre-release of the Llama-3-KoEn-8B model & Llama-3-KoEn-8B-Instruct-preview
> Update @ 2024.04.24: Release of the Llama-3-Open-Ko-8B model & Llama-3-Open-Ko-8B-Instruct-preview

The Llama-3-Open-Ko-8B model is a continued-pretraining language model based on Llama-3-8B. It was trained entirely on publicly available resources, more than 60GB of deduplicated text. With the new Llama-3 tokenizer, pretraining covered 17.7B+ tokens, slightly more than with the Korean tokenizer (the Llama-2-Ko tokenizer). Training was done on a TPUv5e-256, with the warm support of the TRC program by Google.

Applying the idea from the Chat Vector paper, I released an instruction model named Llama-3-Open-Ko-8B-Instruct-preview. Since it is NOT finetuned on any Korean instruction set (hence `preview`), it should be a great starting point for building new chat/instruct models.

Sample generation (the model's answer to a question about the Fibonacci sequence, translated from Korean):

> You probably already know this, but let me explain! The Fibonacci (Piconacci) sequence is the following sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, ... and so on. In mathematics this sequence generally continues infinitely. It is a sequence where each number is twice the previous one. If you want to implement this sequence in Python, you can write code like the following. In this code, `n` is the variable holding the input number, and the Fibonacci sequence is implemented. Running this code returns the corresponding term of the Fibonacci sequence for the given input; for example, `fibonacci(10)` returns 55. Because this code is a recursive function, memory usage may go down. However, recursive functions can also use a lot of memory; to reduce this, it can be implemented as an iterative function instead. In this code, the Fibonacci sequence is implemented with an iterative function. Running this code returns the corresponding term of the Fibonacci sequence for the given input. This code reduces memory usage, but it is more complex. To make it simpler, the iterative function could be implemented better.

> I used the same system prompt, but you could change it on your own.
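A minimal chat-style generation sketch with Transformers (an assumed usage pattern, not the card's original snippet; the system prompt and sampling settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/Llama-3-Open-Ko-8B-Instruct-preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자."},  # illustrative
    {"role": "user", "content": "피보나치 수열에 대해 설명하고, 파이썬 코드도 보여주세요."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama-3-style chat models usually stop on either the EOS token or <|eot_id|>.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]
outputs = model.generate(
    input_ids, max_new_tokens=512, eos_token_id=terminators, do_sample=True, temperature=0.7
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```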

llama
2,346
61

OPEN-SOLAR-KO-10.7B

- 2024.01.08: Initial test version release of Solar-Ko

Solar-Ko is an advanced iteration of the upstage/SOLAR-10.7B-v1.0 model, featuring an expanded vocabulary and the inclusion of a Korean corpus for further pretraining. Open-Solar-Ko exclusively uses publicly accessible Korean corpora, including AI Hub, Modu Corpus (모두의 말뭉치), and Korean Wikipedia. Because training was conducted solely with publicly available corpora, the model is open for unrestricted use by everyone under the Apache 2.0 open source license.

Variations: Solar-Ko is available in a single parameter size, 10.7B, as a continually pretrained version.

Output: the model produces text output exclusively.

SOLAR-KO-10.7B is an auto-regressive language model that uses an optimized transformer architecture derived from Llama-2.

| | Training Data | Parameters | Content Length | GQA | Tokens | Learning Rate |
|---|---|---|---|---|---|---|
| SOLAR-KO-10.7B | A curated mix of publicly accessible Korean corpora | 10.7B | 4k | O | >15B | 5e-5 |

The model was trained using selected datasets from AI Hub and Modu Corpus. Detailed information about the training datasets:

- AI Hub: corpus/AIHUB
  - Only the `Training` segment of the data was used.
  - The `Validation` and `Test` segments were deliberately excluded.
- Modu Corpus: corpus/MODUCORPUS

The final JSONL dataset used to train this model is approximately 61GB in size. Total token count: approximately 15 billion tokens using the expanded tokenizer (with the original SOLAR tokenizer, >60 billion tokens).

| Model Name | Vocabulary Size | Description |
| --- | --- | --- |
| Original Solar | 32000 | Sentencepiece BPE |
| Expanded SOLAR-KO-10.7B | 46592 | Sentencepiece BPE, with added Korean vocab and merges |

Tokenizing "안녕하세요, 오늘은 날씨가 좋네요.":

- SOLAR-10.7B: 26 tokens
- SOLAR-KO-10.7B: 8 tokens

| Model | Tokens |
| --- | --- |
| SOLAR-10.7B | `['▁', '안', ' ', ' ', ' ', '하', '세', '요', ',', '▁', '오', ' ', ' ', ' ', '은', '▁', '날', ' ', ' ', ' ', '가', '▁', '좋', '네', '요', '.']` |
| SOLAR-KO-10.7B | `['▁안녕', '하세요', ',', '▁오늘은', '▁날', '씨가', '▁좋네요', '.']` |

Tokenizing "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!":

- SOLAR-10.7B: 22 tokens
- SOLAR-KO-10.7B: 22 tokens

| Model | Tokens |
| --- | --- |
| SOLAR-10.7B | `['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!']` |
| SOLAR-KO-10.7B | `['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!']` |

Evaluation used EleutherAI's lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness/tree/polyglot). Scores by number of few-shot examples:

| Task | 0-shot | 5-shot | 10-shot | 50-shot |
|:---|---:|---:|---:|---:|
| kobest_boolq (macro_f1) | 0.853949 | 0.88098 | 0.898139 | 0.902354 |
| kobest_copa (macro_f1) | 0.804531 | 0.826736 | 0.837656 | 0.860899 |
| kobest_hellaswag (macro_f1) | 0.507174 | 0.500983 | 0.487287 | 0.512182 |
| kobest_sentineg (macro_f1) | 0.3517 | 0.972291 | 0.977321 | 0.984884 |
| kohatespeech (macro_f1) | 0.258111 | 0.403957 | 0.386808 | 0.462393 |
| kohatespeech_apeach (macro_f1) | 0.337667 | 0.651697 | 0.705337 | 0.827757 |
| kohatespeech_gen_bias (macro_f1) | 0.124535 | 0.503464 | 0.498501 | 0.443218 |
| korunsmile (f1) | 0.3814 | 0.356939 | 0.369989 | 0.296193 |
| nsmc (acc) | 0.5356 | 0.87162 | 0.88654 | 0.89632 |
| pawsx_ko (acc) | 0.5435 | 0.5245 | 0.5315 | 0.5385 |

- Training support was provided by the TPU Research Cloud program.
- The training corpus includes data from AI Hub, Modu Corpus, and Korean Wikipedia.
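The tokenization comparison above can be reproduced with a few lines of `transformers` code; a sketch (repository ids assumed, with the base tokenizer taken from `upstage/SOLAR-10.7B-v1.0`):

```python
from transformers import AutoTokenizer

solar = AutoTokenizer.from_pretrained("upstage/SOLAR-10.7B-v1.0")
solar_ko = AutoTokenizer.from_pretrained("beomi/OPEN-SOLAR-KO-10.7B")

for text in ["안녕하세요, 오늘은 날씨가 좋네요.",
             "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!"]:
    for name, tok in [("SOLAR-10.7B", solar), ("SOLAR-KO-10.7B", solar_ko)]:
        tokens = tok.tokenize(text)
        print(f"{name}: {len(tokens)} tokens -> {tokens}")
```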

llama
1,644
65

KoRWKV-6B

license:mit
1,552
3

KoAlpaca-Polyglot-5.8B

license:apache-2.0
1,184
66

KoAlpaca-Polyglot-12.8B

- Added safetensors sharded model weights (max shard = 1GB)

This model is a fine-tuned version of EleutherAI/polyglot-ko-12.8b on the KoAlpaca Dataset v1.1b.

Detailed code is available at the KoAlpaca Github Repository.

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 1
- seed: 42
- distributed_type: multi-GPU (A100 80G)
- num_devices: 4
- gradient_accumulation_steps: 64
- total_train_batch_size: 256
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0

Framework versions:

- Transformers 4.28.1
- Pytorch 2.0.0+cu117
- Datasets 2.11.0
- Tokenizers 0.13.3
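For reference, a hedged sketch of how these hyperparameters map onto Hugging Face `TrainingArguments` (illustrative only; the actual training script lives in the KoAlpaca repository, and the mixed-precision setting is an assumption):

```python
from transformers import TrainingArguments

# Per-device batch 1 x 4 A100 GPUs x gradient accumulation 64 = total train batch size 256,
# matching the values listed above.
args = TrainingArguments(
    output_dir="koalpaca-polyglot-12.8b",
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    num_train_epochs=2.0,
    lr_scheduler_type="linear",
    seed=42,
    bf16=True,  # assumption: not stated in the card
)
```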

license:apache-2.0
1,006
62

Yi-Ko-6B

llama
1,001
38

Qwen2.5-7B-Instruct-kowiki-qa

—
832
8

gemma-ko-2b

—
643
42

kcbert-large

—
471
7

KcELECTRA-small-v2022

license:mit
367
3

Llama-3-KoEn-8B

llama
363
14

KoAlpaca-llama-1-7b

llama
361
28

beep-KcELECTRA-base-hate

—
267
2

kollama-7b

llama
244
10

korean-hatespeech-multilabel

—
238
6

gemma-ko-7b

—
75
50

open-llama-2-ko-7b

llama
69
39

polyglot-ko-12.8b-safetensors

license:apache-2.0
48
5

KoAlpaca-RealQA-Solar-Ko-Recovery-11B-Q8_0-GGUF

beomi/KoAlpaca-RealQA-Solar-Ko-Recovery-11B-Q8_0-GGUF

This LoRA adapter was converted to GGUF format from `beomi/KoAlpaca-RealQA-Solar-Ko-Recovery-11B` via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details. For more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation; a Python sketch is given below.
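A hedged usage sketch with the `llama-cpp-python` bindings rather than the llama.cpp server (file names are placeholders; check the respective repos for the actual GGUF file names):

```python
from llama_cpp import Llama

# Load a base GGUF model and apply this GGUF LoRA adapter on top of it.
llm = Llama(
    model_path="solar-ko-recovery-11b.Q8_0.gguf",  # placeholder: base model GGUF
    lora_path="koalpaca-realqa-lora.gguf",         # placeholder: this adapter's GGUF file
    n_ctx=4096,
)
out = llm("질문: 한국의 수도는 어디인가요?\n답변:", max_tokens=64)
print(out["choices"][0]["text"])
```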

llama-cpp
45
0

Llama-3-KoEn-8B-Instruct-preview

llama
40
22

Llama-3-KoEn-8B-xtuner-llava-preview

license:cc-by-nc-sa-4.0
26
11

kollama-13b

llama
21
17

KoAlpaca-KoRWKV-6B

license:apache-2.0
21
7

korean-hatespeech-classifier

—
21
0

KoRWKV-1.5B

license:mit
19
13

kcgpt2

—
19
1

EXAONE-3.5-2.4B-Instruct-Llamafied

llama
18
6

llama-2-koen-13b

llama
17
36

kobert

—
13
2

kcbert-large-dev

—
13
0

Yi-Ko-DUS-9B

llama
12
8

KoAlpaca-KoRWKV-1.5B

license:apache-2.0
11
7

Yi-Ko-34B-Chat-Preview

llama
11
2

KoAlpaca-RealQA-Solar-Ko-Recovery-11B-LoRA-ChatML-Q8_0-GGUF

beomi/KoAlpaca-RealQA-Solar-Ko-Recovery-11B-LoRA-ChatML-Q8_0-GGUF

This LoRA adapter was converted to GGUF format from `beomi/KoAlpaca-RealQA-Solar-Ko-Recovery-11B-LoRA-ChatML` via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details. For more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation.

llama
11
1

Yi-Ko-34B

llama
9
7

kcbert-base-dev

—
9
0

Solar-Ko-Recovery-11B-Q8_0-GGUF

llama-cpp
9
0

KoAlpaca-RealQA-Solar-Ko-Recovery-11B-LoRA-ChatML-F16-GGUF

llama
9
0

SOLAR-KOEN-10.8B

llama
8
12

qlora-koalpaca-polyglot-12.8b-50step

—
8
1

Qwen2.5-7B-Instruct-kowiki-qa-context

—
7
6

EXAONE-3.5-32B-Instruct-Llamafied

llama
7
6

EXAONE-3.5-7.8B-Instruct-Llamafied

llama
7
5

KcBERT-v2023

license:mit
6
0

Solar-Ko-Recovery-11B

llama
5
24

kykim-gpt3-kor-small_based_on_gpt2

—
4
8

polyglot-ko-12.8b-safetensors-8bit

license:apache-2.0
4
2

beep-kcbert-base-hate

—
4
0

KoAlpaca-RealQA-Solar-Ko-Recovery-11B-Merged

llama
3
5

detox-kcbert-base

—
3
0

beep-KR-Medium-hate

—
2
0

beep-KcELECTRA-base-bias

—
2
0

distilbert-base-uncased-finetuned-cola

license:apache-2.0
2
0

exKcBERT-kowiki

—
2
0

korean-lgbt-hatespeech-classifier

—
2
0

Mistral-Ko-Inst-dev

license:apache-2.0
1
4

llama-2-ko-7b-emb-dev

llama
1
1

beep-kcbert-base-bias

—
1
0

beep-klue-roberta-base-bias

—
1
0

beep-klue-roberta-base-hate

—
1
0

beep-koelectra-base-v3-discriminator-bias

—
1
0

beep-koelectra-base-v3-discriminator-hate

—
1
0

exKcBERT-paws-extonly

—
1
0

exKcBERT-paws

—
1
0

kcgpt2-dev

—
1
0

llama-2-ko-70b

llama
0
38

KoAlpaca-65B-LoRA

llama
0
12

kollama-33b

llama
0
9

KoAlpaca-13B-LoRA

llama
0
8

KoAlpaca-30B-LoRA

llama
0
4

Llama-2-ko-7b-Chat-q4f16_1

llama-2-ko
0
3

KcT5-dev

license:mit
0
2

Llama-3-Infini-1M

—
0
2

KcT5

license:mit
0
1