hfl (HFL: Joint Laboratory of HIT and iFLYTEK Research)

133 models

chinese-roberta-wwm-ext

language:zh · tags:bert

license:apache-2.0
476,377
367
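Since the card metadata tags chinese-roberta-wwm-ext as a bert model, it loads with the standard BERT machinery in transformers. A minimal fill-mask sketch, assuming `transformers` and `torch` are installed and the checkpoint can be fetched from the Hub:

```python
# Hedged sketch: run fill-mask with hfl/chinese-roberta-wwm-ext.
# Assumes `transformers` and `torch` are installed and the Hub is reachable.
from transformers import pipeline

fill = pipeline("fill-mask", model="hfl/chinese-roberta-wwm-ext")

# Predict the masked character in a short Chinese sentence.
preds = fill("今天天气很[MASK]。", top_k=3)
for pred in preds:
    print(pred["token_str"], round(pred["score"], 3))
```

Each prediction is a dict with the filled token (`token_str`) and its probability (`score`).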

chinese-bert-wwm

license:apache-2.0
52,446
90

chinese-bert-wwm-ext

Chinese BERT with Whole Word Masking. To further accelerate Chinese natural language processing, we provide a Chinese pre-trained BERT with Whole Word Masking.

Pre-Training with Whole Word Masking for Chinese BERT
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu

This repository is developed based on: https://github.com/google-research/bert

You may also be interested in:
- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm
- Chinese MacBERT: https://github.com/ymcui/MacBERT
- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA
- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet
- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer

More resources by HFL: https://github.com/ymcui/HFL-Anthology

Citation: if you find the technical report or resources useful, please cite the following technical report in your paper.
- Primary: https://arxiv.org/abs/2004.13922

license:apache-2.0
10,075
186

chinese-macbert-large

license:apache-2.0
9,255
48

llama-3-chinese-8b-instruct-v3

llama
9,018
63

chinese-roberta-wwm-ext-large

Please use 'Bert'-related classes to load this model! This is the large variant of Chinese RoBERTa with Whole Word Masking; its model card is otherwise identical to that of chinese-bert-wwm-ext above (same authors, related resources, and primary citation: https://arxiv.org/abs/2004.13922).

license:apache-2.0
8,876
221

chinese-alpaca-2-13b-16k

llama
8,691
29

chinese-alpaca-2-13b

llama
8,685
83

chinese-mixtral

license:apache-2.0
8,597
3

chinese-llama-2-13b-16k

llama
8,578
15

chinese-mixtral-instruct

license:apache-2.0
8,569
20

chinese-llama-2-13b

llama
8,523
34

llama-3-chinese-8b-instruct

llama
8,005
12

llama-3-chinese-8b-instruct-v2

llama
7,973
39

chinese-macbert-base

Please use 'Bert'-related classes to load this model!

This repository contains the resources from our paper "Revisiting Pre-trained Models for Chinese Natural Language Processing", published in Findings of EMNLP. You can read the camera-ready paper through the ACL Anthology or the arXiv pre-print.

Revisiting Pre-trained Models for Chinese Natural Language Processing
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu

You may also be interested in:
- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm
- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA
- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet
- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer

More resources by HFL: https://github.com/ymcui/HFL-Anthology

Introduction. MacBERT is an improved BERT with a novel MLM-as-correction pre-training task, which mitigates the discrepancy between pre-training and fine-tuning. Instead of masking with the [MASK] token, which never appears in the fine-tuning stage, we propose using similar words for masking. A similar word is obtained with the Synonyms toolkit (Wang and Hu, 2017), which is based on word2vec (Mikolov et al., 2013) similarity calculations. If an N-gram is selected for masking, we find similar words individually. In rare cases, when there is no similar word, we fall back to random word replacement. Here is an example of our pre-training task.

| | Example |
| -------------- | ----------------- |
| Original Sentence | we use a language model to predict the probability of the next word. |
| MLM | we use a language [M] to [M] ##di ##ct the pro [M] ##bility of the next word . |
| Whole word masking | we use a language [M] to [M] [M] [M] the [M] [M] [M] of the next word . |
| N-gram masking | we use a [M] [M] to [M] [M] [M] the [M] [M] [M] [M] [M] next word . |
| MLM as correction | we use a text system to ca ##lc ##ulate the po ##si ##bility of the next word . |

Besides the new pre-training task, we also incorporate the following techniques:
- Whole Word Masking (WWM)
- N-gram masking
- Sentence-Order Prediction (SOP)

Note that MacBERT can directly replace the original BERT, as there is no difference in the main neural architecture. For more technical details, please check our paper: Revisiting Pre-trained Models for Chinese Natural Language Processing.

Citation: if you find our resource or paper useful, please consider including the following citation in your paper.
- https://arxiv.org/abs/2004.13922

license:apache-2.0
7,875
145

rbt3

license:apache-2.0
5,098
37

minirbt-h256

license:apache-2.0
4,824
9

chinese-electra-180g-small-ex-discriminator

license:apache-2.0
4,325
8

Qwen2.5-VL-7B-Instruct-GPTQ-Int4

license:apache-2.0
988
7

llama-3-chinese-8b

llama
754
14

llama-3-chinese-8b-instruct-v3-gguf

base_model:hfl/llama-3-chinese-8b-instruct-v3
644
75

llama-3-chinese-8b-gguf

license:apache-2.0
568
8

rbt4-h312

license:apache-2.0
550
6

Qwen2.5-VL-3B-Instruct-GPTQ-Int4

license:apache-2.0
450
3

chinese-mixtral-instruct-gguf

license:apache-2.0
434
12

chinese-xlnet-base

license:apache-2.0
406
31

chinese-alpaca-2-13b-gguf

license:apache-2.0
384
10

llama-3-chinese-8b-instruct-gguf

license:apache-2.0
338
26

chinese-lert-base

license:apache-2.0
331
16

chinese-llama-2-1.3b

llama
312
19

chinese-electra-base-discriminator

license:apache-2.0
277
10

chinese-electra-180g-large-discriminator

license:apache-2.0
264
6

chinese-lert-small

license:apache-2.0
259
14

chinese-llama-2-7b

llama
248
102

chinese-electra-180g-base-discriminator

license:apache-2.0
233
13

chinese-alpaca-2-1.3b

llama
212
8

minirbt-h288

license:apache-2.0
169
10

chinese-legal-electra-base-discriminator

license:apache-2.0
166
3

chinese-electra-small-discriminator

license:apache-2.0
138
3

llama-3-chinese-8b-instruct-v2-gguf

license:apache-2.0
134
16

chinese-electra-180g-small-discriminator

license:apache-2.0
130
26

rbt6

license:apache-2.0
121
11

chinese-alpaca-2-7b-64k

llama
120
5

chinese-llama-2-7b-64k

llama
118
3

chinese-llama-2-lora-7b-64k

llama
117
1

chinese-alpaca-2-lora-7b-64k

llama
117
1

chinese-alpaca-2-1.3b-gguf

license:apache-2.0
116
6

chinese-alpaca-2-1.3b-rlhf-gguf

license:apache-2.0
116
5

chinese-alpaca-2-7b-64k-gguf

license:apache-2.0
114
5

chinese-alpaca-2-13b-16k-gguf

license:apache-2.0
109
1

chinese-alpaca-2-7b

llama
104
161

chinese-llama-2-13b-gguf

llama
104
5

chinese-mixtral-gguf

license:apache-2.0
101
6

chinese-llama-2-7b-gguf

llama
101
5

rbtl3

license:apache-2.0
83
4

chinese-alpaca-2-7b-gguf

license:apache-2.0
82
16

chinese-electra-small-ex-discriminator

license:apache-2.0
82
4

chinese-alpaca-2-lora-7b-16k

llama
81
1

chinese-electra-180g-base-generator

license:apache-2.0
81
0

chinese-alpaca-2-7b-16k

llama
80
18

chinese-llama-2-7b-16k

llama
80
12

chinese-alpaca-2-1.3b-rlhf

llama
79
2

chinese-llama-2-lora-7b-16k

llama
79
1

chinese-electra-small-generator

license:apache-2.0
78
1

chinese-electra-180g-small-generator

license:apache-2.0
77
4

chinese-llama-2-lora-13b-16k

llama
77
3

chinese-electra-small-ex-generator

license:apache-2.0
77
0

chinese-legal-electra-large-discriminator

license:apache-2.0
76
4

chinese-alpaca-2-7b-rlhf

llama
74
2

chinese-electra-180g-large-generator

license:apache-2.0
71
0

chinese-alpaca-2-7b-rlhf-gguf

license:apache-2.0
70
5

chinese-electra-base-generator

license:apache-2.0
68
0

chinese-alpaca-2-lora-13b-16k

llama
67
3

chinese-electra-large-discriminator

license:apache-2.0
65
1

chinese-llama-2-7b-64k-gguf

license:apache-2.0
64
2

chinese-electra-large-generator

license:apache-2.0
64
0

chinese-llama-2-13b-16k-gguf

license:apache-2.0
62
1

chinese-electra-180g-small-ex-generator

license:apache-2.0
58
2

cino-large-v2

license:apache-2.0
55
13

chinese-lert-large

license:apache-2.0
53
16

chinese-xlnet-mid

license:apache-2.0
49
10

rbt4

license:apache-2.0
47
6

chinese-pert-base

license:cc-by-nc-sa-4.0
43
13

cino-base-v2

license:apache-2.0
40
5

chinese-llama-2-1.3b-gguf

license:apache-2.0
40
2

chinese-pert-large

license:cc-by-nc-sa-4.0
38
11

chinese-legal-electra-small-discriminator

license:apache-2.0
36
1

chinese-alpaca-2-7b-16k-gguf

license:apache-2.0
36
1

chinese-pert-large-mrc

license:apache-2.0
35
10

vle-base

license:apache-2.0
35
4

chinese-llama-2-7b-16k-gguf

license:apache-2.0
34
2

Qwen2.5-VL-7B-Instruct-GPTQ-Int3

license:apache-2.0
34
1

cino-large

license:apache-2.0
33
9

chinese-legal-electra-base-generator

license:apache-2.0
33
6

cino-small-v2

license:apache-2.0
33
6

Qwen2.5-VL-3B-Instruct-GPTQ-Int3

license:apache-2.0
33
1

chinese-legal-electra-large-generator

license:apache-2.0
32
7

english-pert-large

license:cc-by-nc-sa-4.0
31
3

vle-base-for-vcr-qa2r

license:apache-2.0
31
1

vle-base-for-vqa

license:apache-2.0
30
1

vle-base-for-vcr-q2a

license:apache-2.0
30
1

vle-large-for-vcr-qa2r

license:apache-2.0
29
1

chinese-legal-electra-small-generator

license:apache-2.0
28
4

vle-large

license:apache-2.0
28
3

chinese-pert-base-mrc

license:apache-2.0
27
12

vle-large-for-vqa

license:apache-2.0
27
1

vle-large-for-vcr-q2a

license:apache-2.0
27
1

english-pert-base

license:cc-by-nc-sa-4.0
25
6

chinese-alpaca-lora-7b

license:apache-2.0
0
68

chinese-llama-lora-7b

license:apache-2.0
0
60

chinese-alpaca-lora-13b

license:apache-2.0
0
57

chinese-alpaca-plus-lora-7b

license:apache-2.0
0
38

chinese-alpaca-plus-lora-13b

license:apache-2.0
0
33

chinese-llama-plus-lora-7b

license:apache-2.0
0
27

chinese-llama-lora-13b

license:apache-2.0
0
25

chinese-alpaca-2-lora-7b

license:apache-2.0
0
17

chinese-llama-plus-lora-13b

license:apache-2.0
0
16

chinese-llama-2-lora-7b

license:apache-2.0
0
14

chinese-alpaca-2-lora-13b

license:apache-2.0
0
12

chinese-alpaca-lora-33b

license:apache-2.0
0
10

chinese-llama-lora-33b

license:apache-2.0
0
8

llama-3-chinese-8b-lora

base_model:meta-llama/Meta-Llama-3-8B
0
8

chinese-alpaca-pro-lora-33b

license:apache-2.0
0
7

llama-3-chinese-8b-instruct-v2-lora

base_model:meta-llama/Meta-Llama-3-8B-Instruct
0
5

chinese-alpaca-pro-lora-7b

license:apache-2.0
0
4

chinese-llama-2-lora-13b

license:apache-2.0
0
4

llama-3-chinese-8b-instruct-lora

base_model:hfl/llama-3-chinese-8b
0
3

chinese-alpaca-plus-lora-33b

license:apache-2.0
0
2

chinese-llama-alpaca-2-awq

license:apache-2.0
0
2

chinese-mixtral-lora

license:apache-2.0
0
2

chinese-mixtral-instruct-lora

license:apache-2.0
0
2

chinese-llama-plus-lora-33b

license:apache-2.0
0
1

chinese-alpaca-pro-lora-13b

license:apache-2.0
0
1