prajjwal1

33 models

bert-tiny

This is a PyTorch pre-trained model obtained by converting the TensorFlow checkpoint found in the official Google BERT repository. It is the smallest of the compact pre-trained BERT variants, alongside bert-mini, bert-small, and bert-medium. These models were introduced in the paper `Well-Read Students Learn Better: On the Importance of Pre-training Compact Models` (arXiv) and ported to Hugging Face for the study `Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics` (arXiv). They are intended to be fine-tuned on a downstream task. If you use the model, please consider citing both papers.

Config of this model:
- `prajjwal1/bert-tiny` (L=2, H=128)

Other models to check out:
- `prajjwal1/bert-mini` (L=4, H=256)
- `prajjwal1/bert-small` (L=4, H=512)
- `prajjwal1/bert-medium` (L=8, H=512)

The original implementation and more information can be found in the GitHub repository.

Downloads: 12,868,604 • Likes: 132
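To give a sense of scale for the (L, H) configurations above, the sketch below estimates each variant's total parameter count from the standard BERT architecture formula. The auxiliary sizes are assumptions not stated in the card itself: a 30,522-token WordPiece vocabulary, 512 max positions, 2 token types, feed-forward size 4×H, and a pooler layer, as in the original BERT.

```python
# Rough parameter-count estimates for the compact BERT variants.
# Assumed (not from the card): 30,522-token vocab, 512 positions,
# 2 token types, feed-forward size 4*H, pooler layer.

def bert_param_count(num_layers: int, hidden: int,
                     vocab: int = 30522, max_pos: int = 512) -> int:
    """Estimate total parameters of a BERT encoder with L layers, width H."""
    ffn = 4 * hidden  # intermediate (feed-forward) size, 4*H in standard BERT
    # Embeddings: word + position + token-type tables, plus LayerNorm (2*H)
    embeddings = (vocab + max_pos + 2) * hidden + 2 * hidden
    # Self-attention: Q, K, V, and output projections (weights + biases)
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward: H -> 4H -> H (weights + biases)
    feed_forward = hidden * ffn + ffn + ffn * hidden + hidden
    # Two LayerNorms per encoder layer (after attention, after feed-forward)
    per_layer = attention + feed_forward + 2 * (2 * hidden)
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * per_layer + pooler

variants = {
    "bert-tiny":   (2, 128),
    "bert-mini":   (4, 256),
    "bert-small":  (4, 512),
    "bert-medium": (8, 512),
}
for name, (L, H) in variants.items():
    print(f"{name}: ~{bert_param_count(L, H) / 1e6:.1f}M parameters")
```

For bert-tiny this yields roughly 4.4M parameters, dominated by the embedding table, which is why the compact variants shrink less than their (L, H) ratios alone would suggest.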

| Model | License | Downloads | Likes |
|---|---|---|---|
| bert-mini | mit | 88,405 | 23 |
| bert-small | mit | 26,336 | 26 |
| bert-medium | mit | 5,134 | 5 |
| bert-medium-mnli | | 469 | 1 |
| bert-tiny-mnli | | 197 | 4 |
| ctrl_discovery_6 | | 7 | 0 |
| ctrl_discovery_flipped_4 | | 6 | 0 |
| bert-mini-mnli | | 5 | 0 |
| ctrl_discovery_10 | | 5 | 0 |
| ctrl_discovery_11 | | 5 | 0 |
| ctrl_discovery_flipped_3 | | 5 | 0 |
| albert-base-v2-mnli | | 4 | 0 |
| bert-small-mnli | | 4 | 0 |
| ctrl_discovery_8 | | 4 | 0 |
| ctrl_discovery_flipped_2 | | 4 | 0 |
| ctrl_discovery_flipped_5 | | 4 | 0 |
| ctrl_discovery_1 | | 3 | 0 |
| ctrl_discovery_7 | | 3 | 0 |
| roberta-large-mnli | | 3 | 0 |
| roberta-base-mnli | | 2 | 1 |
| albert-base-v1-mnli | | 2 | 0 |
| ctrl_discovery_3 | | 2 | 0 |
| ctrl_discovery_9 | | 2 | 0 |
| albert_new | | 1 | 0 |
| ctrl_discovery_12 | | 1 | 0 |
| ctrl_discovery_14 | | 1 | 0 |
| ctrl_discovery_4 | | 1 | 0 |
| ctrl_discovery_flipped_1 | | 1 | 0 |
| ctrl_discovery_flipped_6 | | 1 | 0 |
| gpt2_xl_discovery | | 1 | 0 |
| roberta_new | | 1 | 0 |
| roberta_hellaswag | | 0 | 1 |