# prajjwal1/bert-tiny
The following model is a PyTorch pre-trained model obtained by converting the TensorFlow checkpoint found in the official Google BERT repository. It is one of the smaller pre-trained BERT variants, together with `bert-mini`, `bert-small`, and `bert-medium`. These models were introduced in the study `Well-Read Students Learn Better: On the Importance of Pre-training Compact Models` (arXiv) and ported to HF for the study `Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics` (arXiv). They are intended to be fine-tuned on a downstream task. If you use this model, please consider citing both papers.

Config of this model:

- `prajjwal1/bert-tiny` (L=2, H=128)

Other models to check out:

- `prajjwal1/bert-mini` (L=4, H=256)
- `prajjwal1/bert-small` (L=4, H=512)
- `prajjwal1/bert-medium` (L=8, H=512)

The original implementation and more info can be found in the GitHub repository.
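As a quick sanity check, the model can be loaded with the `transformers` library like any other BERT checkpoint; a minimal sketch (assuming `transformers` and `torch` are installed) that verifies the H=128 hidden size:

```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the compact BERT model from the Hub
tokenizer = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")
model = AutoModel.from_pretrained("prajjwal1/bert-tiny")

# Encode a sample sentence and run a forward pass
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

# bert-tiny has L=2 transformer layers and H=128 hidden size,
# so the last hidden state is (batch, seq_len, 128)
print(outputs.last_hidden_state.shape)
```

Since the checkpoint is not fine-tuned, the raw hidden states are mainly useful as features or as a starting point for downstream training (e.g. with `AutoModelForSequenceClassification`).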