# unicamp-dl/ptt5-base-portuguese-vocab
## Introduction

PTT5 is a T5 model pretrained on the BrWaC corpus, a large collection of web pages in Portuguese, which improves T5's performance on Portuguese sentence similarity and entailment tasks. It is available in three sizes (small, base and large) and with two vocabularies (Google's original T5 vocabulary and ours, trained on Portuguese Wikipedia). For further information or requests, please visit the PTT5 repository.

## Available models

| Model | Size | #Params | Vocabulary |
| :-: | :-: | :-: | :-: |
| unicamp-dl/ptt5-small-t5-vocab | small | 60M | Google's T5 |
| unicamp-dl/ptt5-base-t5-vocab | base | 220M | Google's T5 |
| unicamp-dl/ptt5-large-t5-vocab | large | 740M | Google's T5 |
| unicamp-dl/ptt5-small-portuguese-vocab | small | 60M | Portuguese |
| unicamp-dl/ptt5-base-portuguese-vocab (Recommended) | base | 220M | Portuguese |
| unicamp-dl/ptt5-large-portuguese-vocab | large | 740M | Portuguese |

## Citation

```bibtex
@article{ptt52020,
  title={PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data},
  author={Carmo, Diedre and Piau, Marcos and Campiotti, Israel and Nogueira, Rodrigo and Lotufo, Roberto},
  journal={arXiv preprint arXiv:2008.09144},
  year={2020}
}
```
# unicamp-dl/mMiniLM-L6-v2-en-pt-msmarco-v2
## Introduction

mMiniLM-L6-v2-en-pt-msmarco-v2 is a multilingual MiniLM-based reranker finetuned on a bilingual version of the MS MARCO passage dataset. This bilingual version combines the original MS MARCO dataset (in English) with a Portuguese translation of it. In the v2 version, the Portuguese portion was translated using Google Translate. Further information about the dataset or the translation method can be found in our mMARCO: A Multilingual...
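A reranker like this typically reads a query and a candidate passage together and emits a relevance score. The sketch below assumes the checkpoint exposes the usual sequence-classification head used by MS MARCO cross-encoders; it is an illustration, not an official snippet from this card.

```python
# Sketch: scoring one query-passage pair with the reranker.
# Assumption: the checkpoint loads via AutoModelForSequenceClassification,
# as is customary for MS MARCO cross-encoder rerankers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "unicamp-dl/mMiniLM-L6-v2-en-pt-msmarco-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "qual a capital do Brasil"
passage = "Brasília é a capital federal do Brasil desde 1960."

# The cross-encoder sees both texts in one input; its logits act as a
# relevance score, so ranking passages means sorting by this output.
inputs = tokenizer(query, passage, return_tensors="pt", truncation=True)
with torch.no_grad():
    scores = model(**inputs).logits
print(scores.shape)  # one row of logits for the single pair
```

In practice each candidate passage retrieved by a first-stage system is scored this way, and the passages are re-sorted by score.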