RNA BERTa9700

by IlPakoZ
Language Model
Quick Summary

RNA-BERTa is a lightweight BERT-style model for RNA sequences, pretrained following the RoBERTa approach (masked-language-modeling only, with no next-sentence-prediction objective).

Code Examples

How to Get Started with the Model

```python
from transformers import RobertaForMaskedLM, RobertaTokenizerFast, RobertaModel

# Load with the masked-language-modeling (MLM) head
model = RobertaForMaskedLM.from_pretrained("IlPakoZ/RNA-BERTa9700")
tokenizer = RobertaTokenizerFast.from_pretrained("IlPakoZ/RNA-BERTa9700")

# Alternatively, load only the encoder for downstream tasks
encoder = RobertaModel.from_pretrained("IlPakoZ/RNA-BERTa9700")
tokenizer = RobertaTokenizerFast.from_pretrained("IlPakoZ/RNA-BERTa9700")
```
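Building on the loading snippet above, the sketch below shows what each of the two usage patterns produces: the MLM head yields a probability distribution over the vocabulary at a masked position, while the bare encoder yields per-token hidden states that can be pooled into a sequence embedding. So that the sketch runs offline, a tiny randomly initialized `RobertaConfig` stands in for the published checkpoint (in practice you would use `from_pretrained("IlPakoZ/RNA-BERTa9700")` as shown above), and the token ids are arbitrary placeholders rather than real tokenizer output.

```python
import torch
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaModel

# Tiny random config used only so this sketch runs without downloading weights;
# replace with from_pretrained("IlPakoZ/RNA-BERTa9700") for real use.
config = RobertaConfig(
    vocab_size=32, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
    max_position_embeddings=64,
)

# Pattern 1: MLM head predicts the token at a masked position.
model = RobertaForMaskedLM(config)
model.eval()
input_ids = torch.tensor([[5, 7, 9, 3, 11, 2]])  # placeholder token ids
mask_index = 3                                    # position we pretend is masked
with torch.no_grad():
    logits = model(input_ids=input_ids).logits    # (batch, seq_len, vocab_size)
probs = torch.softmax(logits[0, mask_index], dim=-1)  # distribution over vocab
predicted_id = int(probs.argmax())

# Pattern 2: encoder-only embeddings for downstream tasks.
encoder = RobertaModel(config)
encoder.eval()
with torch.no_grad():
    hidden = encoder(input_ids=input_ids).last_hidden_state  # (1, 6, hidden)
embedding = hidden.mean(dim=1)  # simple mean pooling -> (1, hidden)
```

With the real checkpoint, `predicted_id` would map back to a token via `tokenizer.convert_ids_to_tokens`, and the pooled `embedding` could feed a downstream classifier.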
Citation

```bibtex
@article{10.48550/arXiv.2203.15556,
  title={Training compute-optimal large language models},
  author={Hoffmann, Jordan and Borgeaud, Sebastian and Mensch, Arthur and Buchatskaya, Elena and Cai, Trevor and Rutherford, Eliza and Casas, Diego de Las and Hendricks, Lisa Anne and Welbl, Johannes and Clark, Aidan and others},
  journal={arXiv preprint arXiv:2203.15556},
  year={2022}
}

@article{10.1093/nar/gkaa921,
  title={RNAcentral 2021: secondary structure integration, improved sequence search and new member databases},
  journal={Nucleic acids research},
  volume={49},
  number={D1},
  pages={D212--D220},
  year={2021},
  publisher={Oxford University Press}
}

@article{10.1093/nar/gkae979,
  title={Database resources of the National Center for Biotechnology Information in 2025},
  author={Sayers, Eric W and Beck, Jeffrey and Bolton, Evan E and Brister, J Rodney and Chan, Jessica and Connor, Ryan and Feldgarden, Michael and Fine, Anna M and Funk, Kathryn and Hoffman, Jinna and others},
  journal={Nucleic acids research},
  volume={53},
  number={D1},
  pages={D20--D29},
  year={2025},
  publisher={Oxford University Press}
}

@article{10.48550/arXiv.2203.03466,
  title={Tensor programs v: Tuning large neural networks via zero-shot hyperparameter transfer},
  author={Yang, Greg and Hu, Edward J and Babuschkin, Igor and Sidor, Szymon and Liu, Xiaodong and Farhi, David and Ryder, Nick and Pachocki, Jakub and Chen, Weizhu and Gao, Jianfeng},
  journal={arXiv preprint arXiv:2203.03466},
  year={2022}
}

@article{10.48550/arXiv.2112.11446,
  title={Scaling language models: Methods, analysis \& insights from training gopher},
  author={Rae, Jack W and Borgeaud, Sebastian and Cai, Trevor and Millican, Katie and Hoffmann, Jordan and Song, Francis and Aslanides, John and Henderson, Sarah and Ring, Roman and Young, Susannah and others},
  journal={arXiv preprint arXiv:2112.11446},
  year={2021}
}

@article{10.48550/arXiv.2407.17465,
  title={u-$\mu$P: The Unit-Scaled Maximal Update Parametrization},
  author={Blake, Charlie and Eichenberg, Constantin and Dean, Josef and Balles, Lukas and Prince, Luke Y and Deiseroth, Bj{\"o}rn and Cruz-Salinas, Andres Felipe and Luschi, Carlo and Weinbach, Samuel and Orr, Douglas},
  journal={arXiv preprint arXiv:2407.17465},
  year={2024}
}
```