RNA BERTa9700

by IlPakoZ
Language Model
Quick Summary

RNA-BERTa is a lightweight BERT-style model for RNA sequences, pretrained following the RoBERTa approach (masked-language-modeling only, with no next-sentence-prediction objective).

Code Examples

How to Get Started with the Model

```python
from transformers import RobertaForMaskedLM, RobertaTokenizerFast, RobertaModel

# Load with the masked-language-modeling (MLM) head
model = RobertaForMaskedLM.from_pretrained("IlPakoZ/RNA-BERTa9700")
tokenizer = RobertaTokenizerFast.from_pretrained("IlPakoZ/RNA-BERTa9700")

# Alternatively, load only the encoder for downstream tasks
encoder = RobertaModel.from_pretrained("IlPakoZ/RNA-BERTa9700")
tokenizer = RobertaTokenizerFast.from_pretrained("IlPakoZ/RNA-BERTa9700")
```
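Building on the loading snippet above, the sketch below shows what each of the two usage patterns produces: the MLM head yields a probability distribution over the vocabulary at a masked position, while the bare encoder yields per-token hidden states that can be pooled into a sequence embedding. So that the sketch runs offline, a tiny randomly initialized `RobertaConfig` stands in for the published checkpoint (in practice you would use `from_pretrained("IlPakoZ/RNA-BERTa9700")` as shown above), and the token ids are arbitrary placeholders rather than real tokenizer output.

```python
import torch
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaModel

# Tiny random config used only so this sketch runs without downloading weights;
# replace with from_pretrained("IlPakoZ/RNA-BERTa9700") for real use.
config = RobertaConfig(
    vocab_size=32, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
    max_position_embeddings=64,
)

# Pattern 1: MLM head predicts the token at a masked position.
model = RobertaForMaskedLM(config)
model.eval()
input_ids = torch.tensor([[5, 7, 9, 3, 11, 2]])  # placeholder token ids
mask_index = 3                                    # position we pretend is masked
with torch.no_grad():
    logits = model(input_ids=input_ids).logits    # (batch, seq_len, vocab_size)
probs = torch.softmax(logits[0, mask_index], dim=-1)  # distribution over vocab
predicted_id = int(probs.argmax())

# Pattern 2: encoder-only embeddings for downstream tasks.
encoder = RobertaModel(config)
encoder.eval()
with torch.no_grad():
    hidden = encoder(input_ids=input_ids).last_hidden_state  # (1, 6, hidden)
embedding = hidden.mean(dim=1)  # simple mean pooling -> (1, hidden)
```

With the real checkpoint, `predicted_id` would map back to a token via `tokenizer.convert_ids_to_tokens`, and the pooled `embedding` could feed a downstream classifier.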
Citation

```bibtex
@article{10.48550/arXiv.2203.15556,
  title={Training compute-optimal large language models},
  author={Hoffmann, Jordan and Borgeaud, Sebastian and Mensch, Arthur and Buchatskaya, Elena and Cai, Trevor and Rutherford, Eliza and Casas, Diego de Las and Hendricks, Lisa Anne and Welbl, Johannes and Clark, Aidan and others},
  journal={arXiv preprint arXiv:2203.15556},
  year={2022}
}

@article{10.1093/nar/gkaa921,
  title={RNAcentral 2021: secondary structure integration, improved sequence search and new member databases},
  journal={Nucleic acids research},
  volume={49},
  number={D1},
  pages={D212--D220},
  year={2021},
  publisher={Oxford University Press}
}

@article{10.1093/nar/gkae979,
  title={Database resources of the National Center for Biotechnology Information in 2025},
  author={Sayers, Eric W and Beck, Jeffrey and Bolton, Evan E and Brister, J Rodney and Chan, Jessica and Connor, Ryan and Feldgarden, Michael and Fine, Anna M and Funk, Kathryn and Hoffman, Jinna and others},
  journal={Nucleic acids research},
  volume={53},
  number={D1},
  pages={D20--D29},
  year={2025},
  publisher={Oxford University Press}
}

@article{10.48550/arXiv.2203.03466,
  title={Tensor programs v: Tuning large neural networks via zero-shot hyperparameter transfer},
  author={Yang, Greg and Hu, Edward J and Babuschkin, Igor and Sidor, Szymon and Liu, Xiaodong and Farhi, David and Ryder, Nick and Pachocki, Jakub and Chen, Weizhu and Gao, Jianfeng},
  journal={arXiv preprint arXiv:2203.03466},
  year={2022}
}

@article{10.48550/arXiv.2112.11446,
  title={Scaling language models: Methods, analysis \& insights from training gopher},
  author={Rae, Jack W and Borgeaud, Sebastian and Cai, Trevor and Millican, Katie and Hoffmann, Jordan and Song, Francis and Aslanides, John and Henderson, Sarah and Ring, Roman and Young, Susannah and others},
  journal={arXiv preprint arXiv:2112.11446},
  year={2021}
}

@article{10.48550/arXiv.2407.17465,
  title={u-$\mu$P: The Unit-Scaled Maximal Update Parametrization},
  author={Blake, Charlie and Eichenberg, Constantin and Dean, Josef and Balles, Lukas and Prince, Luke Y and Deiseroth, Bj{\"o}rn and Cruz-Salinas, Andres Felipe and Luschi, Carlo and Weinbach, Samuel and Orr, Douglas},
  journal={arXiv preprint arXiv:2407.17465},
  year={2024}
}
```