T5 Small

Downloads
Hugging Face
2.7M
502
Context
512 tokens (small)
License
Apache 2.0
Updated
11/3/2025 by google-t5

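T5 Small is the smallest checkpoint in the T5 (Text-to-Text Transfer Transformer) family, an encoder-decoder model with about 60 million parameters. It frames every task as text in, text out, so one checkpoint handles both summarization and translation from English to French, Romanian, and German. The model is pre-trained on the C4 dataset, which is English-only, so its multilingual coverage comes from those translation tasks rather than from the pre-training corpus. It is licensed under Apache 2.0.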
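T5 picks a task from a plain-text prefix on the input. The sketch below shows one way to build those prefixes and run the checkpoint with the Hugging Face `transformers` library. It assumes `transformers`, `torch`, and `sentencepiece` are installed and that the checkpoint is reachable under the repo id `google-t5/t5-small`. The helper names `build_prompt` and `run_t5_small` are hypothetical and not part of any library.

```python
from typing import Optional

# Target languages listed for this checkpoint (the source language is English).
SUPPORTED_TARGETS = ("German", "French", "Romanian")


def build_prompt(text: str, task: str = "summarize", target: Optional[str] = None) -> str:
    """Prepend the task prefix that T5 uses to pick a task (hypothetical helper)."""
    if task == "summarize":
        return f"summarize: {text}"
    if task == "translate":
        if target not in SUPPORTED_TARGETS:
            raise ValueError(f"target must be one of {SUPPORTED_TARGETS}, got {target!r}")
        return f"translate English to {target}: {text}"
    raise ValueError(f"unknown task: {task!r}")


def run_t5_small(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate text for one prompt (hypothetical helper; downloads the model on first use)."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    repo_id = "google-t5/t5-small"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

    # Truncate to the 512-token context listed for this model.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (requires network access to fetch the checkpoint):
# print(run_t5_small(build_prompt("The house is wonderful.", "translate", "German")))
```

Loading the tokenizer and model inside the function keeps the sketch short. A real application would load them once and reuse them across calls.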

Language Model
OTHER

Quick Info

Released
3/2/2022
Framework
OTHER

Training Data Analysis

🔵 Good (6.0/10)

Training datasets identified for T5 Small, each with a quality assessment

Specialized For

general
multilingual

Training Datasets (1)

c4
🔵 6/10
general
multilingual
Key Strengths
  • Scale and Accessibility: 750GB of publicly available, filtered text
  • Systematic Filtering: Documented heuristics enable reproducibility
  • Stylistic Diversity: Although English-only, the corpus captures a wide range of writing styles
Considerations
  • English-Only: Limits multilingual applications
  • Filtering Limitations: Offensive content and low-quality text remain despite filtering
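To make the filtering heuristics mentioned above more concrete, here is a simplified sketch of a few C4-style line and page rules. It is an illustration only, not the actual C4 pipeline: the function name `clean_page` is hypothetical, the thresholds are configurable assumptions, and steps such as language detection, blocklist matching, and deduplication are left out.

```python
import re
from typing import Optional

# Lines must end with one of these marks to be kept.
TERMINAL_PUNCTUATION = (".", "!", "?", '"')


def clean_page(text: str, min_words_per_line: int = 5, min_sentences: int = 3) -> Optional[str]:
    """Return the cleaned page, or None if the whole page is rejected.

    Simplified, hypothetical sketch of C4-style heuristics; the default
    thresholds are assumptions and can be changed by the caller.
    """
    # Page-level rejects: placeholder text and pages that look like code.
    if "lorem ipsum" in text.lower() or "{" in text:
        return None

    kept = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.endswith(TERMINAL_PUNCTUATION):
            continue  # menus, titles, and other fragments
        if len(line.split()) < min_words_per_line:
            continue  # too short to be useful prose
        if "javascript" in line.lower():
            continue  # "enable Javascript" style warnings
        kept.append(line)

    cleaned = "\n".join(kept)
    # Rough sentence count: runs of sentence-ending punctuation.
    if len(re.findall(r"[.!?]+", cleaned)) < min_sentences:
        return None
    return cleaned
```

Rules like these are cheap and reproducible, but they work on surface features only, which is consistent with the consideration above that some offensive or low-quality text survives filtering.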
