pythia-14m-deduped
62.1K
28
license:apache-2.0
by
EleutherAI
Language Model
OTHER
Fair
62K downloads
Community-tested
Edge AI:
Mobile
Laptop
Server
Unknown
Mobile
Laptop
Server
Quick Summary
AI model with specialized capabilities.
Training Data Analysis
🟢 Excellent (8.0/10)
Researched training datasets used by pythia-14m-deduped with quality assessment
Specialized For
code
general
science
multilingual
Training Datasets (1)
the pile
🟢 8/10
code
general
science
multilingual
Key Strengths
- •Deliberate Diversity: Explicitly curated to include diverse content types (academia, code, Q&A, book...
- •Documented Quality: Each component dataset is thoroughly documented with rationale for inclusion, en...
- •Epoch Weighting: Component datasets receive different training epochs based on perceived quality, al...
Explore our comprehensive training dataset analysis
View All DatasetsCode Examples
Quickstartpythontransformers
from transformers import GPTNeoXForCausalLM, AutoTokenizer
model = GPTNeoXForCausalLM.from_pretrained(
"EleutherAI/pythia-14m-deduped",
revision="step3000",
cache_dir="./pythia-14m-deduped/step3000",
)
tokenizer = AutoTokenizer.from_pretrained(
"EleutherAI/pythia-14m-deduped",
revision="step3000",
cache_dir="./pythia-14m-deduped/step3000",
)
inputs = tokenizer("Hello, I am", return_tensors="pt")
tokens = model.generate(**inputs)
tokenizer.decode(tokens[0])Deploy This Model
Production-ready deployment in minutes
Together.ai
Instant API access to this model
Production-ready inference API. Start free, scale to millions.
Try Free APIReplicate
One-click model deployment
Run models in the cloud with simple API. No DevOps required.
Deploy NowDisclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.