gpt-2-70m
by codelion
Language Model · 70M params · 1B training tokens · 1 language · license: apache-2.0
932 downloads · New · Early-stage
Edge AI: Mobile · Laptop · Server (3GB+ RAM)
Quick Summary
A 70M-parameter GPT-2 model pre-trained on 1 billion tokens using an optimized 50-30-20 dataset-mixing strategy.
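The 50-30-20 split refers to the sampling proportions of the three components of the pre-training corpus (the components themselves are described in the linked blog post, not here). A minimal sketch of how such a mix can be assembled with the Hugging Face datasets library, using hypothetical placeholder corpora:

from datasets import load_dataset, interleave_datasets

# Placeholder corpora -- NOT the actual sources used for gpt-2-70m.
web = load_dataset("example/web-corpus", split="train", streaming=True)
code = load_dataset("example/code-corpus", split="train", streaming=True)
books = load_dataset("example/books-corpus", split="train", streaming=True)

# Interleave with 50-30-20 sampling probabilities to approximate the mix.
mixed = interleave_datasets(
    [web, code, books],
    probabilities=[0.5, 0.3, 0.2],
    seed=42,
)

# Inspect a few mixed examples.
for example in mixed.take(3):
    print(example)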
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 1GB+ RAM
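Toward the lower end of these requirements, the model can be loaded in half precision to roughly halve its memory footprint. A brief sketch; the dtype choice here is an assumption, not something the model card prescribes:

import torch
from transformers import AutoModelForCausalLM

# Load weights in float16 to reduce RAM/VRAM use on constrained devices.
model = AutoModelForCausalLM.from_pretrained(
    "codelion/gpt-2-70m",
    torch_dtype=torch.float16,
)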
Code Examples
Usage (Python, transformers)
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("codelion/gpt-2-70m")
model = AutoModelForCausalLM.from_pretrained("codelion/gpt-2-70m")

# Generate text with better sampling parameters
inputs = tokenizer("The future of AI is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=50,
    do_sample=True,      # Enable sampling
    temperature=0.8,     # Control randomness
    top_p=0.9,           # Nucleus sampling
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0]))
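As a lighter-weight alternative, the same generation can be run through the transformers pipeline API, which wraps tokenizer and model loading in one call; a short sketch using the same sampling parameters:

from transformers import pipeline

# The text-generation pipeline handles tokenization and decoding internally.
generator = pipeline("text-generation", model="codelion/gpt-2-70m")

result = generator(
    "The future of AI is",
    max_length=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)
print(result[0]["generated_text"])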
Citation (BibTeX)
@article{sharma2025billion,
  title={The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix},
  author={Sharma, Asankhaya},
  year={2025},
  url={https://huggingface.co/blog/codelion/optimal-dataset-mixing/}
}