Axion1-350k-A250k
License: MIT
by AxionLab-official
Language Model
0 downloads
Early-stage
Edge AI: Mobile · Laptop · Server
Quick Summary
An experimental ~344k-parameter language model that pairs Multi-head Latent Attention (MLA) with a small mixture-of-experts feed-forward network, using a 1,024-entry BPE vocabulary trained on GSM8K. Roughly 160k parameters are active per token.
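The "MLA" attention listed in the model details below compresses keys and values through a small low-rank latent (here `kv_lora_rank: 8`) before expanding them per head, so only the latent has to be cached per token. A minimal sketch of that KV-compression idea, with dimensions taken from the model details (the module names are illustrative, not the model's actual layers):

```python
import torch
import torch.nn as nn

# Dimensions from the model card
d_model, n_heads, d_head, kv_lora_rank = 64, 4, 16, 8

# Down-project hidden states into a small latent, then up-project
# to per-head K and V. Only the latent (8 floats per token here)
# would need to live in the KV cache.
kv_down = nn.Linear(d_model, kv_lora_rank, bias=False)
k_up = nn.Linear(kv_lora_rank, n_heads * d_head, bias=False)
v_up = nn.Linear(kv_lora_rank, n_heads * d_head, bias=False)

x = torch.randn(1, 10, d_model)                  # (batch, seq, d_model)
latent = kv_down(x)                              # (1, 10, 8) <- cached
k = k_up(latent).view(1, 10, n_heads, d_head)    # (1, 10, 4, 16)
v = v_up(latent).view(1, 10, n_heads, d_head)
print(latent.shape, k.shape)
```

The payoff is cache size: a full KV cache would store `2 * n_heads * d_head = 128` floats per token per layer, while the shared latent is 8.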
Code Examples
Model Details

```text
d_model : 64
n_layers : 4
n_heads : 4 (MLA)
d_head : 16
kv_lora_rank : 8 (MLA KV compression)
q_lora_rank : 16 (MLA Q compression)
n_shared_experts : 1
n_routed_experts : 4 (top-2 activated)
d_ff : 64 (per expert)
vocab_size : 1024 (BPE, trained on GSM8K)
max_seq_len : 512
total_params : 343,616
active_params/tok : ~160,000
```

Training Curve

(training-curve figure not reproduced here)
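The details list one shared expert plus four routed experts with top-2 routing: every token runs through the shared expert and only the two routed experts its router selects, which is why active parameters (~160k) are fewer than total parameters (~344k). A minimal sketch of top-2 routing under those dimensions (illustrative only, not the model's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, d_ff, n_routed, top_k = 64, 64, 4, 2

def make_expert():
    # One small FFN expert, matching d_ff=64 from the model card
    return nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

shared = make_expert()
routed = nn.ModuleList(make_expert() for _ in range(n_routed))
router = nn.Linear(d_model, n_routed, bias=False)

def moe_forward(x):  # x: (tokens, d_model)
    gate = F.softmax(router(x), dim=-1)              # routing probabilities
    weight, idx = gate.topk(top_k, dim=-1)           # top-2 experts per token
    weight = weight / weight.sum(-1, keepdim=True)   # renormalize over selected
    out = shared(x)                                  # shared expert: always active
    for slot in range(top_k):
        for e in range(n_routed):
            mask = idx[:, slot] == e                 # tokens routed to expert e
            if mask.any():
                out[mask] = out[mask] + weight[mask, slot].unsqueeze(1) * routed[e](x[mask])
    return out

y = moe_forward(torch.randn(5, d_model))
print(y.shape)  # torch.Size([5, 64])
```

Only the selected experts' FFNs execute per token; the other routed experts contribute parameters to the total count but no compute.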
```python
import torch
from transformers import AutoModelForCausalLM, LogitsProcessor, LogitsProcessorList

from tokenizer import BPETokenizer  # shipped with the model repo

model = AutoModelForCausalLM.from_pretrained(
    "AxionLab-official/Axion1-350k-A250k",
    trust_remote_code=True,
)
model.eval()

tok = BPETokenizer.load("model.vocab", "model.model")

# Block EOS and PAD for the first min_tokens generated tokens
class MinNewTokens(LogitsProcessor):
    def __init__(self, min_tokens: int, eos_id: int, pad_id: int):
        self.min_tokens = min_tokens
        self.bad = [eos_id, pad_id]
        self.generated = 0

    def __call__(self, input_ids, scores):
        if self.generated < self.min_tokens:
            for bid in self.bad:
                scores[:, bid] = float("-inf")
        self.generated += 1
        return scores

eos_id = tok.token2id["<eos>"]
pad_id = tok.token2id["<pad>"]

# Portuguese prompt template: "Question: What is 5 + 3? -- Answer:"
prompt = "# Pergunta:\nQuanto é 5 + 3?\n--\n# Resposta:\n"
ids = tok.encode(prompt, add_bos=True, add_eos=False)
input_ids = torch.tensor([ids])

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=80,
        temperature=0.9,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        eos_token_id=eos_id,
        pad_token_id=pad_id,
        use_cache=False,
        logits_processor=LogitsProcessorList([
            MinNewTokens(min_tokens=5, eos_id=eos_id, pad_id=pad_id)
        ]),
    )

new_tokens = output[0][len(ids):].tolist()

# Strip the trailing EOS token if present
if new_tokens and new_tokens[-1] == eos_id:
    new_tokens = new_tokens[:-1]

print("Answer:", tok.decode(new_tokens))
```