bella-tao-merged-qwen2_5-coder-7b
License: Apache-2.0
by juiceb0xc0de
Language Model · 7B params · Early-stage · 36 downloads
Edge AI: Mobile · Laptop · Server (16GB+ RAM)
Quick Summary
A 7B-parameter coding model merged from Qwen2.5-Coder-7B, configured as "Tao-Bella," a calm, precise coding-mentor persona.
Device Compatibility

- Mobile: 4-6GB RAM
- Laptop: 16GB RAM
- Server: GPU

Minimum recommended: 7GB+ RAM
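The RAM tiers above follow from simple parameter arithmetic: a 7B-parameter model needs about 2 bytes per weight in fp16 (roughly 13 GiB, hence the 16GB laptop tier) and about half a byte per weight with 4-bit quantization (roughly 3.3 GiB, within the 4-6GB mobile tier). A back-of-envelope check, ignoring activation and KV-cache overhead (the exact headroom needed is workload-dependent):

```python
# Rough memory estimate for model weights alone; real usage needs
# extra headroom for activations and the KV cache.
def weight_memory_gib(n_params, bytes_per_param):
    return n_params * bytes_per_param / 1024**3

n = 7_000_000_000  # 7B parameters

fp16 = weight_memory_gib(n, 2.0)   # ~13.0 GiB
int4 = weight_memory_gib(n, 0.5)   # ~3.3 GiB

print(f"fp16: {fp16:.1f} GiB, 4-bit: {int4:.1f} GiB")
```

This is why fp16 inference targets the 16GB laptop/server tiers, while the mobile tier implicitly assumes an aggressively quantized build.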
Code Examples
Usage (Python, transformers)
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "juiceb0xc0de/bella-tao-merged-qwen2_5-coder-7b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
).eval()

SYSTEM_PROMPT = (
    "You are Tao-Bella, a calm and precise coding mentor shaped by Taoist philosophy. "
    "You simplify complexity, find the natural path through problems, and teach through "
    "clarity rather than cleverness."
)

def chat(user_msg, history=None, max_new_tokens=512):
    history = history or []
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        *history,
        {"role": "user", "content": user_msg},
    ]
    # apply_chat_template returns a tensor here (not a dict) unless
    # return_dict=True is passed, so use the result directly
    input_ids = tok.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
        )
    # Decode only the newly generated tokens, not the echoed prompt
    gen_ids = output[:, input_ids.shape[-1]:]
    return tok.batch_decode(gen_ids, skip_special_tokens=True)[0].strip()

# Example
print(chat("How do I optimize this function for better performance?"))

Deploy This Model
Production-ready deployment in minutes

Together.ai: Instant API access to this model. Production-ready inference API. Start free, scale to millions.

Replicate: One-click model deployment. Run models in the cloud with a simple API. No DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.
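Hosted providers like those above typically expose an OpenAI-compatible chat-completions API, so the same system prompt and sampling settings from the local example carry over. A minimal sketch of the request body; the exact endpoint URL and whether this particular model is hosted by any given provider are assumptions to verify:

```python
import json

# Hypothetical request body for an OpenAI-compatible
# /v1/chat/completions endpoint (provider-specific URL and API key
# are required; availability of this model there is an assumption).
payload = {
    "model": "juiceb0xc0de/bella-tao-merged-qwen2_5-coder-7b",
    "messages": [
        {"role": "system", "content": "You are Tao-Bella, a calm and precise coding mentor."},
        {"role": "user", "content": "How do I profile a slow Python function?"},
    ],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
}
body = json.dumps(payload)
# POST `body` with an "Authorization: Bearer <API_KEY>" header to the
# provider's chat-completions URL, e.g. via requests.post(...).
```

Keeping temperature and top_p identical to the local example makes hosted and local outputs directly comparable when evaluating a deployment.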