Aurora-Spec-Minimax-M2.1
by togethercomputer
Language Model · llama architecture · 108 downloads
Edge AI: Mobile, Laptop, Server (early-stage)
Quick Summary
An EAGLE3 speculative-decoding draft model for MiniMax M2.1, used to accelerate inference of the larger target model.
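To make the draft/target relationship concrete, here is a toy sketch of the general speculative-decoding loop (illustrative only, not the EAGLE3 algorithm itself, and the two table-lookup "models" are hypothetical stand-ins): a cheap draft model proposes several tokens ahead, and the expensive target model verifies them, keeping the longest prefix on which both agree.

```python
def draft_model(prefix):
    # Hypothetical cheap model: proposes the next token from a fixed table.
    table = {"the": "quick", "quick": "brown", "brown": "fox", "fox": "jumps"}
    return table.get(prefix[-1], "<unk>")

def target_model(prefix):
    # Hypothetical expensive model: the ground truth the draft must match.
    table = {"the": "quick", "quick": "brown", "brown": "red", "red": "fox"}
    return table.get(prefix[-1], "<unk>")

def speculative_step(prefix, num_steps=4):
    """Draft num_steps tokens, then keep the longest target-verified prefix.

    Returns (accepted_tokens, correction_token). In a real system the
    target model scores all draft positions in a single forward pass,
    which is where the speedup comes from.
    """
    # 1. Draft phase: propose num_steps tokens autoregressively.
    drafted = []
    ctx = list(prefix)
    for _ in range(num_steps):
        tok = draft_model(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2. Verify phase: accept drafted tokens while the target agrees.
    accepted = []
    ctx = list(prefix)
    for tok in drafted:
        expected = target_model(ctx)
        if tok != expected:
            # First disagreement: discard the rest, emit the target's token.
            return accepted, expected
        accepted.append(tok)
        ctx.append(tok)

    # All drafted tokens accepted; target supplies one bonus token.
    return accepted, target_model(ctx)

accepted, correction = speculative_step(["the"], num_steps=4)
print(accepted, correction)  # → ['quick', 'brown'] red
```

Output is always a valid target-model continuation: accepted draft tokens cost one verification pass instead of one target pass each, which is why a lookahead such as `speculative_num_steps=4` in the example below can speed up decoding without changing results.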
Code Examples
Usage (Python)
import sglang as sgl

def main():
    # Sample prompts
    prompts = [
        "Explain the concept of quantum computing:",
        "Write a short story about a time traveler:",
        "Describe the process of photosynthesis:",
    ]

    # Create sampling params
    sampling_params = {"temperature": 0.7, "max_new_tokens": 256}

    # Initialize engine with speculative decoding (lookahead 4 - recommended)
    llm = sgl.Engine(
        model_path="MiniMax/M2.1",
        speculative_draft_model_path="togethercomputer/Aurora-Spec-Minimax-M2.1",
        speculative_algorithm="EAGLE3",
        speculative_num_steps=4,  # Recommended: lookahead 4
        speculative_eagle_topk=1,
        speculative_num_draft_tokens=6,
        dtype="bfloat16",
        trust_remote_code=True,
    )

    # Generate with speculative decoding
    outputs = llm.generate(prompts, sampling_params)

    # Print the outputs
    for prompt, output in zip(prompts, outputs):
        print("=" * 50)
        print(f"Prompt: {prompt}")
        print(f"Generated: {output['text']}")

# The __main__ guard is necessary when using spawn to create subprocesses
if __name__ == "__main__":
    main()

Deploy This Model
Production-ready deployment in minutes
Together.ai
Instant API access to this model
Production-ready inference API. Start free, scale to millions.
Try Free API

Replicate
One-click model deployment
Run models in the cloud with a simple API. No DevOps required.
Deploy Now

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.