LongCat-Flash-Thinking-2601
by meituan-longcat · Language Model · license: MIT · 557 downloads
Quick Summary
A reasoning-focused ("thinking") language model from meituan-longcat. It emits an explicit reasoning trace before its final answer and supports function/tool calling through the Hugging Face `transformers` chat-template API.
Code Examples
Basic Usage (python)
# Assumes `tokenizer`, `messages`, and `tools` are already defined as in the full example below
text = tokenizer.apply_chat_template(
messages,
tools=tools,
tokenize=False,
enable_thinking=True,
add_generation_prompt=True,
save_history_reasoning_content=False
)

Implementation Examples

1. transformers (python)
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "meituan-longcat/LongCat-Flash-Thinking-2601"
# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": r"Please tell me what is $$1 + 1$$ and $$2 \times 2$$?"},  # raw string: avoids "\t" escape in "\times"
{"role": "assistant", "reasoning_content": r"This question is straightforward: $$1 + 1 = 2$$ and $$2 \times 2 = 4$$.", "content": "The answers are 2 and 4."},
{"role": "user", "content": "Check again?"}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
enable_thinking=True,
add_generation_prompt=True,
save_history_reasoning_content=False # Discard reasoning history to save tokens
)
# Template Output Structure:
# <longcat_system>You are a helpful assistant.<longcat_user>Please tell me what is $$1 + 1$$ and $$2 \times 2$$?<longcat_assistant>The answers are 2 and 4</longcat_s><longcat_user>Check again? /think_on <longcat_assistant><longcat_think>\n
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Generate response
generated_ids = model.generate(
**model_inputs,
max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
print(tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n"))
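The decoded string above contains the reasoning trace followed by the final answer (see the example output that follows). A minimal helper for splitting the two — an illustrative sketch based on that output format, not an official API of this model or of `transformers`:

```python
def split_reasoning(decoded: str, close_tag: str = "</longcat_think>"):
    """Split a decoded completion into (reasoning, final_answer).

    Assumes the reasoning trace ends at `close_tag`, as in the example
    output shown below; if the tag is absent, the whole string is
    treated as the answer.
    """
    if close_tag in decoded:
        reasoning, answer = decoded.split(close_tag, 1)
        return reasoning.strip(), answer.strip()
    return "", decoded.strip()

demo = "Basic arithmetic checks out.\n</longcat_think>\nThe answer remains 2 and 4."
reasoning, answer = split_reasoning(demo)
# reasoning == "Basic arithmetic checks out."
# answer == "The answer remains 2 and 4."
```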
# Example Output:
# The user wants a double-check. Since $$1 + 1 = 2$$ and $$2 \times 2 = 4$$ are basic arithmetic truths, the previous answer is correct.\n</longcat_think>\nI have verified the calculations: $$1 + 1 = 2$$ and $$2 \times 2 = 4$$. The initial answer remains correct.</longcat_s>

2. Tool Calling (python)
tools = [
{
"type": "function",
"function": {
"name": "func_add",
"description": "Calculate the sum of two numbers",
"parameters": {
"type": "object",
"properties": {
"x1": {"type": "number", "description": "The first addend"},
"x2": {"type": "number", "description": "The second addend"}
},
"required": ["x1", "x2"]
}
}
}
]
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Please tell me what is $$125679 + 234519$$?"},
{
"role": "assistant",
"reasoning_content": "This calculation requires precision; I will use the func_add tool.",
"tool_calls": [{"type": "function", "function": {"name": "func_add", "arguments": {"x1": 125679, "x2": 234519}}}]
},
{"role": "tool", "name": "func_add", "content": '{"ans": 360198}'}
]
text = tokenizer.apply_chat_template(
messages,
tools=tools,
tokenize=False,
enable_thinking=True,
add_generation_prompt=True,
save_history_reasoning_content=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Generate response based on tool result
generated_ids = model.generate(
**model_inputs,
max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
print(tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n"))
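The example above hard-codes the tool result (`{"ans": 360198}`) into `messages`. In a real loop, the tool call emitted by the model has to be executed locally and its result appended as a `tool` message. A minimal dispatch sketch — `TOOL_REGISTRY` and `run_tool_call` are illustrative names, and it assumes the parsed tool call carries its arguments as a dict, matching the `tool_calls` entry in the `messages` above:

```python
import json

# Local implementation of the tool advertised in the `tools` schema above.
def func_add(x1, x2):
    return {"ans": x1 + x2}

TOOL_REGISTRY = {"func_add": func_add}

def run_tool_call(tool_call):
    """Execute one parsed tool call and return the `tool` message to append."""
    fn = tool_call["function"]
    result = TOOL_REGISTRY[fn["name"]](**fn["arguments"])
    return {"role": "tool", "name": fn["name"], "content": json.dumps(result)}

msg = run_tool_call(
    {"type": "function",
     "function": {"name": "func_add", "arguments": {"x1": 125679, "x2": 234519}}}
)
# msg == {"role": "tool", "name": "func_add", "content": '{"ans": 360198}'}
```

The returned message matches the hand-written `tool` entry in the example, so it can be appended to `messages` before the next `apply_chat_template` call.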