# FunReason-MT

**FunReason-MT Technical Report: Advanced Data Synthesis Solution for Real-world Multi-Turn Tool-use**

[Paper (arXiv)](https://arxiv.org/abs/2510.24645) · [Paper (Hugging Face)](https://huggingface.co/papers/2510.24645) · [Model](https://huggingface.co/Bingguang/FunReason-MT) · [Dataset](https://huggingface.co/datasets/Bingguang/FunReason-MT) · [AWorld-RL](https://github.com/inclusionAI/AWorld-RL) · [AWorld](https://github.com/inclusionAI/AWorld)

FunReason-MT-4B is a high-performance large language model (LLM) fine-tuned for complex, multi-turn function calling (FC) and agentic tool-use tasks. Built on the Qwen3-4B-Instruct-2507 base model, it was trained with the novel FunReason-MT data synthesis framework. FunReason-MT-4B achieves superior results on the Berkeley Function-Calling Leaderboard (BFCLv3) Multi-Turn and Agentic Evaluation benchmarks, demonstrating that high-quality synthesized data can effectively overcome the complexity barrier in multi-turn FC data generation.

- **Base Model:** Qwen3-4B-Instruct-2507
- **Size:** 4 billion parameters
- **Key Capability:** Advanced multi-turn function calling and agentic tool use

## Evaluation

The model was rigorously evaluated on the Berkeley Function-Calling Leaderboard (BFCL).

| Model (4B–235B)              | Multi-Turn (Overall) | Single-Turn (Overall) |
| :--------------------------- | :------------------: | :-------------------: |
| Qwen3-4B-Instruct (Base)     | 15.75                | 78.19                 |
| Qwen3-4B + FunReason-MT (RL) | 57.75                | 85.47                 |
| Claude-Sonnet-4-20250514     | 54.75                | 84.72                 |
| DeepSeek-R1-0528             | 44.50                | 78.22                 |
| GPT-4o-2024-11-20            | 42.50                | 77.21                 |

The FunReason-MT-trained model also leads in out-of-distribution agentic tasks (Web Search and Memory).
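To make the multi-turn FC setting concrete, the sketch below builds a minimal two-turn tool-use trace in the common OpenAI-style message format. The `get_weather` tool and the conversation content are illustrative assumptions, not taken from the report or the BFCL suite:

```python
import json

# Hypothetical tool schema in the common OpenAI-style function format.
get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# A two-turn trace: the model must carry state across turns.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "tool_calls": [
        {"function": {"name": "get_weather",
                      "arguments": json.dumps({"city": "Paris"})}}]},
    {"role": "tool", "content": json.dumps({"temp_c": 18, "sky": "clear"})},
    {"role": "assistant", "content": "It is 18 °C and clear in Paris."},
    # Resolving "there" requires multi-turn state tracking, which is
    # exactly what the BFCL Multi-Turn split stresses.
    {"role": "user", "content": "And will I need an umbrella there?"},
]

first_call = messages[1]["tool_calls"][0]["function"]
print(first_call["name"], json.loads(first_call["arguments"])["city"])
# -> get_weather Paris
```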
| Model                | BFCLv4 Overall Score |
| :------------------- | :------------------: |
| FunReason-MT-4B (RL) | 15.10                |
| ToolACE-2-8B         | 14.83                |
| BitAgent-8B          | 8.24                 |
| XLAM-2-3b-fc-r       | 7.42                 |
| watt-tool-8B         | 6.30                 |

## Training Data

The training set comprises 16,000 high-quality multi-turn samples, generated with the three-phase FunReason-MT data synthesis framework. The framework produces complex trajectories through:

1. **Environment–API Graph Interactions** for collecting goal-directed, correct execution traces.
2. **Advanced Tool-Query Synthesis** for creating logical-jump queries that abstract multi-step actions.
3. **Guided Iterative Chain** for enforcing reliable, consistent Chain-of-Thought (CoT) generation via self-correction.

The model was fine-tuned with function-calling data from APIGen and the FunReason-MT dataset.

- **Training Libraries:** LLaMA-Factory and verl
- **Methodology:** Supervised fine-tuning (SFT) followed by reinforcement learning (RL)
- **Hardware:** 32 NVIDIA H20 GPUs

## Usage

Here we provide a code snippet of the handler of FunReason-MT.

This work is part of the open-source project AWorld by InclusionAI. If you use FunReason-MT in your research, please cite the technical report.
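Since the handler snippet itself is not reproduced here, the following is a minimal sketch of one piece a handler for a Qwen3-based model typically needs: extracting tool invocations from the `<tool_call>…</tool_call>` blocks that Qwen-family chat templates emit. The tag format follows Qwen's convention; the error-handling policy and return shape are assumptions:

```python
import json
import re


def parse_tool_calls(text: str) -> list[dict]:
    """Extract JSON bodies from <tool_call>...</tool_call> blocks.

    Qwen3-family models wrap tool invocations in these tags; a handler
    parses each block into {"name": ..., "arguments": ...} for dispatch.
    """
    calls = []
    for body in re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>",
                           text, re.DOTALL):
        try:
            calls.append(json.loads(body))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the turn
    return calls


# Example model output containing one tool invocation (hypothetical tool).
output = (
    "Let me check that for you.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)
print(parse_tool_calls(output))
# -> [{'name': 'get_weather', 'arguments': {'city': 'Paris'}}]
```

In a full multi-turn loop, each parsed call would be executed against the environment and its result appended as a `tool` message before the model is queried again.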

License: Apache-2.0
