GAIR
DeepResearcher-7b
DeepResearcher is the first comprehensive framework for end-to-end training of LLM-based deep research agents through scaling reinforcement learning (RL) in real-world environments with authentic web search interactions. Our qualitative analysis reveals emergent cognitive behaviors from end-to-end RL training, including the ability to formulate plans, cross-validate information from multiple sources, engage in self-reflection to redirect research, and maintain honesty when unable to find definitive answers.

- License: Apache 2.0
- Model type: Reinforcement learning-based LLM (Large Language Model)
- Language(s): English
- Finetuned from model: Qwen2.5-7B-Instruct
- Repository: DeepResearcher GitHub
- Paper: DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments

To get started, visit the DeepResearcher repository on GitHub, where the model's code and setup instructions are provided.

The model was trained on open-domain question-answering datasets, including:

- NaturalQuestions (NQ)
- TriviaQA (TQ)
- HotpotQA
- 2Wiki MultiHopQA

DeepResearcher was trained using reinforcement learning (RL) with the Group Relative Policy Optimization (GRPO) algorithm. It was tested in both in-domain (NQ, TQ, HotpotQA) and out-of-domain (Musique, Bamboogle, PopQA) settings.

The model was evaluated on several datasets:

- NQ (Natural Questions)
- TQ (TriviaQA)
- HotpotQA
- 2Wiki
- Musique
- Bamboogle
- PopQA

DeepResearcher outperforms all baseline models, achieving substantial improvements in task completion across these datasets, particularly in out-of-domain scenarios.
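The core idea of GRPO mentioned above can be sketched as follows. This is a minimal illustration, not DeepResearcher's actual training code: for each question, a group of rollouts is sampled, and each rollout's reward is normalized against the group's mean and standard deviation, which removes the need for a learned value critic. The helper name below is hypothetical.

```python
# Minimal sketch of the group-relative advantage at the heart of GRPO
# (Group Relative Policy Optimization). Illustrative only.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize a group of rollout rewards to zero mean, unit variance."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the rollout group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four rollouts for one question, rewarded 1.0 if the final
# answer is correct and 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat the group average receive a positive advantage and are reinforced; below-average rollouts receive a negative advantage, so the policy update needs no separate critic network.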
ReasonEval-7B
ReasonEval-34B
LIMI-Air
daVinci-MagiHuman
LIMO-v2
daVinci-Dev-32B-MT
daVinci-Dev-72B
LIMO
rst-temporal-reasoning-11b
LIMI
LiveTalk-1.3B-V0.1
daVinci-Dev-72B-MT
Abel-7B-002
OpenSWE-72B
daVinci-Agency
rst-intent-detection-11b
SR Scientist 30B
SR-Scientist: Scientific Equation Discovery With Agentic AI. This is the RL-trained checkpoint from the paper 'SR-Scientist: Scientific Equation Discovery With Agentic AI', which uses Qwen3-Coder-30B-A3B-Instruct as its backbone. For usage, please refer to the code. Please cite the paper if this repo or the paper is helpful to you.
Anole-7b-v0.1
confucius-confidence-verb
twgi-critique-anole-7b
We introduce Thinking with Generated Images, which enables a single LMM (Large Multimodal Model) to spontaneously generate and reason with intermediate visual thoughts via a native long-multimodal thought process. This model supports vision generation with self-critique. Please refer to our GitHub repo for more information!
rst-information-extraction-11b
daVinci-origin-3B
ToRL-1.5B
ToRL-7B
Anole-7b
Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation

Anole is the first open-source, autoregressive, and natively trained large multimodal model capable of interleaved image-text generation (without using stable diffusion). While it builds upon the strengths of Chameleon, Anole excels at the complex task of generating coherent sequences of alternating text and images. Through an innovative fine-tuning process using a carefully curated dataset of approximately 6,000 images, Anole achieves remarkable image generation and understanding capabilities with minimal additional training. This efficient approach, combined with its open-source nature, positions Anole as a catalyst for accelerated research and development in multimodal AI. Preliminary tests demonstrate Anole's exceptional ability to follow nuanced instructions, producing high-quality images and interleaved text-image content that closely aligns with user prompts.

The major functionalities of Anole are listed below, where bold represents capabilities newly added on the basis of Chameleon:

- **Text-to-Image Generation**
- **Interleaved Text-Image Generation**
- Text Generation
- MultiModal Understanding

Please refer to our github repo and paper for examples generated by Anole!