Menlo
Lucy-128k-gguf
Lucy: Edgerunning Agentic Web Search on Mobile with a 1.7B model.

[GitHub](https://github.com/menloresearch/deep-research) [License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)

Authors: Alan Dao, Bach Vu Dinh, Alex Nguyen, Norapat Buppodom

Lucy is a compact but capable 1.7B model focused on agentic web search and lightweight browsing. Built on Qwen3-1.7B, Lucy inherits deep-research capabilities from larger models while being optimized to run efficiently on mobile devices, even in CPU-only configurations. We achieved this through machine-generated task vectors that optimize thinking processes, smooth reward functions across multiple categories, and pure reinforcement learning without any supervised fine-tuning.

- 🔍 Strong Agentic Search: Powered by MCP-enabled tools (e.g., Serper with Google Search)
- 🌐 Basic Browsing Capabilities: Through Crawl4AI (MCP server to be released), Serper, ...
- 📱 Mobile-Optimized: Lightweight enough to run on CPU or mobile devices at decent speed
- 🎯 Focused Reasoning: Machine-generated task vectors optimize thinking processes for search tasks

Evaluation

Following the same MCP benchmark methodology used for Jan-Nano and Jan-Nano-128k, Lucy demonstrates impressive performance for a 1.7B model, achieving higher accuracy than DeepSeek-v3 on SimpleQA.

Lucy can be deployed with vLLM, llama.cpp, or through local applications such as Jan, LM Studio, and other compatible inference engines. The model supports integration with search APIs and web-browsing tools through MCP.

Paper: Lucy: Edgerunning Agentic Web Search on Mobile with Machine Generated Task Vectors.
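As a sketch of the llama.cpp route mentioned above, the GGUF weights can be served locally through llama.cpp's OpenAI-compatible server. The GGUF filename, quantization, and context size below are assumptions for illustration; substitute the file actually published in this repository:

```shell
# Serve Lucy GGUF weights locally with llama.cpp's OpenAI-compatible server.
# The filename and context size are illustrative; use the GGUF you downloaded
# and a context window that fits your available RAM.
llama-server -m ./Lucy-128k-Q4_K_M.gguf -c 32768 --host 127.0.0.1 --port 8080
```

The server then exposes a `/v1/chat/completions` endpoint that MCP-based tool clients (e.g., a Serper search tool) can point at.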
Jan-nano-128k-gguf
Jan-Nano-128k: Empowering deeper research through extended context understanding.

Note: Jan-Nano is a non-thinking model.

[GitHub](https://github.com/menloresearch/deep-research) [License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)

Jan-Nano-128k represents a significant advancement in compact language models for research applications. Building on the success of Jan-Nano, this enhanced version features a native 128k context window that enables deeper, more comprehensive research without the performance degradation typically associated with context-extension methods.

Key Improvements:
- 🔍 Research Deeper: Extended context allows processing of entire research papers, lengthy documents, and complex multi-turn conversations
- ⚡ Native 128k Window: Built from the ground up to handle long contexts efficiently, maintaining performance across the full context range
- 📈 Enhanced Performance: Unlike traditional context-extension methods, Jan-Nano-128k shows improved performance with longer contexts

This model maintains full compatibility with Model Context Protocol (MCP) servers while dramatically expanding the scope of research tasks it can handle in a single session.

Jan-Nano-128k has been rigorously evaluated on the SimpleQA benchmark using our MCP-based methodology, demonstrating superior performance compared to its predecessor. Traditional approaches to extending context length, such as YaRN (Yet another RoPE extensioN), often degrade as context grows; Jan-Nano-128k breaks this pattern. This fundamental difference makes it ideal for research applications requiring deep document analysis, multi-document synthesis, and complex reasoning over large information sets.

Jan desktop support for this model is in progress; in the meantime, see the deployment options below that we have tested. For additional tutorials and community guidance, visit our Discussion Forums.

Note: The chat template is included in the tokenizer.
For troubleshooting, download the Non-think chat template.

FAQ:
- I have a Jinja template issue with LM Studio; how can I fix it? See here.
- Discussions: HuggingFace Community
- Issues: GitHub Repository
- Documentation: Official Docs
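As one sketch of the tested deployment options referenced above, a vLLM launch might look like the following. The flags and the Hermes tool-call parser are assumptions based on common Qwen3-based setups, not an authoritative recipe; verify them against your vLLM version:

```shell
# Launch Jan-Nano-128k with vLLM (flags are illustrative).
# --max-model-len requests the full native 128k window; reduce it if GPU
# memory is tight. Tool-calling flags enable MCP-style tool use.
vllm serve Menlo/Jan-nano-128k \
  --max-model-len 131072 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

Since the 128k window is native, no rope-scaling override should be needed, unlike YaRN-extended checkpoints.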
Lucy-gguf
Jan-nano-gguf
Jan-Nano is a fine-tuned language model built on top of the Qwen3 architecture. Developed as part of the Jan ecosystem, it balances compact size and extended context length, making it ideal for efficient, high-quality text generation in local or embedded environments.

- Tool Use: Excellent function calling and tool integration
- Research: Enhanced research and information-processing capabilities
- Small Model: VRAM-efficient for local deployment

Original weights: https://huggingface.co/Menlo/Jan-nano

Recommended sampling parameters:
- Temperature: 0.7
- Top-p: 0.8
- Top-k: 20
- Min-p: 0
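The recommended sampling parameters above map directly onto llama.cpp flags; a minimal sketch (the GGUF filename and prompt are illustrative) could be:

```shell
# Run Jan-Nano GGUF with the recommended sampling settings.
# Point -m at the GGUF file you actually downloaded from this repo.
llama-cli -m ./jan-nano-Q4_K_M.gguf \
  --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0 \
  -p "Summarize the key ideas behind the Model Context Protocol."
```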
Jan-nano
Jan-Nano: An Agentic Model

Note: Jan-Nano is a non-thinking model.

Jan-Nano is a compact 4-billion-parameter language model specifically designed and trained for deep-research tasks. It has been optimized to work seamlessly with Model Context Protocol (MCP) servers, enabling efficient integration with various research tools and data sources.

Evaluation

Jan-Nano has been evaluated on the SimpleQA benchmark using our MCP-based benchmark methodology, demonstrating strong performance for its model size. This approach assesses performance on SimpleQA tasks while leveraging the model's native MCP server integration, so it better reflects Jan-Nano's real-world behavior as a tool-augmented research model, validating both its factual accuracy and its effectiveness in MCP-enabled environments.

Jan-Nano is currently supported by Jan, an open-source ChatGPT alternative that runs entirely on your computer. Jan provides a user-friendly interface for running local AI models with full privacy and control. For non-Jan apps and tutorials, see the guidance in the community Discussion section.

vLLM

Here is an example command you can use to run Jan-Nano with vLLM. The chat template is already included in the tokenizer, so supplying one is optional; if you run into template issues, you can download the Non-think chat template.

Recommended sampling parameters:
- Temperature: 0.7
- Top-p: 0.8
- Top-k: 20
- Min-p: 0
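The vLLM example command referenced in the card above did not survive extraction; a plausible, hedged reconstruction (host, port, and tool-call parser are assumptions) is:

```shell
# Serve Jan-Nano with vLLM and Hermes-style tool calling (flags illustrative;
# verify against your installed vLLM version).
vllm serve Menlo/Jan-nano \
  --host 0.0.0.0 --port 1234 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```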
AlphaMaze-v0.2-1.5B
Lucy
Lucy-128k
Jan Nano 128k
ReZero-v0.1-llama-3.2-3b-it-grpo-250404
Ichigo-llama3.1-s-instruct-v0.3-phase-2
mini-Ichigo-llama3.2-3B-s-instruct
ReZero-v0.1-llama-3.2-3b-it-grpo-250404-gguf
llama3-s-instruct-v0.2-GGUF
Ichigo-llama3.1-s-instruct-v0.3-phase-3
llama3-s-gguf
mini-Ichigo-llama3.2-3B-s-base
Ichigo-llama3.1-s-instruct-v0.4
llama3-s-v0.1
llama3-s-2024-07-08
Ichigo-llama3.1-s-base-v0.3
Qwen3-4B-warmup-ds
Poseless-3B
AlphaMaze-v0.2-1.5B-GGUF
llama3-s-instruct-v0.2
Ichigo-whisper-v0.1
AlphaSpace-1.5B
llama3.1-s-instruct-2024-08-15
Qwen2.5-0.5B-s-init
Ichigo-llama3.1-8B-v0.5-cp-10000
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated; model details (developers, license, base model, intended use, and environmental impact) have not yet been provided.