# FlashResearch-4B-Thinking
A 4B-parameter Qwen model distilled from Tongyi DeepResearch 30B A3B, optimized for web-scale “deep research” tasks and for inference with Alibaba-NLP/DeepResearch.
- **Base:** Qwen 4B (dense)
- **Teacher:** Tongyi DeepResearch 30B A3B (MoE)
- **Method:** SFT distillation on 33k curated deep-research examples
- **Dataset:** `flashresearch/FlashResearch-DS-33k`
- **Primary use:** Fast, low-cost DeepResearch agent runs (browsing, multi-step reasoning, source-grounded answers)
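To inspect the distillation data, it can be loaded with the `datasets` library. A minimal sketch, assuming the dataset is hosted on the Hub under the id above and exposes a `train` split (the field layout shown in the comments is an assumption, not documented here):

```python
# Sketch: peek at the 33k distillation dataset via Hugging Face `datasets`.
from datasets import load_dataset

# Assumes a `train` split exists under this Hub id.
ds = load_dataset("flashresearch/FlashResearch-DS-33k", split="train")

print(ds)      # ~33k curated deep-research examples
print(ds[0])   # one SFT example (e.g. prompt / reasoning trajectory / answer)
```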
## Inference with Alibaba-NLP/DeepResearch (Recommended)
This model is intended to be used directly with the DeepResearch repo.
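For a quick standalone check outside the DeepResearch pipeline, the model can be loaded with plain Transformers. This is a minimal sketch; the Hub id `flashresearch/FlashResearch-4B-Thinking` is an assumption based on the card title, so substitute the actual repo path:

```python
# Minimal standalone sketch: load the model with Transformers and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "flashresearch/FlashResearch-4B-Thinking"  # hypothetical Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # FP16/BF16 fits on a single 12-16 GB GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize recent work on MoE distillation."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For full agent runs (browsing, tool calls, source-grounded answers), point the DeepResearch repo's model configuration at this checkpoint instead.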
A single 12–16 GB GPU is enough for the 4B model in FP16; FP8 or INT4 quantization fits in even less VRAM. If you quantize, the summary model can run locally as well.
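One way to quantize is a 4-bit (NF4) load via `bitsandbytes`, which brings the memory footprint well under 12 GB. A sketch under the same hypothetical Hub id as above:

```python
# Sketch: 4-bit NF4 quantized load with bitsandbytes to reduce VRAM usage.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "flashresearch/FlashResearch-4B-Thinking",  # hypothetical Hub id
    quantization_config=bnb_config,
    device_map="auto",
)
```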
## Acknowledgements

- Qwen team for the base 4B architecture
- Alibaba-NLP for DeepResearch
- CheapResearch contributors for the 33k dataset
## Changelog

- **v1.0.0** (2025-10-04): First public release (33k distillation, DeepResearch-ready)