📖 Introduction

We used a data engine built with Capflow-72B to caption multi-source data, then trained Qwen3-8B on the resulting captioned data to obtain MetaCaptioner-8B. MetaCaptioner-8B generates comprehensive image descriptions that combine visual perception with visual understanding. It also outperforms InternVL3.5-8B-Instruct on multiple multimodal understanding and reasoning benchmarks.