LCO-Embedding


LCO-Embedding-Omni-3B

LCO-Embedding: Scaling Language-Centric Omnimodal Representation Learning

We are thrilled to release LCO-Embedding, a language-centric omnimodal representation learning framework, and the LCO-Embedding model family. This model implements the framework presented in the paper "Scaling Language-Centric Omnimodal Representation Learning," accepted at NeurIPS 2025.

GitHub repository: https://github.com/LCO-Embedding/LCO-Embedding

Note: We use only the `thinker` component of Qwen2.5-Omni and drop the `talker` component.

LCO-Embedding sets a new state of the art on MIEB (Massive Image Embedding Benchmark) while also supporting audio and video. This work also introduces the Generation-Representation Scaling Law, which connects a model's generative capability to its representation upper bound. Furthermore, we introduce SeaDoc, a challenging visual document retrieval task in Southeast Asian languages, and show that continual generative pretraining before contrastive learning raises the representation upper bound.

Figure: LCO-Embedding evaluated against state-of-the-art embedding models, including E5-V, Voyage Multimodal 3, mmE5, and GME, on the MIEB-Lite benchmark (51 tasks), broken down by task category.
Figure: Performance and efficiency comparisons of different training strategies using 3B and 7B variants of Qwen2.5-VL backbones.
Figure: Scaling relationship between generation benchmark performance (X-axis) and representation benchmark performance after language-centric contrastive learning (Y-axis).

If you find LCO-Embedding useful for your research and applications, please cite the paper.
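As a usage sketch, the snippet below shows one common way to extract text embeddings from a decoder-style backbone with Hugging Face Transformers: masked mean pooling over the last hidden states followed by L2 normalization. The repository id, pooling strategy, and loading path are assumptions for illustration, not the official API; consult the GitHub repository for the exact inference code.

```python
# Minimal sketch, not the official API: assumes the checkpoint loads via
# AutoModel/AutoTokenizer and that masked mean pooling over the last hidden
# states is a reasonable embedding readout. See the GitHub repo for the
# project's actual recipe.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "LCO-Embedding/LCO-Embedding-Omni-3B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    hidden = out.last_hidden_state                 # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)   # (batch, seq, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling
    return F.normalize(pooled, dim=-1)             # unit-length embeddings

queries = embed(["a photo of a cat"])
docs = embed(["an orange tabby sleeping", "a diagram of a jet engine"])
print(queries @ docs.T)  # cosine similarities, since embeddings are normalized
```

Because the embeddings are unit-normalized, the dot product at the end is exactly cosine similarity, which is the standard scoring function for contrastively trained retrieval models.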

License: apache-2.0

LCO-Embedding-Omni-7B

License: apache-2.0