This model was trained by richinfoai. Following the Stella and Jasper models, we performed distillation training from lier007/xiaobu-embedding-v2, dunzhang/stella-large-zh-v3-1792d, and BAAI/bge-multilingual-gemma2. Thanks to the outstanding performance of these teachers, our model achieves excellent results on MTEB(cmn, v1).
We believe this model once again demonstrates the effectiveness of distillation learning. In the future, we will train more bilingual embedding models based on a variety of strong embedding-training methods.
We used BAAI/Infinity-Instruct and opencsg/chinese-fineweb-edu as training data to distill from the above three teacher models. In this stage, we used only a cosine loss, as sketched below.
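As a rough illustration, here is a minimal sketch of a cosine distillation loss in PyTorch. The function name is ours, and we assume the student output is dimension-aligned with the (possibly concatenated) teacher embeddings, e.g. via a linear projection; this is not the original training code.

```python
import torch
import torch.nn.functional as F

def cosine_distillation_loss(student_emb: torch.Tensor,
                             teacher_emb: torch.Tensor) -> torch.Tensor:
    """1 - cosine similarity between student and teacher embeddings,
    averaged over the batch. Assumes the student embeddings have been
    projected to the same dimension as the teacher embeddings."""
    return (1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1)).mean()
```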
The objective of stage 2 is dimensionality reduction. We used the same training data as in stage 1, this time with a `similarity loss` (see the sketch below). After stage 2, the output dimensionality of our model is 1792.
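One plausible reading of `similarity loss` is matching the in-batch pairwise cosine-similarity matrix of the lower-dimensional student embeddings to that of the teacher embeddings. The sketch below implements that interpretation; it is our assumption, not necessarily the exact loss used here.

```python
import torch
import torch.nn.functional as F

def similarity_loss(student_emb: torch.Tensor,
                    teacher_emb: torch.Tensor) -> torch.Tensor:
    """MSE between the in-batch cosine-similarity matrices of the
    reduced-dimension student embeddings and the teacher embeddings."""
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb, dim=-1)
    return F.mse_loss(s @ s.T, t @ t.T)
```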
This model does not require instruction prefixes, and you can use it directly with `SentenceTransformer`:
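A minimal usage sketch, assuming sentence-transformers v3+ (for `model.similarity`); the model ID below is a placeholder and should be replaced with this repository's actual Hugging Face ID.

```python
from sentence_transformers import SentenceTransformer

# NOTE: the model ID is an assumed placeholder; replace it with this
# repository's actual Hugging Face model ID.
model = SentenceTransformer("richinfoai/your-model-id")

sentences = [
    "我喜欢这部电影",
    "这部电影太棒了",
    "今天的天气很糟糕",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 1792)

# Cosine similarities between all sentence pairs
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```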