NovaSearch
stella_en_400M_v5
stella_en_1.5B_v5
We released the Jasper and Stella model technology report and code (2025.1). Code: https://github.com/NovaSearch-Team/RAG-Retrieval

The models are trained based on `Alibaba-NLP/gte-large-en-v1.5` and `Alibaba-NLP/gte-Qwen2-1.5B-instruct`. Thanks for their contributions!

We simplify the usage of prompts, providing two prompts for most general tasks: one for s2p (sentence-to-passage) tasks such as retrieval, and one for s2s (sentence-to-sentence) tasks such as semantic textual similarity.

Prompt of s2p task (e.g. retrieval task): Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: {query}
Prompt of s2s task (e.g. semantic textual similarity task): Instruct: Retrieve semantically similar text.\nQuery: {query}

The models are trained with MRL (Matryoshka Representation Learning), so they support multiple dimensions: 512, 768, 1024, 2048, 4096, 6144 and 8192. The higher the dimension, the better the performance. Generally speaking, 1024d is good enough: the MTEB score of 1024d is only 0.001 lower than that of 8192d.

The model directory structure is very simple: a standard SentenceTransformer directory with a series of `2_Dense_{dims}` folders, where `dims` represents the final vector dimension. For example, the `2_Dense_256` folder stores the Linear weights that project vectors down to 256 dimensions. Please refer to the following chapters for specific instructions on how to use them.

You can use the `SentenceTransformers` or `transformers` library to encode text. Usage with Infinity, an MIT-licensed inference server, is also supported via Docker.

FAQ

Q: Will the training details be released?
A: The training method and datasets will be released in the future (specific time unknown, may be provided in a paper).

Q: How to choose a suitable prompt for my own task?
A: In most cases, please use the s2p and s2s prompts. These two prompts account for the vast majority of the training data.

Q: How to reproduce MTEB results?
A: Please use the evaluation scripts in `Alibaba-NLP/gte-Qwen2-1.5B-instruct` or `intfloat/e5-mistral-7b-instruct`.

Q: Why does each dimension have its own linear weight?
A: MRL has multiple training methods; we chose the one with the best performance.

Q: What sequence length is recommended?
A: 512 is recommended. In our experiments, almost all models perform poorly on specialized long-text retrieval datasets. Besides, the model is trained on datasets of length 512. This may be an optimization issue.
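As a minimal sketch of what a `2_Dense_{dims}` head does (the vector, weights, and bias below are random toy stand-ins, not the released checkpoint weights): the base model emits a full-size vector, a Linear layer projects it to the target dimension, and the result is L2-normalized so dot products behave as cosine similarities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a "base" 8192-d sentence vector and the Linear weights a
# hypothetical 2_Dense_256 folder would hold (real weights come from the model).
base_vec = rng.normal(size=(1, 8192))
W = rng.normal(size=(8192, 256)) / np.sqrt(8192)  # projection to 256 dims
b = np.zeros(256)

# Project down, then L2-normalize before computing similarities.
proj = base_vec @ W + b
emb = proj / np.linalg.norm(proj, axis=1, keepdims=True)

print(emb.shape)  # (1, 256): the final, smaller embedding
```

The same pattern applies for any of the supported dimensions; only the Linear weights differ per `2_Dense_{dims}` folder.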
If you have any questions, please start a discussion in the Community tab.
jasper_en_vision_language_v1
Based on `dunzhang/stella_en_1.5B_v5` and `google/siglip-so400m-patch14-384`.

Data: https://huggingface.co/datasets/infgrad/jasper_text_distill_dataset
Training logs: https://api.wandb.ai/links/dunnzhang0/z8jqoqpb

The core idea of Jasper and Stella is distillation: let the student model learn the teacher model's vectors.

Evaluation script: `./scripts/evaluate_en_mteb/run_evaluate_mteb.py`

License: This model must not be used for any commercial purpose!
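The distillation idea above can be sketched as a cosine-similarity loss between student and teacher vectors. This is a minimal illustration with random arrays, not the actual Jasper training objective; the real recipe is in the linked code and technology report.

```python
import numpy as np

def cosine_distill_loss(student: np.ndarray, teacher: np.ndarray) -> float:
    """Mean (1 - cosine similarity) between student and teacher embeddings."""
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(s * t, axis=1)))

teacher_vecs = np.random.default_rng(1).normal(size=(4, 1024))
student_vecs = teacher_vecs.copy()  # a perfect student

# Identical directions give (near-)zero loss, up to floating-point error.
print(cosine_distill_loss(student_vecs, teacher_vecs))
```

Because both sides are normalized, the loss depends only on the direction of the student vectors, which is what matters for retrieval with cosine similarity.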