mixedbread-ai

13 models • 7 total models in database

Sort by:

mxbai-embed-large-v1

--- tags: - mteb - transformers.js - transformers model-index: - name: mxbai-angle-large-v1 results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 75.044776119403 - type: ap value: 37.7362433623053 - type: f1 value: 68.92736573359774 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonP

license:apache-2.0

1,512,093

754

mxbai-rerank-xsmall-v1

--- library_name: transformers tags: - reranker - transformers.js - sentence-transformers license: apache-2.0 language: - en pipeline_tag: text-ranking ---

license:apache-2.0

799,887

mxbai-edge-colbert-v0-17m

--- language: - en tags: - ColBERT - PyLate - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - transformers pipeline_tag: sentence-similarity library_name: PyLate metrics: - MaxSim_accuracy@1 - MaxSim_accuracy@3 - MaxSim_accuracy@5 - MaxSim_accuracy@10 - MaxSim_precision@1 - MaxSim_precision@3 - MaxSim_precision@5 - MaxSim_precision@10 - MaxSim_recall@1 - MaxSim_recall@3 - MaxSim_recall@5 - MaxSim_recall@10 - MaxSim_ndcg@10 - MaxSim_mrr@10 - MaxSim_map@

license:apache-2.0

332,853

mxbai-edge-colbert-v0-32m

The crispy, lightweight ColBERT family from Mixedbread . 🍞 Looking for a simple end-to-end retrieval solution? Meet Mixedbread Search , our multi-modal and multi-lingual search solution. This mode...

license:apache-2.0

42,174

mxbai-rerank-large-v1

🍞 Looking for a simple end-to-end retrieval solution? Meet Omni, our multimodal and multilingual model. Get in touch for access. This is the largest model in our family of powerful reranker models. You can learn more about the models in our blog post. - mxbai-rerank-xsmall-v1 - mxbai-rerank-base-v1 - mxbai-rerank-large-v1 (🍞) Currently, the best way to use our models is with the most recent version of sentence-transformers. Let's say you have a query, and you want to rerank a set of documents. You can do that with only one line of code: Let's say you have a query, and you want to rerank a set of documents. In JavaScript, you need to add a function: The API comes with additional features, such as a continous trained reranker! Check out the docs for more information. Our reranker models are designed to elevate your search. They work extremely well in combination with keyword search and can even outperform semantic search systems in many cases. | Model | NDCG@10 | Accuracy@3 | | ------------------------------------------------------------------------------------- | -------- | ---------- | | Lexical Search (Lucene) | 38.0 | 66.4 | | BAAI/bge-reranker-base | 41.6 | 66.9 | | BAAI/bge-reranker-large | 45.2 | 70.6 | | cohere-embed-v3 (semantic search) | 47.5 | 70.9 | | mxbai-rerank-xsmall-v1 | 43.9 | 70.0 | | mxbai-rerank-base-v1 | 46.9 | 72.3 | | mxbai-rerank-large-v1 | 48.8 | 74.9 | The reported results are aggregated from 11 datasets of BEIR. We used Pyserini to evaluate the models. Find more in our blog-post and on this spreadsheet. Community Please join our Discord Community and share your feedback and thoughts! We are here to help and also always happy to chat.

license:apache-2.0

23,702

135

mxbai-embed-xsmall-v1

The crispy sentence embedding family from Mixedbread . 🍞 Looking for a simple end-to-end retrieval solution? Meet Omni, our multimodal and multilingual model. Get in touch for access. This model is an open-source English embedding model developed by Mixedbread. It's built upon sentence-transformers/all-MiniLM-L6-v2 and trained with the AnglE loss and Espresso. Read more details in our blog post. In a bread loaf: - State-of-the-art performance - Supports both binary quantization and Matryoshka Representation Learning (MRL). - Optimized for retrieval tasks - 4096 context support Our model supports both binary quantization and Matryoshka Representation Learning (MRL), allowing for significant efficiency gains: - Binary quantization: Retains 93.9% of performance while increasing efficiency by a factor of 32 - MRL: A 33% reduction in vector size still leaves 96.2% of model performance These optimizations can lead to substantial reductions in infrastructure costs for cloud computing and vector databases. Read more here. Here are several ways to produce German sentence embeddings using our model. Join our discord community to share your feedback and thoughts. We're here to help and always happy to discuss the exciting field of machine learning!

license:apache-2.0

21,409