naver
splade-cocondenser-ensembledistil
---
license: cc-by-nc-sa-4.0
language: "en"
tags:
- splade
- query-expansion
- document-expansion
- bag-of-words
- passage-retrieval
- knowledge-distillation
- sentence-transformers
- sparse-encoder
- sparse
pipeline_tag: feature-extraction
library_name: sentence-transformers
datasets:
- ms_marco
---
efficient-splade-VI-BT-large-doc
---
license: cc-by-nc-sa-4.0
language: "en"
tags:
- splade
- query-expansion
- document-expansion
- bag-of-words
- passage-retrieval
- knowledge-distillation
- document encoder
- sentence-transformers
- sparse-encoder
- sparse
- asymmetric
pipeline_tag: feature-extraction
library_name: sentence-transformers
datasets:
- ms_marco
---
efficient-splade-VI-BT-large-query
---
license: cc-by-nc-sa-4.0
language: "en"
tags:
- splade
- query-expansion
- document-expansion
- bag-of-words
- passage-retrieval
- knowledge-distillation
- document encoder
datasets:
- ms_marco
---
splade-v3
MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric
DUSt3R_ViTLarge_BaseDecoder_512_dpt
License: The code is distributed under the CC BY-NC-SA 4.0 License. See LICENSE for more information. For the checkpoints, make sure to agree to the license of all the public training datasets and base checkpoints we used, in addition to CC BY-NC-SA 4.0. See the section "Our Hyperparameters" for details.

GitHub page: https://github.com/naver/dust3r/
Project page: https://dust3r.europe.naverlabs.com/

| Model name | Training resolutions | Head | Encoder | Decoder |
|------------|----------------------|------|---------|---------|
| DUSt3R_ViTLarge_BaseDecoder_512_dpt | 512x384, 512x336, 512x288, 512x256, 512x160 | DPT | ViT-L | ViT-B |
splade-cocondenser-selfdistil
splade_v2_distil
DUSt3R_ViTLarge_BaseDecoder_512_linear
splade-v3-distilbert
splade-v3-lexical
efficient-splade-V-large-doc
splade_v2_max
provence-reranker-debertav3-v1
Provence is a lightweight context pruning model for retrieval-augmented generation, particularly optimized for question answering. Given a user question and a retrieved passage, Provence removes sentences from the passage that are not relevant to the user question. This speeds up generation and reduces context noise, in a plug-and-play manner for any LLM.

- Paper: https://arxiv.org/abs/2501.16214 (accepted to ICLR 2025)
- Blogpost: https://huggingface.co/blog/nadiinchi/provence
- Developed by: Naver Labs Europe
- License: Provence is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license (CC BY-NC-ND 4.0). See the license file.
- Model: `provence-reranker-debertav3-v1` (Provence: Pruning and Reranking Of retrieVEd relevaNt ContExt)
- Backbone model: DeBERTa-v3 reranker (trained from DeBERTa-v3-large)
- Model size: 430 million parameters
- Context length: 512 tokens

NEW! The multilingual version of the model, based on BGE-reranker-v2-m3, is available here. Training and evaluation code & data are available in the Bergen repo.

You can also pass a list of questions and a list of lists of contexts (multiple contexts per question to be pruned) for batched processing. Setting `always_select_title=True` will keep the first sentence, e.g. "Shepherd's pie". This is especially useful for Wikipedia articles, where the title is often needed to understand the context. More details on how the title is defined are given below.

Interface of the `process` function:

- `question` (`Union[List[str], str]`): an input question (str) or a list of input questions (for batched processing).
- `context` (`Union[List[List[str]], str]`): context(s) to be pruned. This can be either a single string (in the case of a single str question) or a list of lists of contexts (a list of contexts per question), with `len(contexts)` equal to `len(questions)`.
- `title` (`Optional[Union[List[List[str]], str]]`, default: `"first_sentence"`): an optional argument for defining titles. If `title="first_sentence"`, the first sentence of each context is assumed to be the title. If `title=None`, it is assumed that no titles are provided. Titles can also be passed as a list of lists of str, i.e. titles shaped the same way as contexts. Titles are only used if `always_select_title=True`.
- `threshold` (float, $\in [0, 1]$, default: 0.1): the threshold to use for context pruning. We recommend 0.1 for more conservative pruning (no or lowest performance drops) and 0.5 for higher compression, but this value can be tuned further to meet specific use-case requirements.
- `always_select_title` (bool, default: True): if True, the first sentence (title) will be included in the selection whenever the model selects a non-empty set of sentences. This is important, e.g., for Wikipedia passages, to provide proper contextualization for the following sentences.
- `batch_size` (int, default: 32)
- `reorder` (bool, default: False): if True, the provided contexts for each question will be reordered according to the computed question-passage relevance scores. If False, the original user-provided order of contexts will be preserved.
- `top_k` (int, default: 5): if `reorder=True`, specifies the number of top-ranked passages to keep for each question.
- `enable_warnings` (bool, default: True): whether to print warnings about model usage, e.g. for too-long contexts or questions.

Model features:

- Provence encodes all sentences in the passage together: this enables capturing coreferences between sentences and provides more accurate context pruning.
- Provence automatically detects the number of sentences to keep, based on a threshold. We found that the default threshold works well across various domains, but it can be adjusted further to better meet particular use-case needs.
- Provence is robust across domains, being trained on a combination of diverse MS MARCO and Natural Questions data.
- Provence works out-of-the-box with any LLM.

Model details:

- Input: user question (e.g., a sentence) + retrieved context passage (e.g., a paragraph)
- Output: pruned context passage, i.e., with irrelevant sentences removed, + a relevance score (can be used for reranking)
- Model architecture: the model was initialized from the DeBERTa-v3 reranker and finetuned with two objectives: (1) output a binary mask which can be used to prune irrelevant sentences; and (2) preserve the initial reranking capabilities.
- Training data: MS MARCO (documents) + NQ training sets, with synthetic silver labelling of which sentences to keep, produced using Llama-3-8B.
- Languages covered: English
- Context length: 512 tokens (as for the pretrained DeBERTa model)
- Evaluation: we evaluate Provence on 7 datasets from various domains: Wikipedia, biomedical data, course syllabi, and news. Evaluation is conducted on the model trained only on MS MARCO data. We find that Provence is able to prune irrelevant sentences with little-to-no drop in performance in all domains, and outperforms existing baselines on the Pareto front (top-right corners of the plots).

Model trained at Naver Labs Europe. Team: Nadia Chirkova, Thibault Formal, Vassilina Nikoulina, Stéphane Clinchant
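The selection rule behind `threshold` and `always_select_title` can be sketched as a small toy function. This is an illustrative sketch only, not Provence's actual code: in the real model the per-sentence relevance scores come from the finetuned DeBERTa head, and the sentence list and scores below are hypothetical.

```python
from typing import List

def prune_context(sentences: List[str], scores: List[float],
                  threshold: float = 0.1,
                  always_select_title: bool = True) -> List[str]:
    """Keep sentences whose (model-provided) relevance score passes the
    threshold; optionally always re-include the first sentence (title)."""
    kept = [s for s, sc in zip(sentences, scores) if sc >= threshold]
    # If anything was selected, re-include the title even when its own
    # score fell below the threshold (useful for Wikipedia passages).
    if always_select_title and kept and sentences and sentences[0] not in kept:
        kept.insert(0, sentences[0])
    return kept

sents = ["Shepherd's pie.", "It is a meat pie with a mashed-potato crust.", "Unrelated trivia."]
scores = [0.05, 0.9, 0.02]  # hypothetical per-sentence relevance scores
print(prune_context(sents, scores))
# -> ["Shepherd's pie.", 'It is a meat pie with a mashed-potato crust.']
```

Note how the title sentence survives despite its low score; with `always_select_title=False` it would be pruned along with the trivia sentence.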
xprovence-reranker-bgem3-v1
XProvence is a zero-cost context pruning model that seamlessly integrates with a reranker for retrieval-augmented generation, particularly optimized for question answering. Given a user question and a retrieved passage, XProvence removes sentences from the passage that are not relevant to the user question. This speeds up generation and reduces context noise, in a plug-and-play manner for any LLM. XProvence is a multilingual version of Provence, supporting 16 languages natively. It also supports 100+ languages through cross-lingual transfer, since it is based on BGE-m3, which is pretrained on 100+ languages.

- Developed by: Naver Labs Europe
- License: XProvence is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license (CC BY-NC-ND 4.0). See the license file.
- Model: `XProvence`
- Backbone model: bge-reranker-v2-m3
- Model size: 568 million parameters
- Context length: 8192 tokens

Training and evaluation code & data are available in the Bergen repo.

You can also pass a list of questions and a list of lists of contexts (multiple contexts per question to be pruned) for batched processing. Setting `always_select_title=True` will keep the first sentence, e.g. "Shepherd's pie". This is especially useful for Wikipedia articles, where the title is often needed to understand the context. More details on how the title is defined are given below.

Interface of the `process` function:

- `question` (`Union[List[str], str]`): an input question (str) or a list of input questions (for batched processing).
- `context` (`Union[List[List[str]], str]`): context(s) to be pruned. This can be either a single string (in the case of a single str question) or a list of lists of contexts (a list of contexts per question), with `len(contexts)` equal to `len(questions)`.
- `title` (`Optional[Union[List[List[str]], str]]`, default: `"first_sentence"`): an optional argument for defining titles. If `title="first_sentence"`, the first sentence of each context is assumed to be the title. If `title=None`, it is assumed that no titles are provided. Titles can also be passed as a list of lists of str, i.e. titles shaped the same way as contexts. Titles are only used if `always_select_title=True`.
- `threshold` (float, $\in [0, 1]$, default: 0.3): the threshold to use for context pruning. We recommend 0.3 for more conservative pruning (no or lowest performance drops) and 0.7 for higher compression, but this value can be tuned further to meet specific use-case requirements.
- `always_select_title` (bool, default: True): if True, the first sentence (title) will be included in the selection whenever the model selects a non-empty set of sentences. This is important, e.g., for Wikipedia passages, to provide proper contextualization for the following sentences.
- `batch_size` (int, default: 32)
- `reorder` (bool, default: False): if True, the provided contexts for each question will be reordered according to the computed question-passage relevance scores. If False, the original user-provided order of contexts will be preserved.
- `top_k` (int, default: 5): if `reorder=True`, specifies the number of top-ranked passages to keep for each question.
- `enable_warnings` (bool, default: True): whether to print warnings about model usage, e.g. for too-long contexts or questions.

Model features:

- XProvence natively supports 16 languages, and 100+ languages via cross-lingual transfer.
- XProvence encodes all sentences in the passage together: this enables capturing coreferences between sentences and provides more accurate context pruning.
- XProvence automatically detects the number of sentences to keep, based on a threshold. We found that the default threshold works well across various domains, but it can be adjusted further to better meet particular use-case needs.
- XProvence works out-of-the-box with any LLM.

Model details:

- Input: user question (e.g., a sentence) + retrieved context passage (e.g., a paragraph). The training data consisted of monolingual examples (query and context in the same language), but we expect the model to perform well on cross-lingual pairs too, due to cross-lingual transfer.
- Output: pruned context passage, i.e., with irrelevant sentences removed, + a relevance score (can be used for reranking)
- Model architecture: the model was initialized from bge-reranker-v2-m3 and finetuned with two objectives: (1) output a binary mask which can be used to prune irrelevant sentences; and (2) preserve the initial reranking capabilities.
- Training data: MS MARCO + MIRACL, with synthetic silver labelling of which sentences to keep, produced using aya-expanse-8b.
- Languages in the training data: Arabic, Bengali, English, Spanish, Persian, Finnish, French, Hindi, Indonesian, Japanese, Korean, Russian, Swahili, Telugu, Thai, Chinese
- Context length: 8192 tokens (as for the pretrained BGE-m3 model); note, however, that the training data only included paragraph-sized examples.
- Evaluation: we evaluate XProvence on 26 languages from 6 different datasets. We find that XProvence is able to prune irrelevant sentences with little-to-no drop in performance in all languages, and outperforms existing baselines on the Pareto front.
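The `reorder`/`top_k` behaviour described above (sort contexts by their question-passage relevance score, keep the best ones, or preserve the user's order) can be sketched as a toy function. The scores and contexts below are hypothetical; in practice the scores are the relevance scores computed by the model.

```python
from typing import List

def reorder_contexts(contexts: List[str], scores: List[float],
                     reorder: bool = True, top_k: int = 5) -> List[str]:
    """Illustrative sketch of the reorder/top_k options: sort contexts by
    descending relevance score and keep only the top_k best ones."""
    if not reorder:
        # reorder=False: preserve the original user-provided order
        return contexts
    ranked = sorted(zip(contexts, scores), key=lambda pair: pair[1], reverse=True)
    return [ctx for ctx, _ in ranked[:top_k]]

ctxs = ["context A", "context B", "context C"]
relevance = [0.2, 0.8, 0.5]  # hypothetical question-passage scores
print(reorder_contexts(ctxs, relevance, top_k=2))
# -> ['context B', 'context C']
```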
DUSt3R_ViTLarge_BaseDecoder_224_linear
xprovence-reranker-bgem3-v2
splade-v3-doc
pisco-mistral
trecdl22-crossencoder-debertav3
ecir23-scratch-tydi-russian-splade
multilingual-distilwhisper-3k
efficient-splade-V-large-query
cocom-v1-16-mistral-7b
mHuBERT-147-ASR-fr
multilingual-distilwhisper-10k
multilingual-distilwhisper-28k
pisco-llama
PISCO is a context compression model to be used for efficient inference in retrieval-augmented generation (RAG), particularly optimized for question answering. PISCO adds two adapters around a backbone LLM:

- An encoder adapter trained to compress input contexts (the retrieved documents in RAG) into a set of 8 embedding vectors
- A decoder adapter, which takes as input the sets of embedding vectors from the documents together with a query, and provides an answer

With a pre-compressed collection of documents to retrieve from, inference becomes about 5x faster. PISCO models show very small losses in accuracy (0-3%) on a wide set of QA benchmarks.

- Developed by: Naver Labs Europe
- License: CC BY-NC 4.0
- Model: `pisco-llama`
- Backbone model: meta-llama/Llama-3.1-8B-Instruct
- Model size: 8.11 billion parameters
- Compression rate: x16: each document (of up to 128 tokens) is converted into 8 embedding vectors. The recommended usage is to provide documents cropped to about 128 tokens, which is common practice when doing RAG.

Model features:

- PISCO enables high-accuracy responses from the compressed documents
- PISCO is robust across domains: we tested its compression/decoding abilities on various sets of data
- PISCO enables about 5x faster generation when the collection of documents to retrieve from is pre-compressed

Model trained at Naver Labs Europe. Team: Maxime Louis, Hervé Dejean, Stéphane Clinchant
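The stated x16 compression rate follows directly from the figures on this card: each document of up to 128 tokens is mapped to 8 embedding vectors. A quick sanity check of that arithmetic (the 128-token and 8-vector figures are from the card; the k=10 prompt-budget example is illustrative):

```python
# Figures from the card: documents of up to 128 tokens are compressed
# into 8 embedding vectors, giving the stated x16 compression rate.
doc_tokens = 128
memory_vectors = 8
compression_rate = doc_tokens // memory_vectors
print(f"x{compression_rate}")  # -> x16

# For a RAG prompt with k retrieved documents, the context footprint
# shrinks from k * 128 token positions to k * 8 vectors, e.g. for k = 10:
k = 10
print(k * doc_tokens, "->", k * memory_vectors)  # 1280 -> 80
```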