Felladrin
gguf-smollm-360M-instruct-add-basics
gguf-MXFP4-gpt-oss-20b-Derestricted
Llama-68M-Chat-v1
gguf-jina-reranker-v1-tiny-en
Model creator: Jina AI
Original model: jina-reranker-v1-tiny-en
GGUF quantization: based on llama.cpp release f4d2b

This model is designed for blazing-fast reranking while maintaining competitive performance. What's more, it leverages the power of our JinaBERT model as its foundation. `JinaBERT` itself is a unique variant of the BERT architecture that supports the symmetric bidirectional variant of ALiBi. This allows `jina-reranker-v1-tiny-en` to process significantly longer sequences of text compared to other reranking models, up to an impressive 8,192 tokens.

To achieve this remarkable speed, `jina-reranker-v1-tiny-en` employs a technique called knowledge distillation, in which a complex but slower model (like our original jina-reranker-v1-base-en) acts as a teacher, condensing its knowledge into a smaller, faster student model. This student retains most of the teacher's knowledge, allowing it to deliver similar accuracy in a fraction of the time.

Here's a breakdown of the reranker models we provide:

| Model Name                | Layers | Hidden Size | Parameters (Millions) |
| ------------------------- | ------ | ----------- | --------------------- |
| jina-reranker-v1-base-en  | 12     | 768         | 137.0                 |
| jina-reranker-v1-turbo-en | 6      | 384         | 37.8                  |
| jina-reranker-v1-tiny-en  | 4      | 384         | 33.0                  |

> Currently, the `jina-reranker-v1-base-en` model is not available on Hugging Face. You can access it via the Jina AI Reranker API.

As you can see, `jina-reranker-v1-turbo-en` offers a balanced approach with 6 layers and 37.8 million parameters, which translates to fast search and reranking while preserving a high degree of accuracy. `jina-reranker-v1-tiny-en` prioritizes speed even further, achieving the fastest inference with its 4-layer, 33.0-million-parameter architecture. This makes it ideal for scenarios where absolute top accuracy is less crucial.

1. The easiest way to start using `jina-reranker-v1-tiny-en` is through Jina AI's Reranker API.
2. Alternatively, you can use the latest version of the `sentence-transformers` library. You can install it via pip and then score query-document pairs in a few lines of code (see the sketch after this list).
3. You can also use the `transformers` library to interact with the model programmatically.
4. You can also use the `transformers.js` library to run the model directly in JavaScript (in-browser, Node.js, Deno, etc.). If you haven't already, install the Transformers.js JavaScript library from NPM.

That's it! You can now use the `jina-reranker-v1-tiny-en` model in your projects.
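A minimal sketch of the `sentence-transformers` route from item 2, assuming a recent version of the library; the query and documents are illustrative placeholders, not from the original card:

```python
# Minimal sketch: reranking with jina-reranker-v1-tiny-en via sentence-transformers.
# trust_remote_code=True is needed because JinaBERT ships custom modeling code.
from sentence_transformers import CrossEncoder

model = CrossEncoder("jinaai/jina-reranker-v1-tiny-en", trust_remote_code=True)

query = "How long can the input sequences be?"
documents = [
    "jina-reranker-v1-tiny-en supports sequences of up to 8,192 tokens.",
    "The model has 4 layers and 33.0 million parameters.",
    "Berlin is the capital of Germany.",
]

# Score each (query, document) pair; higher scores mean higher relevance.
scores = model.predict([(query, doc) for doc in documents])

# Sort documents by descending score to obtain the reranked order.
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.4f}  {doc}")
```

For the `transformers` route (item 3), the same checkpoint loads as a sequence-classification model with `trust_remote_code=True`; its single output logit per query-document pair serves as the relevance score.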
We evaluated Jina Reranker on three key benchmarks to ensure top-tier performance and search relevance.

| Model Name                                 | NDCG@10 (17 BEIR datasets) | NDCG@10 (5 LoCo datasets) | Hit Rate (LlamaIndex RAG) |
| ------------------------------------------ | -------------------------- | ------------------------- | ------------------------- |
| `jina-reranker-v1-base-en`                 | 52.45                      | 87.31                     | 85.53                     |
| `jina-reranker-v1-turbo-en`                | 49.60                      | 69.21                     | 85.13                     |
| `jina-reranker-v1-tiny-en` (you are here)  | 48.54                      | 70.29                     | 85.00                     |
| `mxbai-rerank-base-v1`                     | 49.19                      | -                         | 82.50                     |
| `mxbai-rerank-xsmall-v1`                   | 48.80                      | -                         | 83.69                     |
| `ms-marco-MiniLM-L-6-v2`                   | 48.64                      | -                         | 82.63                     |
| `ms-marco-MiniLM-L-4-v2`                   | 47.81                      | -                         | 83.82                     |
| `bge-reranker-base`                        | 47.89                      | -                         | 83.03                     |

- `NDCG@10` is a measure of ranking quality, with higher scores indicating better search results.
- `Hit Rate` measures the percentage of relevant documents that appear in the top 10 search results.
- LoCo results are not available for the other models, since they do not support documents longer than 512 tokens.

For more details, please refer to our benchmarking sheets. Join our Discord community and chat with other community members about ideas.
TinyMistral-248M-Chat-v4
Minueza-32M-Base
gguf-sharded-Qwen2-0.5B-Instruct
gguf-Qwen1.5-0.5B-Chat
gguf-flan-t5-small
gguf-gemma-2b-orpo
gguf-flan-t5-large
gguf-Aira-2-355M
gguf-Q2_K_S-Mixed-AutoRound-MiniMax-M2.1
gguf-TinyMistral-248M-Chat-v2
gguf-pythia-1.4b-sft-full
gguf-multi-qa-MiniLM-L6-cos-v1
gguf-Smol-Llama-101M-Chat-v1
gguf-Qwen2-0.5B-Instruct
gguf-Q5_K_M-Qwen2.5-0.5B-Instruct
gguf-MobileLLaMA-1.4B-Chat
gguf-Phi-3-mini-4k-instruct
gguf-openhermes-tinyllama-sft-qlora
gguf-LaMini-Flan-T5-248M
gguf-flan-t5-base
gguf-sharded-LaMini-Flan-T5-783M
gguf-q5_k_m-granite-3.0-2b-instruct
gguf-WizardVicuna-pythia-410m-deduped
gguf-sharded-Aira-2-355M
gguf-sharded-WizardVicuna-pythia-410m-deduped
gguf-Qwen1.5-0.5B-Chat_llamafy
gguf-sharded-Qwen2-1.5B-Instruct
gguf-sharded-Qwen1.5-0.5B-Chat_llamafy
gguf-Qwen2-1.5B-Instruct
gguf-SmolLM-135M-Instruct
gguf-Qwen2-0.5B-Instruct-llamafy
gguf-sharded-Qwen2-0.5B-Instruct-llamafy
gguf-sharded-gemma-2b-orpo
gguf-sharded-UD-Q4_K_XL-Qwen3-0.6B
gguf-sharded-Phi-3-mini-4k-instruct
Llama-160M-Chat-v1
Language: en License: apache-2.0
gguf-zephyr-220m-dpo-full
gguf-Llama-160M-Chat-v1
gguf-h2o-danube3-500m-chat
gguf-spin_gpt2_medium_alpaca_e2
Smol-Llama-101M-Chat-v1
gguf-TinyMistral-248M-SFT-v4
gguf-LaMini-Flan-T5-77M
gguf-Q8_0-bge-reranker-v2-m3
gguf-gemma-2-2b-it-abliterated
gguf-sharded-TinyMistral-248M-Chat-v2
gguf-1.5-Pints-2K-v0.1
gguf-sharded-openhermes-1b-olmo-sft-qlora
gguf-internlm2-chat-1_8b
gguf-sharded-internlm2-chat-1_8b
gguf-t5-base-grammar-correction
gguf-Pythia-Chat-Base-7B
gguf-MicroLlama
gguf-TinySolar-248m-4k-code-instruct
gguf-Q5_K_M-Fox-1-1.6B-Instruct-v0.1
gguf-sharded-flan-t5-large
gguf-openhermes-1b-olmo-sft-qlora
gguf-zephyr-1b-olmo-sft-qlora
gguf-sharded-TinySolar-248m-4k-code-instruct
gguf-Lite-Mistral-150M-v2-Instruct
gguf-prem-1B-chat
gguf-q8_0-madlad400-3b-mt
gguf-sharded-Llama-160M-Chat-v1
gguf-sharded-h2o-danube2-1.8b-chat
gguf-llama-160m
gguf-Lite-Oute-1-65M-Instruct
gguf-Q8_0-Qwen2.5-0.5B-Instruct
gguf-sharded-Qwen1.5-0.5B-Chat
gguf-Q5_K_M-smollm-360M-instruct-add-basics
gguf-sharded-falcon-mamba-7b-instruct
gguf-Hare-1.1B-Chat
gguf-OLMoE-1B-7B-0924-Instruct
gguf-q5_k_m-madlad400-3b-mt
gguf-sharded-prem-1B-chat
gguf-h2o-danube2-1.8b-chat
gguf-NuExtract-tiny
candle-quantized-LaMini-Flan-T5-248M
gguf-flan-t5-base-instruct-dolly_hhrlhf
gguf-Q4_K_M-Yi-1.5-6B-Chat
gguf-TinyMistral-248M-Chat-v1
gguf-smol_llama-220M-openhermes
gguf-sharded-MobileLLaMA-1.4B-Chat
gguf-DopeyTinyLlama-1.1B-v1
gguf-flan-alpaca-base
gguf-gpt2-chatbot
gguf-sharded-zephyr-1b-olmo-sft-qlora
gguf-Sheared-Pythia-160m-Platypus
gguf-t5-address-standardizer
gguf-sharded-Q5_K_L-Llama-3.2-3B-Instruct
gguf-Q8_0-SmolLM2-360M-Instruct
GGUF version of HuggingFaceTB/SmolLM2-360M-Instruct.
gguf-Pythia-31M-Chat-v1
gguf-pythia-3b-deduped-sft
gguf-MaxMini-Instruct-248M
gguf-SmolLM-360M-Instruct
gguf-sharded-Qwen2-1.5B-Instruct-imat
gguf-Q4_K_S-MiniCPM4-0.5B-QAT-Int4-unquantized
Felladrin/MiniCPM4-0.5B-QAT-Int4-unquantized-Q4_K_S-GGUF

This model was converted to GGUF format from `openbmb/MiniCPM4-0.5B-QAT-Int4-unquantized` using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), or build it from source. Step 1: clone llama.cpp from GitHub. Step 2: move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g., `LLAMA_CUDA=1` for Nvidia GPUs on Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
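Since the card stops before showing an actual invocation, here is a minimal sketch that fetches and runs the checkpoint through the llama-cpp-python bindings instead of the llama.cpp CLI; the filename glob is an assumption based on the repo's naming convention:

```python
# Minimal sketch: running this GGUF checkpoint with llama-cpp-python
# (`pip install llama-cpp-python huggingface_hub`).
from llama_cpp import Llama

# from_pretrained downloads the first file in the repo matching the glob;
# the pattern below is an assumption about how the GGUF file is named.
llm = Llama.from_pretrained(
    repo_id="Felladrin/MiniCPM4-0.5B-QAT-Int4-unquantized-Q4_K_S-GGUF",
    filename="*q4_k_s.gguf",
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what MiniCPM4 is."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```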
Pythia-31M-Chat-v1
gguf-TinyMistral-248M-v2.5-Instruct-orpo
gguf-q5_k_l-imat-arcee-lite
gguf-Q3_K_XL-falcon-mamba-7b
gguf-vicuna-68m
gguf-TinyLlama-1.1B-1T-OpenOrca
gguf-sharded-spin_gpt2_medium_alpaca_e2
gguf-IPythia-410m
gguf-LaMini-Flan-T5-783M
gguf-TinySolar-248m-4k
gguf-sharded-Q5_K_L-Llama-3.2-1B-Instruct
gguf-Q5_K_M-NanoLM-1B-Instruct-v2
gguf-sharded-pythia-3b-deduped-sft
gguf-Aira-2-124M
gguf-Aira-2-124M-DPO
gguf-MiniMA-2-1B
gguf-q5_k_m-phi-3.5-mini-instruct
gguf-sharded-openhermes-tinyllama-sft-qlora
gguf-sharded-Aira-2-124M-DPO
gguf-sharded-IPythia-410m
gguf-sharded-TinyMistral-248M-v2.5-Instruct-orpo
gguf-sharded-pythia-1.4b-sft-full
gguf-sharded-Aira-2-124M
gguf-falcon-mamba-7b-instruct
Minueza-2-96M-Instruct-Variant-10
Minueza-32M-UltraChat
Language model with Apache 2.0 license.
gguf-sharded-LaMini-Flan-T5-248M
gguf-774M-03_09_2024
gguf-gpt2-alpaca-gpt4
gguf-flan-t5-small-finetuned-openai-summarize_from_feedback
gguf-sharded-gemma-2-2b-it-abliterated
gguf-Q5_K_M-Qwen3-4B-Merge-Variant-01
Felladrin/gguf-Q5_K_M-Qwen3-4B-Merge-Variant-01

This model was converted to GGUF format from `Felladrin/Qwen3-4B-Merge-Variant-01` using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux), or build it from source. Step 1: clone llama.cpp from GitHub. Step 2: move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g., `LLAMA_CUDA=1` for Nvidia GPUs on Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
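As an alternative to the CLI route above, a minimal sketch using the llama-cpp-python bindings, assuming the Q5_K_M file has already been downloaded; the local path, context size, and prompt are illustrative:

```python
# Minimal sketch: loading a local GGUF file with llama-cpp-python
# (`pip install llama-cpp-python`) instead of the llama.cpp CLI.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3-4b-merge-variant-01-q5_k_m.gguf",  # hypothetical local path
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers when built with GPU support
)

output = llm(
    "Explain in one sentence what Q5_K_M quantization trades off.",
    max_tokens=96,
)
print(output["choices"][0]["text"])
```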
gguf-sharded-Q4_K_S-AFM-4.5B
Sharded GGUF version of bartowski/arcee-ai_AFM-4.5B-GGUF.
gguf-mamba-130m-hf
gguf-Q5_K_M-NanoLM-0.3B-Instruct-v2
Minueza-2-96M
gguf-Lite-Oute-1-300M-Instruct
gguf-sharded-Q4_K_S-gemma-3-270m-it
Sharded GGUF version of bartowski/google_gemma-3-270m-it-GGUF.
gguf-q5_k_m-tinydolphin-2.8.2-1.1b-laser
gguf-Q5_K_M-MagpieLM-4B-Chat-v0.1
gguf-sharded-UD-Q4_K_XL-Phi-4-mini-reasoning
Sharded GGUF version of unsloth/Phi-4-mini-reasoning-GGUF.
gguf-sharded-Q4_K_S-MiniCPM4-0.5B-QAT
Sharded GGUF version of Felladrin/MiniCPM4-0.5B-QAT-Int4-unquantized-Q4_K_S-GGUF.
gguf-Minueza-32M-Chat
gguf-Minueza-32Mx2-Chat
gguf-IQ3_XXS-OLMo-7B-0424-Instruct-hf
gguf-sharded-UD-Q4_K_XL-Qwen3-1.7B
gguf-sharded-Q4_K_S-Polaris-4B-Preview
gguf-sharded-Q4_K_S-cogito-v1-preview-llama-3B
gguf-Tinyllama-616M-Cinder
gguf-q5_k_m-imat-qwen2-0.5b-instruct
gguf-Q5_K_M-Nemotron-Mini-4B-Instruct
gguf-Q8_0-smollm-135M-instruct-v0.2
gguf-vicuna-160m
gguf-phi-1_5
gguf-flan-t5-small-cnndm
gguf-Q8_0-all-MiniLM-L6-v2
gguf-Mixnueza-6x32M-MoE
gguf-Minueza-32M-Base
gguf-q5_k_m-h2o-danube2-1.8b-chat
gguf-Q5_K_M-LlamaCorn-1.1B-Chat
gguf-Q8_0-Qwen2.5-Coder-1.5B-Instruct
mlx-5bit-Qwen3-4B-Merge-Variant-01
Qwen2-96M
gguf-Minueza-32M-UltraChat
gguf-sharded-TinyLlama-1.1B-1T-OpenOrca
gguf-sharded-h2o-danube3-500m-chat
gguf-Q3_K_L-Yi-1.5-6B-Chat
gguf-sharded-Q4_K_S-DeepSeek-R1-Distill-Qwen-1.5B
gguf-sharded-Q4_K_S-LFM2-350M
gguf-sharded-Q4_K_S-LFM2-700M
gguf-sharded-OLMo-7B-Instruct
gguf-Q5_K_L-Nemotron-Mini-4B-Instruct
gguf-Q5_K_M-SmolLM2-1.7B-Instruct
GGUF version of HuggingFaceTB/SmolLM2-1.7B-Instruct.
gguf-sharded-Q4_K_S-SmolLM3-3B
Sharded GGUF version of bartowski/HuggingFaceTB_SmolLM3-3B-GGUF.
gguf-sharded-Q4_K_S-granite-3.3-2b-instruct
Minueza-32M-Chat
gguf-1.5-Pints-16K-v0.1
onnx-gpt2-medium-chat
onnx-gpt2-conversational-retrain
gguf-sharded-vicuna-160m
gguf-Q5_K_M-Phi-1_5-Instruct-v0.1
gguf-sharded-Q4_K_S-OLMoE-1B-7B-0924-Instruct
gguf-sharded-Q4_K_S-LFM2-1.2B
Minueza-2-96M-Instruct-Variant-02
gguf-sharded-Q3_K_L-OLMoE-1B-7B-0924-Instruct
gguf-sharded-Q4_K_S-Llama-3.1-Nemotron-Nano-4B-v1.1
Sharded GGUF version of bartowski/nvidia_Llama-3.1-Nemotron-Nano-4B-v1.1-GGUF.
gguf-sharded-stablelm-2-1_6b-chat
gguf-sharded-phi-2-orange-v2
gguf-q5_k_m-h2o-danube3-500m-chat
gguf-Q2_K_L-Llama-3.1-SuperNova-Lite
gguf-sharded-Q5_K_L-granite-3.0-3b-a800m-instruct
Sharded GGUF version of bartowski/granite-3.0-3b-a800m-instruct-GGUF.
gguf-sharded-Q8_0-Qwen2.5-Coder-0.5B-Instruct
gguf-Q4_0-Qwen2.5-Coder-32B-Instruct-abliterated
gguf-sharded-UD-Q4_K_XL-OLMo-2-0425-1B-Instruct
Qwen3-4B-Merge-Variant-01
llama2_xs_460M_experimental_evol_instruct
onnx-megatron-gpt2-345m-evol_instruct_v2
onnx-Pythia-31M-Chat-v1
onnx-Minueza-32M-Chat
gguf-mamba-370m-hf
gguf-stablelm-2-1_6b-chat
gguf-sharded-zephyr-220m-dpo-full
gguf-Q4_K_S-OLMo-7B-0424-Instruct-hf
gguf-Q5_K_M-NanoLM-70M-Instruct-v1
gguf-sharded-Q5_K_L-Replete-LLM-V2.5-Qwen-3b
gguf-sharded-Q5_K_L-Replete-LLM-V2.5-Qwen-0.5b
gguf-Q8_0-SmolLM2-135M-Instruct
GGUF version of HuggingFaceTB/SmolLM2-135M-Instruct.
gguf-Q4_K_M-MiniCPM3-4B
gguf-Q8_0-LaMini-Flan-T5-248M
gguf-sharded-Q4_K_S-gemma-3-4b-it
Minueza-2-96M-Instruct-Variant-03
gguf-sharded-Q4_K_S-Apriel-5B-Instruct-llamafied
gguf-sharded-Q4_K_S-h2o-danube3.1-4b-chat
gguf-sharded-Q4_K_S-gemma-3n-E2B-it
Sharded GGUF version of bartowski/google_gemma-3n-E2B-it-GGUF.
gguf-sharded-Q4_K_S-Falcon-H1-0.5B-Instruct
Sharded GGUF version of mradermacher/Falcon-H1-0.5B-Instruct-i1-GGUF.
Minueza-32Mx2-Chat
Sheared-Pythia-160m-Platypus
onnx-gpt2-large-conversational-retrain
onnx-GPT2-Medium-Alpaca-355m
onnx-llama2_xs_460M_experimental_evol_instruct
onnx-gpt2-alpaca
gguf-sharded-Phi-3-mini-4k-instruct-iMat
gguf-sharded-smashed-WizardLM-2-7B
gguf-sharded-Mistral-7B-OpenOrca
gguf-sharded-wavecoder-ultra-6.7b
mlc-q4f16-Phi-3.5-mini-instruct
gguf-Q5_K_M-TinyJensen-1.1B-Chat
gguf-Q5_K_M-fastchat-t5-3b-v1.0
gguf-Q5_K_M-Sheared-LLaMA-1.3B-ShareGPT
gguf-Q5_K_M-OLMo-1B-SFT-hf
gguf-sharded-q3_k_m-jais-adapted-7b-chat
Sharded GGUF version of QuantFactory/jais-adapted-7b-chat-GGUF.
gguf-q8_0-h2o-danube3-500m-chat
gguf-sharded-Q4_K_S-Qwen2.5-0.5B-Instruct
gguf-sharded-q5_k_m-internlm2_5-1_8b-chat
Sharded GGUF version of internlm/internlm2_5-1_8b-chat-gguf.
gguf-Q4_0-Qwen2.5-Coder-32B-Instruct
gguf-sharded-Q4_K_S-SmolLM2-135M-Instruct
gguf-sharded-Q4_K_S-TAID-LLM-1.5B
gguf-sharded-Q3_K_M-OLMoE-1B-7B-0125-Instruct
Minueza-2-96M-Instruct-Variant-06
Minueza-2-96M-Instruct-Variant-08
onnx-TinyMistral-248M-v2
LaMini-Neo-125M-Evol-Instruct
onnx-bloomz-560m-sft-chat
llama2_xs_460M_experimental_platypus
onnx-flan-alpaca-base
onnx-Cerebras-GPT-111M-instruction
onnx-flan-t5-base-samsum
onnx-Evol-Orca-LaMini-flan-t5-small
onnx-Smol-Llama-101M-Chat-v1
onnx-tinyllama-15M
onnx-tinyllama-42M
onnx-Gerbil-A-32m
Minueza-32M-Deita
onnx-TinyMistral-248M-Chat-v1
gguf-TinyLlama-1.1B-Chat-v1.0
mlc-q4f16_1-gemma-2-2b-it
gguf-sharded-Q3_K_XL-OLMoE-1B-7B-0924-Instruct
gguf-Q5_K_M-TinyLlama-1.1B-Chat-v1.0
gguf-sharded-Q5_K_M-TinyLlama-1.1B-Chat-v1.0
gguf-sharded-Q5_K_M-LlamaCorn-1.1B-Chat
gguf-sharded-F16-1.5-Pints-2K-v0.1
gguf-sharded-BF16-1.5-Pints-16K-v0.1
gguf-sharded-Q5_K-1.5-Pints-2K-v0.1
gguf-sharded-q5_k_l-granite-3.0-1b-a400m-instruct
gguf-Q5_K_L-AMD-OLMo-1B-SFT-DPO
Sharded GGUF version of bartowski/AMD-OLMo-1B-SFT-DPO-GGUF.
gguf-sharded-Q5_K_L-h2o-danube3-500m-chat
gguf-sharded-Q5_K_M-EXAONE-3.5-2.4B-Instruct
gguf-sharded-Q4_K_S-granite-3.1-1b-a400m-instruct
gguf-sharded-Q4_K_S-SmolLM2-360M-Instruct
Sharded GGUF version of bartowski/SmolLM2-360M-Instruct-GGUF.
gguf-sharded-Q4_K_S-AMD-OLMo-1B-SFT-DPO
gguf-sharded-Q4_K_S-h2o-danube3-500m-chat
gguf-sharded-Q4_K_S-MiniCPM3-4B
gguf-sharded-Q4_K_S-Phi-3.5-mini-instruct
gguf-sharded-Q4_K_S-Falcon3-1B-Instruct
gguf-sharded-Q4_K_S-granite-3.1-3b-a800m-instruct
gguf-sharded-Q4_K_S-pythia-1.4b-sft-full
Sharded GGUF version of Felladrin/gguf-pythia-1.4b-sft-full.
gguf-Q4_K_S-1.5-Pints-16K-v0.1
gguf-sharded-Q4_K_S-internlm2_5-1_8b-chat
gguf-sharded-Q4_K_S-EXAONE-3.5-2.4B-Instruct
gguf-sharded-Q4_K_S-MagpieLM-4B-Chat-v0.1
gguf-sharded-Q4_K_S-Nemotron-Mini-4B-Instruct
gguf-sharded-Q4_K_S-stablelm-2-zephyr-1.6b
Sharded GGUF version of second-state/stablelm-2-zephyr-1.6b-GGUF.