onnx-community

500 models • 11 total models in database
Sort by:

Kokoro-82M-v1.0-ONNX

Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). - Usage - JavaScript - Python - Voices/Samples - Quantizations First, install the `kokoro-js` library from NPM using: ```python import os import numpy as np from onnxruntime import InferenceSession You can generate token ids as follows: 1. Convert input text to phonemes using https://github.com/hexgrad/misaki 2. Map phonemes to ids using https://huggingface.co/hexgrad/Kokoro-82M/blob/785407d1adfa7ae8fbef8ffd85f34ca127da3039/config.json#L34-L148 tokens = [50, 157, 43, 135, 16, 53, 135, 46, 16, 43, 102, 16, 56, 156, 57, 135, 6, 16, 102, 62, 61, 16, 70, 56, 16, 138, 56, 156, 72, 56, 61, 85, 123, 83, 44, 83, 54, 16, 53, 65, 156, 86, 61, 62, 131, 83, 56, 4, 16, 54, 156, 43, 102, 53, 16, 156, 72, 61, 53, 102, 112, 16, 70, 56, 16, 138, 56, 44, 156, 76, 158, 123, 56, 16, 62, 131, 156, 43, 102, 54, 46, 16, 102, 48, 16, 81, 47, 102, 54, 16, 54, 156, 51, 158, 46, 16, 70, 16, 92, 156, 135, 46, 16, 54, 156, 43, 102, 48, 4, 16, 81, 47, 102, 16, 50, 156, 72, 64, 83, 56, 62, 16, 156, 51, 158, 64, 83, 56, 16, 44, 157, 102, 56, 16, 44, 156, 76, 158, 123, 56, 4] Context length is 512, but leave room for the pad token 0 at the start & end assert len(tokens) Life is like a box of chocolates. You never know what you're gonna get. | Name | Nationality | Gender | Sample | | ------------ | ----------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------- | | afheart | American | Female | | | afalloy | American | Female | | | afaoede | American | Female | | | afbella | American | Female | | | afjessica | American | Female | | | afkore | American | Female | | | afnicole | American | Female | | | afnova | American | Female | | | afriver | American | Female | | | afsarah | American | Female | | | afsky | American | Female | | | amadam | American | Male | | | amecho | American | Male | | | americ | American | Male | | | amfenrir | American | Male | | | amliam | American | Male | | | ammichael | American | Male | | | amonyx | American | Male | | | ampuck | American | Male | | | amsanta | American | Male | | | bfalice | British | Female | | | bfemma | British | Female | | | bfisabella | British | Female | | | bflily | British | Female | | | bmdaniel | British | Male | | | bmfable | British | Male | | | bmgeorge | British | Male | | | bmlewis | British | Male | | The model is resilient to quantization, enabling efficient high-quality speech synthesis at a fraction of the original model size. > How could I know? It's an unanswerable question. Like asking an unborn child if they'll lead a good life. They haven't even been born. | Model | Size (MB) | Sample | |------------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------| | model.onnx (fp32) | 326 | | | modelfp16.onnx (fp16) | 163 | | | modelquantized.onnx (8-bit) | 92.4 | | | modelq8f16.onnx (Mixed precision) | 86 | | | modeluint8.onnx (8-bit & mixed precision) | 177 | | | modeluint8f16.onnx (Mixed precision) | 114 | | | modelq4.onnx (4-bit matmul) | 305 | | | modelq4f16.onnx (4-bit matmul & fp16 weights) | 154 | |

license:apache-2.0
47,563
162

gemma-4-E2B-it-ONNX

NaNK
license:apache-2.0
43,448
12

Medical-NER-ONNX

22,226
1

xlm-roberta-base-squad2-distilled-ONNX

This is an ONNX version of deepset/xlm-roberta-base-squad2-distilled. It was automatically converted and uploaded using this space.

21,923
0

t5-base-grammar-correction-ONNX

This is an ONNX version of vennify/t5-base-grammar-correction. It was automatically converted and uploaded using this space.

20,086
0

whisper-base

16,082
22

Kokoro-82M-ONNX

Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). - Samples - Usage - JavaScript - Python - Quantizations > Life is like a box of chocolates. You never know what you're gonna get. | Voice | Nationality | Gender | Sample | |--------------------------|-------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------| | Default (`af`) | American | Female | | | Bella (`afbella`) | American | Female | | | Nicole (`afnicole`) | American | Female | | | Sarah (`afsarah`) | American | Female | | | Sky (`afsky`) | American | Female | | | Adam (`amadam`) | American | Male | | | Michael (`ammichael`) | American | Male | | | Emma (`bfemma`) | British | Female | | | Isabella (`bfisabella`) | British | Female | | | George (`bmgeorge`) | British | Male | | | Lewis (`bmlewis`) | British | Male | | First, install the `kokoro-js` library from NPM using: ```python import os import numpy as np from onnxruntime import InferenceSession Tokens produced by phonemize() and tokenize() in kokoro.py tokens = [50, 157, 43, 135, 16, 53, 135, 46, 16, 43, 102, 16, 56, 156, 57, 135, 6, 16, 102, 62, 61, 16, 70, 56, 16, 138, 56, 156, 72, 56, 61, 85, 123, 83, 44, 83, 54, 16, 53, 65, 156, 86, 61, 62, 131, 83, 56, 4, 16, 54, 156, 43, 102, 53, 16, 156, 72, 61, 53, 102, 112, 16, 70, 56, 16, 138, 56, 44, 156, 76, 158, 123, 56, 16, 62, 131, 156, 43, 102, 54, 46, 16, 102, 48, 16, 81, 47, 102, 54, 16, 54, 156, 51, 158, 46, 16, 70, 16, 92, 156, 135, 46, 16, 54, 156, 43, 102, 48, 4, 16, 81, 47, 102, 16, 50, 156, 72, 64, 83, 56, 62, 16, 156, 51, 158, 64, 83, 56, 16, 44, 157, 102, 56, 16, 44, 156, 76, 158, 123, 56, 4] Context length is 512, but leave room for the pad token 0 at the start & end assert len(tokens) How could I know? It's an unanswerable question. Like asking an unborn child if they'll lead a good life. They haven't even been born. | Model | Size (MB) | Sample | |------------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------| | model.onnx (fp32) | 326 | | | modelfp16.onnx (fp16) | 163 | | | modelquantized.onnx (8-bit) | 92.4 | | | modelq8f16.onnx (Mixed precision) | 86 | | | modeluint8.onnx (8-bit & mixed precision) | 177 | | | modeluint8f16.onnx (Mixed precision) | 114 | | | modelq4.onnx (4-bit matmul) | 305 | | | modelq4f16.onnx (4-bit matmul & fp16 weights) | 154 | |

license:apache-2.0
13,739
152

embeddinggemma-300m-ONNX

Responsible Generative AI Toolkit EmbeddingGemma on Kaggle EmbeddingGemma on Vertex Model Garden EmbeddingGemma is a 300M parameter, state-of-the-art for its size, open embedding model from Google, built from Gemma 3 (with T5Gemma initialization) and the same research and technology used to create Gemini models. EmbeddingGemma produces vector representations of text, making it well-suited for search and retrieval tasks, including classification, clustering, and semantic similarity search. This model was trained with data in 100+ spoken languages. The small size and on-device focus makes it possible to deploy in environments with limited resources such as mobile phones, laptops, or desktops, democratizing access to state of the art AI models and helping foster innovation for everyone. - Input: - Text string, such as a question, a prompt, or a document to be embedded - Maximum input context length of 2048 tokens - Output: - Numerical vector representations of input text data - Output embedding dimension size of 768, with smaller options available (512, 256, or 128) via Matryoshka Representation Learning (MRL). MRL allows users to truncate the output embedding of size 768 to their desired size and then re-normalize for efficient and accurate representation. These model weights are designed to be used with Transformers.js. NOTE: EmbeddingGemma activations do not support `fp16` or its derivatives. Please use `fp32`, `q8`, or `q4` as appropriate for your hardware. Using the ONNX Runtime in Text Embeddings Inference (TEI) This model was trained on a dataset of text data that includes a wide variety of sources totaling approximately 320 billion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 100 languages. - Code and Technical Documents: Exposing the model to code and technical documentation helps it learn the structure and patterns of programming languages and specialized scientific content, which improves its understanding of code and technical questions. - Synthetic and Task-Specific Data: Synthetically training data helps to teach the model specific skills. This includes curated data for tasks like information retrieval, classification, and sentiment analysis, which helps to fine-tune its performance for common embedding applications. The combination of these diverse data sources is crucial for training a powerful multilingual embedding model that can handle a wide variety of different tasks and data formats. Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with our policies. EmbeddingGemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e), for more details refer to the Gemma 3 model card. Training was done using JAX and ML Pathways. For more details refer to the Gemma 3 model card. The model was evaluated against a large collection of different datasets and metrics to cover different aspects of text understanding. Quant config (dimensionality) Mean (Task) Mean (TaskType) Quant config (dimensionality) Mean (Task) Mean (TaskType) Quant config (dimensionality) Mean (Task) Mean (TaskType) \ Mixed Precision refers to per-channel quantization with int4 for embeddings, feedforward, and projection layers, and int8 for attention (e4a8f4p4). EmbeddingGemma can generate optimized embeddings for various use cases—such as document retrieval, question answering, and fact verification—or for specific input types—either a query or a document—using prompts that are prepended to the input strings. Query prompts follow the form `task: {task description} | query: ` where the task description varies by the use case, with the default task description being `search result`. Document-style prompts follow the form `title: {title | "none"} | text: ` where the title is either `none` (the default) or the actual title of the document. Note that providing a title, if available, will improve model performance for document prompts but may require manual formatting. Use the following prompts based on your use case and input data type. These may already be available in the EmbeddingGemma configuration in your modeling framework of choice. Used to generate embeddings that are optimized for document search or information retrieval Used to generate embeddings that are optimized to classify texts according to preset labels Used to generate embeddings that are optimized to cluster texts based on their similarities Used to generate embeddings that are optimized to assess text similarity. This is not intended for retrieval use cases. Used to retrieve a code block based on a natural language query, such as sort an array or reverse a linked list . Embeddings of the code blocks are computed using retrievaldocument. These models have certain limitations that users should be aware of. Open embedding models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Semantic Similarity: Embeddings optimized to assess text similarity, such as recommendation systems and duplicate detection - Classification: Embeddings optimized to classify texts according to preset labels, such as sentiment analysis and spam detection - Clustering: Embeddings optimized to cluster texts based on their similarities, such as document organization, market research, and anomaly detection - Retrieval - Document: Embeddings optimized for document search, such as indexing articles, books, or web pages for search - Query: Embeddings optimized for general search queries, such as custom search - Code Query: Embeddings optimized for retrieval of code blocks based on natural language queries, such as code suggestions and search - Question Answering: Embeddings for questions in a question-answering system, optimized for finding documents that answer the question, such as chatbox. - Fact Verification: Embeddings for statements that need to be verified, optimized for retrieving documents that contain evidence supporting or refuting the statement, such as automated fact-checking systems. - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of embeddings. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. - Privacy violations: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. At the time of release, this family of models provides high-performance open embedding model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown superior performance to other, comparably-sized open model alternatives.

13,447
38

moonshine-base-ONNX

license:mit
10,880
29

whisper-large-v3-turbo

10,027
66

Qwen3-Embedding-0.6B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
9,841
35

granite-docling-258M-ONNX

license:apache-2.0
7,388
1

nanochat-d32-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
license:mit
6,827
5

depth-anything-v2-small

license:apache-2.0
4,868
23

whisper-small

4,780
0

gpt-oss-20b-ONNX

NaNK
license:apache-2.0
4,710
7

granite-4.0-1b-speech-ONNX

NaNK
license:apache-2.0
4,660
5

ormbg-ONNX

license:apache-2.0
3,986
10

granite-4.0-1b-ONNX-web

NaNK
license:apache-2.0
2,910
1

granite-4.0-micro-ONNX-web

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

license:apache-2.0
2,364
6

Qwen3-0.6B-ONNX

NaNK
2,218
35

whisper-medium-ONNX

2,147
0

whisper-base_timestamped

2,074
28

granite-4.0-350m-ONNX-web

license:apache-2.0
2,008
1

BEN2-ONNX

NaNK
license:mit
1,845
8

whisper-large-v3-turbo_timestamped

1,832
7

gemma-3-270m-it-ONNX

[Gemma 3 Technical Report][g3-tech-report] [Responsible Generative AI Toolkit][rai-toolkit] [Gemma on Kaggle][kaggle-gemma] [Gemma on Vertex Model Garden][vertex-mg-gemma3] Summary description and brief definition of inputs and outputs. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. - Input: - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B and 270M sizes. - Output: - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context up to 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B and 270M sizes per request, subtracting the request input tokens If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Data used for model training and how the data was processed. These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens, the 1B with 2 trillion tokens, and the 270M with 6 trillion tokens. The knowledge cutoff date for the training data was August 2024. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; "the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow." These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation. Evaluation results marked with IT are for instruction-tuned models. Evaluation results marked with PT are for pre-trained models. | Benchmark | n-shot | Gemma 3 PT 270M | | :------------------------ | :-----------: | ------------------: | | [HellaSwag][hellaswag] | 10-shot | 40.9 | | [BoolQ][boolq] | 0-shot | 61.4 | | [PIQA][piqa] | 0-shot | 67.7 | | [TriviaQA][triviaqa] | 5-shot | 15.4 | | [ARC-c][arc] | 25-shot | 29.0 | | [ARC-e][arc] | 0-shot | 57.7 | | [WinoGrande][winogrande] | 5-shot | 52.0 | [hellaswag]: https://arxiv.org/abs/1905.07830 [boolq]: https://arxiv.org/abs/1905.10044 [piqa]: https://arxiv.org/abs/1911.11641 [triviaqa]: https://arxiv.org/abs/1705.03551 [arc]: https://arxiv.org/abs/1911.01547 [winogrande]: https://arxiv.org/abs/1907.10641 | Benchmark | n-shot | Gemma 3 IT 270m | | :------------------------ | :-----------: | ------------------: | | [HellaSwag][hellaswag] | 0-shot | 37.7 | | [PIQA][piqa] | 0-shot | 66.2 | | [ARC-c][arc] | 0-shot | 28.2 | | [WinoGrande][winogrande] | 0-shot | 52.3 | | [BIG-Bench Hard][bbh] | few-shot | 26.7 | | [IF Eval][ifeval] | 0-shot | 51.2 | [hellaswag]: https://arxiv.org/abs/1905.07830 [piqa]: https://arxiv.org/abs/1911.11641 [arc]: https://arxiv.org/abs/1911.01547 [winogrande]: https://arxiv.org/abs/1907.10641 [bbh]: https://paperswithcode.com/dataset/bbh [bbh]: https://paperswithcode.com/dataset/bbh [ifeval]: https://arxiv.org/abs/2311.07911 | Benchmark | n-shot | Gemma 3 IT 1B | Gemma 3 IT 4B | Gemma 3 IT 12B | Gemma 3 IT 27B | |--------------------------------|--------|:-------------:|:-------------:|:--------------:|:--------------:| | [GPQA][gpqa] Diamond | 0-shot | 19.2 | 30.8 | 40.9 | 42.4 | | [SimpleQA][simpleqa] | 0-shot | 2.2 | 4.0 | 6.3 | 10.0 | | [FACTS Grounding][facts-grdg] | - | 36.4 | 70.1 | 75.8 | 74.9 | | [BIG-Bench Hard][bbh] | 0-shot | 39.1 | 72.2 | 85.7 | 87.6 | | [BIG-Bench Extra Hard][bbeh] | 0-shot | 7.2 | 11.0 | 16.3 | 19.3 | | [IFEval][ifeval] | 0-shot | 80.2 | 90.2 | 88.9 | 90.4 | | Benchmark | n-shot | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [gpqa]: https://arxiv.org/abs/2311.12022 [simpleqa]: https://arxiv.org/abs/2411.04368 [facts-grdg]: https://goo.gle/FACTSpaper [bbeh]: https://github.com/google-deepmind/bbeh [ifeval]: https://arxiv.org/abs/2311.07911 [hellaswag]: https://arxiv.org/abs/1905.07830 [boolq]: https://arxiv.org/abs/1905.10044 [piqa]: https://arxiv.org/abs/1911.11641 [socialiqa]: https://arxiv.org/abs/1904.09728 [triviaqa]: https://arxiv.org/abs/1705.03551 [naturalq]: https://github.com/google-research-datasets/natural-questions [arc]: https://arxiv.org/abs/1911.01547 [winogrande]: https://arxiv.org/abs/1907.10641 [bbh]: https://paperswithcode.com/dataset/bbh [drop]: https://arxiv.org/abs/1903.00161 | Benchmark | n-shot | Gemma 3 IT 1B | Gemma 3 IT 4B | Gemma 3 IT 12B | Gemma 3 IT 27B | |----------------------------|--------|:-------------:|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] (Pro) | 0-shot | 14.7 | 43.6 | 60.6 | 67.5 | | [LiveCodeBench][lcb] | 0-shot | 1.9 | 12.6 | 24.6 | 29.7 | | [Bird-SQL][bird-sql] (dev) | - | 6.4 | 36.3 | 47.9 | 54.4 | | [Math][math] | 0-shot | 48.0 | 75.6 | 83.8 | 89.0 | | HiddenMath | 0-shot | 15.8 | 43.0 | 54.5 | 60.3 | | [MBPP][mbpp] | 3-shot | 35.2 | 63.2 | 73.0 | 74.4 | | [HumanEval][humaneval] | 0-shot | 41.5 | 71.3 | 85.4 | 87.8 | | [Natural2Code][nat2code] | 0-shot | 56.0 | 70.3 | 80.7 | 84.5 | | [GSM8K][gsm8k] | 0-shot | 62.8 | 89.2 | 94.4 | 95.9 | | Benchmark | n-shot | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: https://arxiv.org/abs/2009.03300 [agieval]: https://arxiv.org/abs/2304.06364 [math]: https://arxiv.org/abs/2103.03874 [gsm8k]: https://arxiv.org/abs/2110.14168 [gpqa]: https://arxiv.org/abs/2311.12022 [mbpp]: https://arxiv.org/abs/2108.07732 [humaneval]: https://arxiv.org/abs/2107.03374 [lcb]: https://arxiv.org/abs/2403.07974 [bird-sql]: https://arxiv.org/abs/2305.03111 [nat2code]: https://arxiv.org/abs/2405.04520 | Benchmark | n-shot | Gemma 3 IT 1B | Gemma 3 IT 4B | Gemma 3 IT 12B | Gemma 3 IT 27B | |--------------------------------------|--------|:-------------:|:-------------:|:--------------:|:--------------:| | [Global-MMLU-Lite][global-mmlu-lite] | 0-shot | 34.2 | 54.5 | 69.5 | 75.1 | | [ECLeKTic][eclektic] | 0-shot | 1.4 | 4.6 | 10.3 | 16.7 | | [WMT24++][wmt24pp] | 0-shot | 35.9 | 46.8 | 51.6 | 53.4 | | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: https://arxiv.org/abs/2210.03057 [flores]: https://arxiv.org/abs/2106.03193 [xquad]: https://arxiv.org/abs/1910.11856v3 [global-mmlu-lite]: https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite [wmt24pp]: https://arxiv.org/abs/2502.12404v1 [eclektic]: https://arxiv.org/abs/2502.21228 [indicgenbench]: https://arxiv.org/abs/2404.16816 | Benchmark | Gemma 3 IT 4B | Gemma 3 IT 12B | Gemma 3 IT 27B | |-----------------------------------|:-------------:|:--------------:|:--------------:| | [MMMU][mmmu] (val) | 48.8 | 59.6 | 64.9 | | [DocVQA][docvqa] | 75.8 | 87.1 | 86.6 | | [InfoVQA][info-vqa] | 50.0 | 64.9 | 70.6 | | [TextVQA][textvqa] | 57.8 | 67.7 | 65.1 | | [AI2D][ai2d] | 74.8 | 84.2 | 84.5 | | [ChartQA][chartqa] | 68.8 | 75.7 | 78.0 | | [VQAv2][vqav2] (val) | 62.4 | 71.6 | 71.0 | | [MathVista][mathvista] (testmini) | 50.0 | 62.9 | 67.6 | | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: https://cocodataset.org/#home [docvqa]: https://www.docvqa.org/ [info-vqa]: https://arxiv.org/abs/2104.12756 [mmmu]: https://arxiv.org/abs/2311.16502 [textvqa]: https://textvqa.org/ [realworldqa]: https://paperswithcode.com/dataset/realworldqa [remi]: https://arxiv.org/html/2406.09175v1 [ai2d]: https://allenai.org/data/diagrams [chartqa]: https://arxiv.org/abs/2203.10244 [vqav2]: https://visualqa.org/index.html [blinkvqa]: https://arxiv.org/abs/2404.12390 [okvqa]: https://okvqa.allenai.org/ [tallyqa]: https://arxiv.org/abs/1810.12440 [ss-vqa]: https://arxiv.org/abs/1908.02660 [countbenchqa]: https://github.com/google-research/bigvision/blob/main/bigvision/datasets/countbenchqa/ [mathvista]: https://arxiv.org/abs/2310.02255 Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - Child Safety: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - Content Safety: Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - Representational Harms: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct "assurance evaluations" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. These models have certain limitations that users should be aware of. Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. - Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. - Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - Privacy violations: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: https://arxiv.org/abs/2503.19786 [rai-toolkit]: https://ai.google.dev/responsible [kaggle-gemma]: https://www.kaggle.com/models/google/gemma-3 [vertex-mg-gemma3]: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemma3 [terms]: https://ai.google.dev/gemma/terms [safety-policies]: https://ai.google/static/documents/ai-responsibility-update-published-february-2025.pdf [prohibited-use]: https://ai.google.dev/gemma/prohibitedusepolicy [tpu]: https://cloud.google.com/tpu/docs/intro-to-tpu [sustainability]: https://sustainability.google/operating-sustainably/ [jax]: https://github.com/jax-ml/jax [ml-pathways]: https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/ [sustainability]: https://sustainability.google/operating-sustainably/ [gemini-2-paper]: https://arxiv.org/abs/2312.11805

1,810
26

whisper-base.en

1,734
1

gemma-4-E4B-it-ONNX

NaNK
license:apache-2.0
1,666
6

Supertonic-TTS-ONNX

1,625
11

whisper-tiny.en

1,606
0

whisper-tiny

1,569
0

gliner_base

1,513
0

dinov3-vits16-pretrain-lvd1689m-ONNX

1,453
13

cohere-transcribe-03-2026-ONNX

license:apache-2.0
1,406
4

FastVLM-0.5B-ONNX

FastVLM: Efficient Vision Encoding for Vision Language Models FastVLM was introduced in FastVLM: Efficient Vision Encoding for Vision Language Models. (CVPR 2025) Try it out using the online demo, which runs 100% locally in your browser with Transformers.js! If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
1,139
92

Janus-Pro-1B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
license:mit
1,007
49

mobilenetv4_conv_small.e2400_r224_in1k

NaNK
994
0

Qwen2.5-0.5B-Instruct

NaNK
986
8

Qwen3.5-0.8B-ONNX

NaNK
license:apache-2.0
953
6

SmolLM2-135M-ONNX

llama
911
2

Kokoro-82M-v1.0-ONNX-timestamped

license:apache-2.0
899
4

DeepSeek-R1-Distill-Qwen-1.5B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Text-generation w/ `onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX` Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
887
60

Florence-2-base-ft

license:mit
884
34

pyannote-segmentation-3.0

NaNK
license:mit
781
39

gliner_small-v2.1

NaNK
715
2

bge-reranker-v2-m3-ONNX

NaNK
661
2

functiongemma-270m-it-ONNX

659
7

siglip2-base-patch16-256-ONNX

NaNK
552
1

moonshine-tiny-ONNX

license:mit
538
7

dinov3-vits16-pretrain-lvd1689m-ONNX-MHA-scores

521
2

Phi-3.5-mini-instruct-onnx-web

NaNK
license:mit
497
15

LFM2-350M-ONNX

461
6

chatterbox-multilingual-ONNX

Chatterbox Multilingual Resemble AI's production-grade open source TTS model. Chatterbox Multilingual supports Arabic, Danish, German, Greek, English, Spanish, Finnish, French, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Swahili, Turkish, Chinese out of the box. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Chatterbox is provided in an exported ONNX format, enabling fast and portable inference with ONNX Runtime across platforms. Key Details - SoTA zeroshot English TTS - 0.5B Llama backbone - Unique exaggeration/intensity control - Ultra-stable with alignment-informed inference - Trained on 0.5M hours of cleaned data - Watermarked outputs (optional) - Easy voice conversion script using onnxruntime - Outperforms ElevenLabs Tips - General Use (TTS and Voice Agents): - The default settings (`exaggeration=0.5`, `cfg=0.5`) work well for most prompts. - Expressive or Dramatic Speech: - Try increase `exaggeration` to around `0.7` or higher. - Higher `exaggeration` tends to speed up speech; Usage Link to GitHub ONNX Export and Inference script Acknowledgements - Xenova - Vladislav Bronzov - Resemble AI Every audio file generated by Chatterbox includes Resemble AI's Perth (Perceptual Threshold) Watermarker - imperceptible neural watermarks that survive MP3 compression, audio editing, and common manipulations while maintaining nearly 100% detection accuracy. Disclaimer Don't use this model to do bad things. Prompts are sourced from freely available data on the internet.

license:mit
419
15

gemma-3-1b-it-ONNX-GQA

NaNK
400
16

gte-multilingual-reranker-base

373
6

Qwen3.5-2B-ONNX

NaNK
373
3

Llama-3.2-1B-Instruct-q4f16

NaNK
llama
368
14

kitten-tts-nano-0.1-ONNX

NaNK
license:apache-2.0
318
14

LFM2-24B-A2B-ONNX

NaNK
309
0

gte-multilingual-base

306
9

whisper-small_timestamped

305
2

LFM2-1.2B-ONNX

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency. We're releasing the weights of three post-trained checkpoints with 350M, 700M, and 1.2B parameters. They provide the following key features to create AI-powered edge applications: Fast training & inference – LFM2 achieves 3x faster training compared to its previous generation. It also benefits from 2x faster decode and prefill speed on CPU compared to Qwen3. Best performance – LFM2 outperforms similarly-sized models across multiple benchmark categories, including knowledge, mathematics, instruction following, and multilingual capabilities. New architecture – LFM2 is a new hybrid Liquid model with multiplicative gates and short convolutions. Flexible deployment – LFM2 runs efficiently on CPU, GPU, and NPU hardware for flexible deployment on smartphones, laptops, or vehicles. Due to their small size, we recommend fine-tuning LFM2 models on narrow use cases to maximize performance. They are particularly suited for agentic tasks, data extraction, RAG, creative writing, and multi-turn conversations. However, we do not recommend using them for tasks that are knowledge-intensive or require programming skills. | Property | Value | | ------------------- | ----------------------------- | | Parameters | 1,170,340,608 | | Layers | 16 (10 conv + 6 attn) | | Context length | 32,768 tokens | | Vocabulary size | 65,536 | | Precision | bfloat16 | | Training budget | 10 trillion tokens | | License | LFM Open License v1.0 | Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish. Generation parameters: We recommend the following parameters: `temperature=0.3` `minp=0.15` `repetitionpenalty=1.05` Architecture: Hybrid model with multiplicative gates and short convolutions: 10 double-gated short-range LIV convolution blocks and 6 grouped query attention (GQA) blocks. Pre-training mixture: Approximately 75% English, 20% multilingual, and 5% code data sourced from the web and licensed materials. Training approach: Knowledge distillation using LFM1-7B as teacher model Very large-scale SFT on 50% downstream tasks, 50% general domains Custom DPO with length normalization and semi-online datasets Iterative model merging If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
279
13

language_detection-ONNX

266
4

Qwen3-1.7B-ONNX

NaNK
239
4

Voxtral-Mini-3B-2507-ONNX

Voxtral Mini is an enhancement of Ministral 3B, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. This repository contains ONNX weights for the original model, mistralai/Voxtral-Mini-3B-2507. Voxtral builds upon Ministral-3B with powerful audio understanding capabilities. - Dedicated transcription mode: Voxtral can operate in a pure speech transcription mode to maximize performance. By default, Voxtral automatically predicts the source audio language and transcribes the text accordingly - Long-form context: With a 32k token context length, Voxtral handles audios up to 30 minutes for transcription, or 40 minutes for understanding - Built-in Q&A and summarization: Supports asking questions directly through audio. Analyze audio and generate structured summaries without the need for separate ASR and language models - Natively multilingual: Automatic language detection and state-of-the-art performance in the world’s most widely used languages (English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian) - Function-calling straight from voice: Enables direct triggering of backend functions, workflows, or API calls based on spoken user intents - Highly capable at text: Retains the text understanding capabilities of its language model backbone, Ministral-3B Average word error rate (WER) over the FLEURS, Mozilla Common Voice and Multilingual LibriSpeech benchmarks: - `temperature=0.2` and `topp=0.95` for chat completion (e.g. Audio Understanding) and `temperature=0.0` for transcription - Multiple audios per message and multiple user turns with audio are supported - System prompts are not yet supported If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
license:apache-2.0
237
24

gemma-3-1b-it-ONNX

NaNK
234
23

Falcon-H1-Tiny-90M-Instruct-ONNX

219
1

depth-anything-v2-base

license:cc-by-nc-4.0
217
0

yolov10x

license:agpl-3.0
202
6

Qwen3.5-4B-ONNX

NaNK
201
2

yolov10m

license:agpl-3.0
200
6

Llama-3.2-1B-Instruct

NaNK
llama
181
26

LFM2-700M-ONNX

172
4

ultravox-v0_5-llama-3_2-1b-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
base_model:fixie-ai/ultravox-v0_5-llama-3_2-1b
166
5

Supertonic-TTS-2-ONNX

NaNK
160
5

bge-small-en-v1.5-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to compute embeddings, as follows: You can also use the model for retrieval. For example: Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
156
0

deberta-v3-large-zeroshot-v2.0-c-ONNX

152
0

Qwen3-4B-ONNX

NaNK
143
4

twitter-roberta-base-sentiment-ONNX

137
1

Stella Large Zh V2 ONNX

This is an ONNX version of infgrad/stella-large-zh-v2. It was automatically converted and uploaded using this space.

NaNK
136
1

dinov2-with-registers-small-with-attentions

134
0

OuteTTS-0.2-500M

license:cc-by-nc-4.0
126
14

whisper-base-ONNX

This is an ONNX version of openai/whisper-base. It was automatically converted and uploaded using this space.

122
0

Qwen3-VL-8B-Instruct-ONNX

NaNK
license:apache-2.0
121
0

LightOnOCR-2-1B-ONNX

NaNK
license:apache-2.0
119
3

tiny-random-MarianMTModel

114
0

tiny-random-LlamaForCausalLM-ONNX

llama
113
0

harrier-oss-v1-270m-ONNX

license:mit
111
1

tiny-random-olmo-hf

110
0

tiny-random-jais

108
0

Qwen2.5-1.5B-Instruct

NaNK
103
5

Llama-3.2-1B

NaNK
llama
95
15

gliner_multi_pii-v1

NaNK
93
6

whisper-tiny.en_timestamped

93
1

granite-timeseries-patchtst

93
1

mediapipe_selfie_segmentation

NaNK
license:apache-2.0
92
5

grounding-dino-tiny-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Zero-shot object detection with `onnx-community/grounding-dino-tiny-ONNX` using the `pipeline` API. Example: Zero-shot object detection with `onnx-community/grounding-dino-tiny-ONNX` using the `AutoModel` API.

license:apache-2.0
92
5

Janus-1.3B-ONNX

NaNK
91
16

granite-timeseries-patchtsmixer

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Time series forecasting w/ `onnx-community/granite-timeseries-patchtsmixer` Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

89
0

ISNet-ONNX

license:agpl-3.0
88
2

gemma-3n-E2B-it-ONNX

NaNK
86
35

Llama-3.2-3B-Instruct-ONNX

NaNK
llama
86
12

bert-base-uncased-ONNX

license:apache-2.0
84
0

depth-anything-v2-large

license:cc-by-nc-4.0
83
6

LFM2-8B-A1B-ONNX

NaNK
83
1

tiny-random-MgpstrForSceneTextRecognition

license:apache-2.0
83
0

NuNER_Zero-span

81
2

Qwen2-VL-2B-Instruct

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: ONNX conversion script: First, install the following dependencies:

NaNK
license:apache-2.0
80
11

NVIDIA-Nemotron-3-Nano-4B-BF16-ONNX

NaNK
80
0

Llama-3.2-1B-Instruct-ONNX

NaNK
llama
79
27

metaclip-2-worldwide-huge-378-ONNX

MetaCLIP 2 (worldwide) was presented in MetaCLIP 2: A Worldwide Scaling Recipe. This checkpoint corresponds to ONNX implementation of the original implementation. First install the optimum-onnx library (from source for now):

license:cc-by-nc-4.0
79
0

Qwen2.5-Coder-0.5B-Instruct

NaNK
78
3

Voxtral-Mini-4B-Realtime-2602-ONNX

NaNK
license:apache-2.0
73
3

Qwen2.5-0.5B

NaNK
73
0

LFM2-VL-450M-ONNX

72
5

siglip2-base-patch16-224-ONNX

NaNK
70
2

SmolLM2-135M-Instruct-ONNX

llama
68
0

Qwen3-VL-4B-Instruct-ONNX

NaNK
license:apache-2.0
67
1

Llama-3.2-3B-Instruct-onnx-web

NaNK
llama
66
3

whisper-tiny_timestamped

66
1

modnet-webnn

license:apache-2.0
65
5

WavTokenizer-large-speech-75token_decode

license:mit
65
1

maskformer-resnet50-ade20k-full

65
0

yolov10n

license:agpl-3.0
64
6

dinov3-vitl16-pretrain-sat493m-ONNX

64
2

BiRefNet_lite-ONNX

license:mit
63
11

granite-4.0-h-1b-ONNX

NaNK
license:apache-2.0
63
0

TinySwallow-1.5B-Instruct-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to generate text like this: Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
62
1

functiongemma-270m-it-ONNX-GQA

62
0

BiRefNet-ONNX

license:mit
59
6

roberta-base-openai-detector-ONNX

58
0

Llama-3.2-3B-Instruct

NaNK
llama
57
12

parakeet-ctc-0.6b-ONNX

NaNK
license:cc-by-4.0
57
2

whisper-small.en

57
0

Trinity-Nano-Preview-ONNX

license:apache-2.0
56
0

distil-small.en

54
1

gliner_multi-v2.1

NaNK
52
5

Florence-2-large-ft

license:mit
50
7

age-gender-prediction-ONNX

license:apache-2.0
48
1

yolo26n-ONNX

48
0

piiranha-v1-detect-personal-information-ONNX

48
0

Qwen3-4B-Thinking-2507-ONNX

NaNK
license:apache-2.0
47
0

vitpose-base-simple

46
3

EdgeTAM-ONNX

license:apache-2.0
45
1

SmolLM-360M-ONNX

llama
45
0

ZR1-1.5B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
44
1

whisper-base.en_timestamped

44
0

Chatterbox ONNX

license:mit
43
11

Qwen2.5-Coder-3B-Instruct

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
43
7

whisper-medium.en_timestamped

license:mit
43
1

distilbert-base-multilingual-cased-ONNX

This is an ONNX version of distilbert/distilbert-base-multilingual-cased. It was automatically converted and uploaded using this space.

43
1

Florence-2-large

license:mit
40
14

Kokoro-82M-v1.1-zh-ONNX

license:apache-2.0
40
12

mobilenet_v2_1.0_224

NaNK
39
1

Qwen3-8B-ONNX

NaNK
license:apache-2.0
39
0

Jan-nano-ONNX

license:apache-2.0
37
2

gemma-2-2b-jpn-it

NaNK
37
1

dinov3-convnext-large-pretrain-lvd1689m-ONNX

36
0

LFM2-350M-ENJP-MT-ONNX

Based on the LFM2-350M model, this checkpoint has been fine-tuned for near real-time bi-directional Japanese/English translation of short-to-medium inputs. LFM2-350M-ENJP-MT delivers translation quality that is on par with models more than 10 times its size. Below are sample translations produced by the model. These examples are meant to give you a feel for its strengths and typical style in both directions (English ➡️ Japanese and Japanese ➡️ English). They include a mix of everyday text, technical descriptions, business communication, and news reporting, so you can gauge performance across different domains. These examples demonstrate the model's strength in product descriptions, technical passages, and formal explanations when translating into Japanese. Fully Tested and Works Properly. 6 Months Warranty included! Item pictured is the actual item for sale. See above for full description, condition, and comments. 「完全試験済みで正しく動作しています。保証期間は6ヶ月付属!」。 写真に写っている商品が販売されている実物です。 詳しく、状態、コメントは上記参照してください。 Emphasis on human-AI collaboration. Instead of focusing solely on making fully autonomous AI systems, we are excited to build multimodal systems that work with people collaboratively. 人とAIのコラボレーションに重点を置く。完全自律型AIシステムの構築にのみ焦点を当てるのではなく、人と協調して働くマルチモーダルシステムを構築できることに興奮しています。 If your equipment fails due to normal use, please contact our customer service department so that we can assist you, We will repair or replace your equipment at our discretion. In some situations, we may choose to refund the full purchase price of an item. ご使用中の機器が通常使用により故障した場合は、お手伝いできるよう弊社カスタマーサービス部門にご連絡ください。 弊社の判断で機器の修理または交換を行います。状況によっては、製品の購入価格全額を返金する場合があります。 2k USD to start for basic, 200 dollars for additional version. - 50% of full amount of deposit, - 3 proposals - end of month(3 drafts), will choose 1 and make final changes based on it - Present another final version in a week 基本版から始めるのに2,000ドル、追加バージョンでは200ドルの手数料が必要です。 - 保証金全額の50%が支払われる、 - 3つの案 - 月末(ドラフト3回分)、その案に基づいて1つを選んで最終的な変更を行う - さらに1週間後に別の最終版を提出すること Lifestyle risk factors with strong evidence include lack of exercise, cigarette smoking, alcohol, and obesity. The risk of colon cancer can be reduced by maintaining a normal body weight through a combination of sufficient exercise and eating a healthy diet. 強力な証拠がある生活習慣のリスク要因としては、運動不足、喫煙、飲酒、肥満などが挙げられ、十分な運動と健康的な食生活の組み合わせによる正常な体重維持を通じて、大腸がんの発症リスクを減らすことができる。 These examples demonstrate the model’s ability to preserve nuance in news reporting, colloquial phrasing, and business contexts when translating into English. モデルからの回答は英語でもOKなのですよね。 The answers from the models are okay in English, right? 手間のかかるメルマガ作成作業、もっとラクに、もっと速くできたら——。 そう考えたことはありませんか? Have you ever wondered if you could create a cumbersome email newsletter more easily and quickly? X JAPANのYOSHIKIが、アニメ『ダンダダン』でグループの代表曲をオマージュした劇中歌が使用されたことを指摘して始まった議論。 8月22日には『ダンダダン』サイドが公式Xで騒動を謝罪、YOSHIKIも『ダンダダン』サイドと和解を報告したが、これに物言いをつけたのが、弁護士の紀藤正樹氏だった。 The discussion began with the point that Yoshiki of X JAPAN mentioned that a song in the anime Dandadan paying homage to the group's signature tune was used as an insert song. On August 22nd, the Dandadan side apologized on their official X page for the controversy, and Yoshiki also reported a reconciliation with the Dandadan side, but lawyer Masaki Kitō objected. (ブルームバーグ): SOMPOホールディングスは27日夜、米国などを中心に展開する損害保険会社のアスペン・インシュアランス・ホールディングスを買収すると発表した。買収総額は約5200億円となる。 ニューヨーク証券取引所に上場しているアスペンの株式を1株当たり37.5ドル(約5600円)で全株を取得する。26日の終値を16%上回る水準。2026年上期中に買収手続きを完了する予定。 買収資金は手元資金を充てる。 SOMPOにとっては17年に米損保エンデュランス・スペシャルティ・ホールディングスを約6400億円で買収して以来の大型案件となる。 人口減少で国内市場の縮小が見込まれる中、買収によって海外保険ビジネスの規模や収益を拡大し、再保険取引による安定的な収益の寄与も見込む。 (Bloomberg): SOMPO Holdings announced on the evening of the 27th that it will acquire Aspen Insurance Holdings, a non-life insurance company operating primarily in the United States and elsewhere, for approximately ¥520 billion. The acquisition will involve the purchase of all shares of Aspen’s shares listed on the New York Stock Exchange for $37.5 per share (approximately ¥5,600). This surpasses the closing price of the day by 16% and is scheduled to be completed within the first half of 2026. Funds for the acquisition will be provided from the company’s own capital. For SOMPO, this is the largest acquisition since its 2017 acquisition of Endurance Specialty Holdings for approximately ¥640 billion. The acquisition is expected to expand the scale and revenue of its overseas insurance business amidst anticipated shrinking domestic markets due to population decline, and is also expected to contribute to stable revenue through reinsurance transactions. 28歳にしてつかんだイングランドサッカー界でのチャンスを生かせるか。 チャンピオンシップ(英2部)の古豪ブラックバーンに電撃移籍した森下龍矢は意気込んでいる。 サガン鳥栖と名古屋グランパスでプレーし、2024年から海を渡ってレギア・ワルシャワで奮闘してきた森下は先日、大橋祐紀のチームメイトとなることが決まった。 日本ではSBが主戦場だった森下だが、昨季はポーランドで攻撃的なポジションにコンバートされ、ウィングやトップ下に前線と様々な役割をこなした。 すると、公式戦で14得点、14アシストとブレイク。 この飛躍に注目したブラックバーンに引き抜かれている。 Can he capitalize on his chance in English football, which he seized at the age of 28? Ryuya Morishita, having made a shocking move to Blackburn Rovers, a long-established club in the Championship (British second tier), is eager to make an impression. Having played for Sagan Tosu and Nagoya Grampus, and having been striving with Legia Warsaw since 2024, Morishita recently announced he would become teammates with Yuki Ohashi. For Morishita, his primary playing field in Japan was as a full-back, but he was converted to an attacking position in Poland last season, playing in various roles including wing-back and attacking midfielder. He then broke through, scoring 14 goals and providing 14 assists in official matches. The Blackburn club has been scouting for this promising player. > [!NOTE] > 📝 While LFM2-350M-ENJP-MT delivers strong out-of-the-box general-purpose English ↔️ Japanese translation, our primary > goal is to provide a versatile, community-empowering base model—a foundation designed to make it easy to build > best-in-class, task-specific translation systems. > > Like any base model, there are open areas for growth—in particular with extreme context lengths and specialized or > context-sensitive translations, such as: > - Technical & professional language (medical, legal, engineering) > - Novel proper nouns (new products, brands, cultural references) > - Industry-, domain-, or company-specific nuance (e-commerce, finance, internal corporate terminology) > > These are precisely the kinds of challenges that fine-tuning—by both Liquid AI and our developer community—can > address. We see this model not just as an endpoint, but as a catalyst for a rich ecosystem of fine-tuned translation > models tailored to real-world needs. Generation parameters: We strongly recommend using greedy decoding with a `temperature=0`. System prompts: LFM2-ENJP-MT requires one of the two following system prompts: "Translate to Japanese." for English to Japanese translation. "Translate to English." for Japanese to English translation. > [!WARNING] > ⚠️ The model cannot work as intended without one of these two system prompts. Chat template: LFM2-ENJP-MT uses a ChatML-like chat template as follows: You can automatically apply it using the dedicated `.applychattemplate()` function from Hugging Face transformers. > [!WARNING] > ⚠️ The model is intended for single turn conversations. - Huggingface: LFM2-350M - llama.cpp: LFM2-350M-ENJP-MT-GGUF - LEAP: LEAP model library If you are interested in custom solutions with edge deployment, please contact our sales team. LFM2-350Mモデルをベースに、本チェックポイントは短文から中程度の入力に対する 日本語/英語の双方向リアルタイム翻訳 用にファインチューニングされています。 以下は本モデルが生成した翻訳例です。英語➡️日本語、日本語➡️英語の両方向における強みと典型的なスタイルを示しています。 Fully Tested and Works Properly. 6 Months Warranty included! Item pictured is the actual item for sale. See above for full description, condition, and comments. 「完全試験済みで正しく動作しています。保証期間は6ヶ月付属!」。 写真に写っている商品が販売されている実物です。 詳しく、状態、コメントは上記参照してください。 Emphasis on human-AI collaboration. Instead of focusing solely on making fully autonomous AI systems, we are excited to build multimodal systems that work with people collaboratively. 人とAIのコラボレーションに重点を置く。完全自律型AIシステムの構築にのみ焦点を当てるのではなく、人と協調して働くマルチモーダルシステムを構築できることに興奮しています。 If your equipment fails due to normal use, please contact our customer service department so that we can assist you, We will repair or replace your equipment at our discretion. In some situations, we may choose to refund the full purchase price of an item. ご使用中の機器が通常使用により故障した場合は、お手伝いできるよう弊社カスタマーサービス部門にご連絡ください。 弊社の判断で機器の修理または交換を行います。状況によっては、製品の購入価格全額を返金する場合があります。 2k USD to start for basic, 200 dollars for additional version. - 50% of full amount of deposit, - 3 proposals - end of month(3 drafts), will choose 1 and make final changes based on it - Present another final version in a week 基本版から始めるのに2,000ドル、追加バージョンでは200ドルの手数料が必要です。 - 保証金全額の50%が支払われる、 - 3つの案 - 月末(ドラフト3回分)、その案に基づいて1つを選んで最終的な変更を行う - さらに1週間後に別の最終版を提出すること Lifestyle risk factors with strong evidence include lack of exercise, cigarette smoking, alcohol, and obesity. The risk of colon cancer can be reduced by maintaining a normal body weight through a combination of sufficient exercise and eating a healthy diet. 強力な証拠がある生活習慣のリスク要因としては、運動不足、喫煙、飲酒、肥満などが挙げられ、十分な運動と健康的な食生活の組み合わせによる正常な体重維持を通じて、大腸がんの発症リスクを減らすことができる。 これらの例は、ニュース記事のニュアンス、口語表現、ビジネス文脈を保ちながら英語に翻訳できるモデルの能力を示しています。 モデルからの回答は英語でもOKなのですよね。 The answers from the models are okay in English, right? 手間のかかるメルマガ作成作業、もっとラクに、もっと速くできたら——。 そう考えたことはありませんか? Have you ever wondered if you could create a cumbersome email newsletter more easily and quickly? X JAPANのYOSHIKIが、アニメ『ダンダダン』でグループの代表曲をオマージュした劇中歌が使用されたことを指摘して始まった議論。 8月22日には『ダンダダン』サイドが公式Xで騒動を謝罪、YOSHIKIも『ダンダダン』サイドと和解を報告したが、これに物言いをつけたのが、弁護士の紀藤正樹氏だった。 The discussion began with the point that Yoshiki of X JAPAN mentioned that a song in the anime Dandadan paying homage to the group's signature tune was used as an insert song. On August 22nd, the Dandadan side apologized on their official X page for the controversy, and Yoshiki also reported a reconciliation with the Dandadan side, but lawyer Masaki Kitō objected. (ブルームバーグ): SOMPOホールディングスは27日夜、米国などを中心に展開する損害保険会社のアスペン・インシュアランス・ホールディングスを買収すると発表した。買収総額は約5200億円となる。 ニューヨーク証券取引所に上場しているアスペンの株式を1株当たり37.5ドル(約5600円)で全株を取得する。26日の終値を16%上回る水準。2026年上期中に買収手続きを完了する予定。 買収資金は手元資金を充てる。 SOMPOにとっては17年に米損保エンデュランス・スペシャルティ・ホールディングスを約6400億円で買収して以来の大型案件となる。 人口減少で国内市場の縮小が見込まれる中、買収によって海外保険ビジネスの規模や収益を拡大し、再保険取引による安定的な収益の寄与も見込む。 (Bloomberg): SOMPO Holdings announced on the evening of the 27th that it will acquire Aspen Insurance Holdings, a non-life insurance company operating primarily in the United States and elsewhere, for approximately ¥520 billion. The acquisition will involve the purchase of all shares of Aspen’s shares listed on the New York Stock Exchange for $37.5 per share (approximately ¥5,600). This surpasses the closing price of the day by 16% and is scheduled to be completed within the first half of 2026. Funds for the acquisition will be provided from the company’s own capital. For SOMPO, this is the largest acquisition since its 2017 acquisition of Endurance Specialty Holdings for approximately ¥640 billion. The acquisition is expected to expand the scale and revenue of its overseas insurance business amidst anticipated shrinking domestic markets due to population decline, and is also expected to contribute to stable revenue through reinsurance transactions. 28歳にしてつかんだイングランドサッカー界でのチャンスを生かせるか。 チャンピオンシップ(英2部)の古豪ブラックバーンに電撃移籍した森下龍矢は意気込んでいる。 サガン鳥栖と名古屋グランパスでプレーし、2024年から海を渡ってレギア・ワルシャワで奮闘してきた森下は先日、大橋祐紀のチームメイトとなることが決まった。 日本ではSBが主戦場だった森下だが、昨季はポーランドで攻撃的なポジションにコンバートされ、ウィングやトップ下に前線と様々な役割をこなした。 すると、公式戦で14得点、14アシストとブレイク。 この飛躍に注目したブラックバーンに引き抜かれている。 Can he capitalize on his chance in English football, which he seized at the age of 28? Ryuya Morishita, having made a shocking move to Blackburn Rovers, a long-established club in the Championship (British second tier), is eager to make an impression. Having played for Sagan Tosu and Nagoya Grampus, and having been striving with Legia Warsaw since 2024, Morishita recently announced he would become teammates with Yuki Ohashi. For Morishita, his primary playing field in Japan was as a full-back, but he was converted to an attacking position in Poland last season, playing in various roles including wing-back and attacking midfielder. He then broke through, scoring 14 goals and providing 14 assists in official matches. The Blackburn club has been scouting for this promising player. > [!NOTE] > 📝 LFM2-350M-ENJP-MTは汎用的な英日翻訳において高い性能を発揮しますが、我々の主な目標は、コミュニティに力を与える柔軟な基盤モデルを提供することです。 > これは一流のタスク特化型翻訳システムを容易に構築できるよう設計された基盤です。 > > すべての基盤モデルと同様に、成長の余地があります。特に以下のような場面です: > - 極端に長い文脈や専門的/文脈依存の翻訳 > - 専門分野の言語(医療、法律、工学) > - 新しい固有名詞(新製品、ブランド、文化的参照) > - 業界・分野・企業特有のニュアンス(EC、金融、社内用語) > > これらはLiquid AIおよび開発者コミュニティによるファインチューニングで解決可能な課題です。 > 本モデルを最終到達点ではなく、実世界に即した多様な翻訳モデル群を生み出す触媒として位置付けています。 生成パラメータ: `temperature=0`を指定した貪欲デコード(greedy decoding)の使用を強く推奨します。 システムプロンプト: LFM2-ENJP-MTは以下のいずれかのシステムプロンプトを必須とします: 英語 → 日本語翻訳: `"Translate to Japanese."` 日本語 → 英語翻訳: `"Translate to English."` > [!WARNING] > ⚠️ これらのシステムプロンプトがなければモデルは意図通りに動作しません。 チャットテンプレート: LFM2-ENJP-MTは以下のようなChatML風テンプレートを使用します: Hugging FaceのTransformersの専用関数`.applychattemplate()`を使えば自動で適用できます。 - Hugging Face: LFM2-350M - llama.cpp: LFM2-350M-ENJP-MT-GGUF - LEAP: LEAP モデルライブラリ エッジ環境への導入を含むカスタムソリューションにご興味がある方は、営業チームまでお問い合わせください。

34
2

csgo-weapon-classification-ONNX

33
0

Phi-4-mini-instruct-ONNX-GQA

32
5

dinov3-vith16plus-pretrain-lvd1689m-ONNX

32
0

granite-embedding-30m-english-ONNX

NaNK
license:apache-2.0
32
0

C2S-Pythia-410m-cell-type-prediction-ONNX

31
0

Qwen2.5-1.5B

NaNK
30
4

Qwen3-14B-ONNX

NaNK
license:apache-2.0
30
0

phishing-email-detection-distilbert_v2.4.1-ONNX

NaNK
30
0

dinov3-vits16plus-pretrain-lvd1689m-ONNX

30
0

yolo26s-ONNX

28
0

dpt-dinov2-small-kitti

27
69

whisper-large-v3-ONNX

This is an ONNX version of openai/whisper-large-v3. It was automatically converted and uploaded using this space.

NaNK
27
5

Falcon-H1-Tiny-Multilingual-100M-Instruct-ONNX

27
0

e5-small-lora-ai-generated-detector-ONNX

27
0

Qwen2.5-Coder-1.5B-Instruct

NaNK
license:apache-2.0
26
4

yolo26m-ONNX

26
0

lite-whisper-large-v3-turbo-ONNX

license:apache-2.0
25
3

harrier-oss-v1-0.6b-ONNX

NaNK
license:mit
25
2

Phi-3.5-vision-instruct

25
2

dinov3-convnext-tiny-pretrain-lvd1689m-ONNX

25
0

Deep-Fake-Detector-v2-Model-ONNX

NaNK
license:apache-2.0
25
0

DepthPro-ONNX

24
14

dinov3-vitb16-pretrain-lvd1689m-ONNX

24
2

SmolLM2-360M-Instruct-ONNX

llama
24
1

siglip2-large-patch16-256-ONNX

NaNK
24
0

whisper-small-cv11-french-ONNX

24
0

dinov3-vitl16-pretrain-lvd1689m-ONNX

23
1

maskformer-resnet101-ade20k-full

23
0

metric3d-vit-small

22
2

whisper-large-v3-turbo-korean-ggml-ONNX

This is an ONNX version of royshilkrot/whisper-large-v3-turbo-korean-ggml. It was automatically converted and uploaded using this space.

22
0

granite-4.0-h-350m-ONNX

license:apache-2.0
21
1

yolo26l-ONNX

21
0

layoutlmv3-large-finetuned-funsd-ONNX

21
0

Florence-2-base

license:mit
20
12

Llama-3.2-1B-Instruct-onnx-web-gqa

NaNK
llama
20
3

Qwen2.5-0.5B-Instruct-ONNX

NaNK
20
2

TinyLlama-1.1B-Chat-v1.0-ONNX

NaNK
llama
20
2

yolo26x-ONNX

20
1

Qwen2-0.5B-Instruct-ONNX

NaNK
license:apache-2.0
20
1

granite-4.0-h-micro-ONNX

license:apache-2.0
20
0

opus-mt-en-fr

license:cc-by-4.0
20
0

MobileLLM-R1-140M-ONNX

llama4_text
20
0

yolov10s

license:agpl-3.0
19
8

EXAONE-3.5-2.4B-Instruct

NaNK
19
3

LFM2-2.6B-ONNX

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency. We're releasing the weights of four post-trained checkpoints with 350M, 700M, 1.2B, and 2.6 parameters. They provide the following key features to create AI-powered edge applications: Fast training & inference – LFM2 achieves 3x faster training compared to its previous generation. It also benefits from 2x faster decode and prefill speed on CPU compared to Qwen3. Best performance – LFM2 outperforms similarly-sized models across multiple benchmark categories, including knowledge, mathematics, instruction following, and multilingual capabilities. New architecture – LFM2 is a new hybrid Liquid model with multiplicative gates and short convolutions. Flexible deployment – LFM2 runs efficiently on CPU, GPU, and NPU hardware for flexible deployment on smartphones, laptops, or vehicles. Due to their small size, we recommend fine-tuning LFM2 models on narrow use cases to maximize performance. They are particularly suited for agentic tasks, data extraction, RAG, creative writing, and multi-turn conversations. However, we do not recommend using them for tasks that are knowledge-intensive or require programming skills. | Property | LFM2-350M | LFM2-700M | LFM2-1.2B | LFM2-2.6B | | ------------------- | ----------------------------- | ----------------------------- | ----------------------------- | ----------------------------- | | Parameters | 354,483,968 | 742,489,344 | 1,170,340,608 | 2,569,272,320 | | Layers | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 30 (22 conv + 8 attn) | | Context length | 32,768 tokens | 32,768 tokens | 32,768 tokens | 32,768 tokens | | Vocabulary size | 65,536 | 65,536 | 65,536 | 65,536 | | Precision | bfloat16 | bfloat16 | bfloat16 | bfloat16 | | Training budget | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens | | License | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish. Generation parameters: We recommend the following parameters: `temperature=0.3` `minp=0.15` `repetitionpenalty=1.05` Reasoning: LFM2-2.6B is the only model in this family to use dynamic hybrid reasoning (traces between ` ` and ` ` tokens) for complex or multilingual prompts. Chat template: LFM2 uses a ChatML-like chat template as follows: You can automatically apply it using the dedicated `.applychattemplate()` function from Hugging Face transformers. Tool use: It consists of four main steps: 1. Function definition: LFM2 takes JSON function definitions as input (JSON objects between ` ` and ` ` special tokens), usually in the system prompt 2. Function call: LFM2 writes Pythonic function calls (a Python list between ` ` and ` ` special tokens), as the assistant answer. 3. Function execution: The function call is executed and the result is returned (string between ` ` and ` ` special tokens), as a "tool" role. 4. Final answer: LFM2 interprets the outcome of the function call to address the original user prompt in plain text. Here is a simple example of a conversation using tool use: Architecture: Hybrid model with multiplicative gates and short convolutions: 10 double-gated short-range LIV convolution blocks and 6 grouped query attention (GQA) blocks. Pre-training mixture: Approximately 75% English, 20% multilingual, and 5% code data sourced from the web and licensed materials. Training approach: Very large-scale SFT on 50% downstream tasks, 50% general domains Custom DPO with length normalization and semi-online datasets Iterative model merging If you haven't already, you can install the Transformers.js JavaScript library from NPM using: LFM2 outperforms similar-sized models across different evaluation categories. We only report scores using instruct variants and non-thinking modes for consistency. | Model | MMLU | GPQA | IFEval | IFBench | GSM8K | MGSM | MMMLU | | ---------------------- | ----- | ----- | ------ | ------- | ----- | ----- | ----- | | LFM2-2.6B | 64.42 | 26.57 | 79.56 | 22.19 | 82.41 | 74.32 | 55.39 | | Llama-3.2-3B-Instruct | 60.35 | 30.6 | 71.43 | 20.78 | 75.21 | 61.68 | 47.92 | | SmolLM3-3B | 59.84 | 26.31 | 72.44 | 17.93 | 81.12 | 68.72 | 50.02 | | gemma-3-4b-it | 58.35 | 29.51 | 76.85 | 23.53 | 89.92 | 87.28 | 50.14 | | Qwen3-4B-Instruct-2507 | 72.25 | 34.85 | 85.62 | 30.28 | 68.46 | 81.76 | 60.67 | If you are interested in custom solutions with edge deployment, please contact our sales team.

NaNK
19
3

LFM2-1.2B-Tool-ONNX

NaNK
19
3

whisper-medium_timestamped

license:mit
19
1

owlv2-base-patch16-finetuned-ONNX

This is an ONNX version of google/owlv2-base-patch16-finetuned. It was automatically converted and uploaded using this space.

19
0

lite-whisper-large-v3-turbo-fast-ONNX

license:apache-2.0
18
5

nanoLLaVA-1.5

NaNK
license:apache-2.0
18
4

metric3d-vit-giant2

18
3

paligemma2-3b-pt-224

NaNK
18
2

Falcon3-1B-Instruct

NaNK
llama
18
2

flan-t5-small-ONNX

18
0

nsfw-classifier-ONNX

This is an ONNX version of giacomoarienti/nsfw-classifier. It was automatically converted and uploaded using this space.

18
0

Minueza-2-96M-Instruct-Variant-10-ONNX

NaNK
llama
18
0

DialoGPT-small-ONNX

NaNK
license:mit
18
0

metric3d-vit-large

17
2

gpt2-ONNX

17
1

Phi-4-mini-instruct-ONNX-MHA

17
1

ast-finetuned-audioset-10-10-0.4593-ONNX

This is an ONNX version of MIT/ast-finetuned-audioset-10-10-0.4593. It was automatically converted and uploaded using this space.

NaNK
base_model:MIT/ast-finetuned-audioset-10-10-0.4593
17
1

llama-ai4privacy-multilingual-categorical-anonymiser-openpii-ONNX

base_model:ai4privacy/llama-ai4privacy-multilingual-categorical-anonymiser-openpii
17
1

MVANet-ONNX

license:mit
17
0

dinov3-convnext-small-pretrain-lvd1689m-ONNX

17
0

vaultgemma-1b-ONNX

[VaultGemma Technical Report][tech-report] [Responsible Generative AI Toolkit][rai-toolkit] [VaultGemma on Kaggle][kaggle-gemma] Summary description and brief definition of inputs and outputs. VaultGemma is a variant of the Gemma family of lightweight, state-of-the-art open models from Google. It is pre-trained from the ground up using Differential Privacy (DP). This provides strong, mathematically-backed privacy guarantees for its training data, limiting the extent to which the model's outputs can reveal information about any single training example. VaultGemma uses a similar architecture as Gemma 2. VaultGemma is a pretrained model that can be instruction tuned for a variety of language understanding and generation tasks. Its relatively small size (< 1B parameters) makes it possible to deploy in environments with limited resources, democratizing access to state-of-the-art AI models that are built with privacy at their core. - Input: - Text string, such as a question, a prompt, or a document to be summarized. - Total input context of 1K (1,024) tokens. - Output: - Generated text in response to the input, such as an answer to a question or a summary or categorization. Data used for model training and how the data was processed. The model was trained from scratch with differential privacy on a large-scale dataset of English-language text data from a variety of sources, including: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning and symbolic representation to address mathematical queries. The defining feature of this model is that the entire pre-training process was conducted using Differentially Private Stochastic Gradient Descent (DP-SGD) with a privacy budget of ε≤2.0, δ≤1.1e-10. DP-SGD provides a formal guarantee that the model's core knowledge base is itself private with respect to the individual examples in the training set. In addition to the inherent privacy protections of differential privacy, the following data cleaning and filtering methods used with Gemma 2 were applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. VaultGemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware TPUv6e. Training large language models with the significant computational overhead of differential privacy requires specialized hardware. TPUs are designed to handle the massive computations involved, offering the performance, memory, and scalability necessary to train models like VaultGemma efficiently and sustainably. Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. The core of the training implementation relied on specialized algorithms for privacy-preserving machine learning at scale: - [Differentially Private Stochastic Gradient Descent (DP-SGD)][dp-sgd]: The optimization algorithm used to train the model while providing formal privacy guarantees. - [Truncated Poisson Subsampling][poisson-subsampling]: A computationally efficient method used to enable large-scale DP training with fixed-size batches, which is critical for performance on modern accelerators. - [DP Scaling Laws][dp-scaling-laws]: The training configuration (model size, batch size, iterations) was determined by a novel set of scaling laws developed specifically for differentially private training, ensuring the optimal use of the compute and privacy budgets. The model was evaluated on a range of standard academic benchmarks. As expected, there is a utility trade-off for the strong privacy guarantees offered by the model. The table below shows the performance of the 1B pre-trained (PT) VaultGemma model. | Benchmark | n-shot | VaultGemma 1B PT | | :----------------------- | :-----------: | -------------------: | | [HellaSwag][hellaswag] | 10-shot | 39.09 | | [BoolQ][boolq] | 0-shot | 62.04 | | [PIQA][piqa] | 0-shot | 68.00 | | [SocialIQA][socialiqa] | 0-shot | 46.16 | | [TriviaQA][triviaqa] | 5-shot | 11.24 | | [ARC-c][arc] | 25-shot | 26.45 | | [ARC-e][arc] | 0-shot | 51.78 | We also conducted empirical tests to measure the model's "memorization rate"—its tendency to reproduce sequences from its training data. We followed the established methodology in the [Gemma 3 technical report][g3-tech-report]. The model was prompted with 50-token prefixes extracted from the training corpus to determine if it would generate the corresponding 50-token suffixes. The evaluation specifically tested for: - Exact Memorization: Verbatim reproduction of the suffix. - Approximate Memorization: Reproduction of the suffix with up to a 10% error rate. VaultGemma exhibited no detectable memorization (neither exact nor approximate) in these tests. This empirical finding strongly validates the effectiveness of the Differentially Private Stochastic Gradient Descent (DP-SGD) pre-training process in preventing the retention of individual training examples. We use the same data mixture as Gemma 2, and utilize differential privacy during the training process to ensure the model's parameters do not memorize individual training examples, providing a formal privacy guarantee for the training data. Further we are only providing a pre-trained model. These models have certain limitations that users should be aware of. VaultGemma is intended for a wide range of natural language processing (NLP) applications. The purpose of this list is to provide contextual information about possible use cases that the model creators considered. - Privacy-Preserving NLP Research: Serve as a strong baseline for researchers to experiment with privacy-preserving techniques, develop new algorithms, and fine-tune models on sensitive data. - Applications with Sensitive Data: Can be fine-tuned on private or sensitive datasets (e.g., in healthcare, finance) where it is critical that the base model itself does not carry risks from public pre-training data. - Content Creation and Communication: Generate creative text, power chatbots, and summarize documents in scenarios where data privacy is a primary concern. - Utility Gap for Privacy: There is an inherent trade-off between the strength of the privacy guarantee and model utility. As shown in the evaluation benchmarks, VaultGemma may underperform compared to non-private models of a similar size. - Training Data: The quality and diversity of the training data influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - Factual Accuracy: The model generates responses based on patterns from its training data but is not a knowledge base. It may generate incorrect or outdated factual statements. - Language Nuance: The model may struggle to grasp subtle nuances, sarcasm, or figurative language. The development of language models raises several ethical concerns. In creating this open model, we have carefully considered the following: - Bias and Fairness: Models trained on large-scale data can reflect socio-cultural biases from the training material. - Misinformation and Misuse: Models can be misused to generate text that is false, misleading, or harmful. Guidelines are provided for responsible use in the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: This model card summarizes details on the model's architecture, capabilities, limitations, and evaluation processes - Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - Privacy violations: Models were trained on data filtered for removal of certain personal information and other sensitive data. Further, we use differential privacy during pre-training, with ε≤2.0, δ≤1.1e-10. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. At the time of release to the best of our knowledge, this model is the largest and highest-performing open language model pretrained from the ground up with formal differential privacy. Its primary benefit is providing strong, mathematically-backed privacy guarantees for its training data, making it uniquely suited for applications and research where training data privacy is a critical concern. [model-page]: # "Link to VaultGemma Model Page" [tech-report]: https://services.google.com/fh/files/blogs/vaultgemmatechreport.pdf [rai-toolkit]: https://ai.google.dev/responsible [kaggle-gemma]: https://www.kaggle.com/models/google/vaultgemma [terms]: https://ai.google.dev/gemma/terms [safety-policies]: https://ai.google/static/documents/ai-responsibility-update-published-february-2025.pdf [prohibited-use]: https://ai.google.dev/gemma/prohibitedusepolicy [tpu]: https://cloud.google.com/tpu/docs/intro-to-tpu [jax]: https://github.com/jax-ml/jax [ml-pathways]: https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/ [dp-sgd]: https://arxiv.org/abs/1607.00133 [poisson-subsampling]: https://arxiv.org/abs/2411.04205 [dp-scaling-laws]: https://arxiv.org/pdf/2501.18914 [g3-tech-report]: https://arxiv.org/pdf/2503.19786 [hellaswag]: https://arxiv.org/abs/1905.07830 [boolq]: https://arxiv.org/abs/1905.10044 [piqa]: https://arxiv.org/abs/1911.11641 [socialiqa]: https://arxiv.org/abs/1904.09728 [triviaqa]: https://arxiv.org/abs/1705.03551 [arc]: https://arxiv.org/abs/1911.01547

NaNK
17
0

pythia-14m-ONNX

This is an ONNX version of EleutherAI/pythia-14m. It was automatically converted and uploaded using this space.

17
0

yolov10l

license:agpl-3.0
16
1

text_summarization-ONNX

This is an ONNX version of Falconsai/textsummarization. It was automatically converted and uploaded using this Hugging Face Space. See the pipeline documentation for `summarization`: https://huggingface.co/docs/transformers.js/api/pipelines#modulepipelines.SummarizationPipeline Model Card: Fine-Tuned T5 Small for Text Summarization The Fine-Tuned T5 Small is a variant of the T5 transformer model, designed for the task of text summarization. It is adapted and fine-tuned to generate concise and coherent summaries of input text. The model, named "t5-small," is pre-trained on a diverse corpus of text data, enabling it to capture essential information and generate meaningful summaries. Fine-tuning is conducted with careful attention to hyperparameter settings, including batch size and learning rate, to ensure optimal performance for text summarization. During the fine-tuning process, a batch size of 8 is chosen for efficient computation and learning. Additionally, a learning rate of 2e-5 is selected to balance convergence speed and model optimization. This approach guarantees not only rapid learning but also continuous refinement during training. The fine-tuning dataset consists of a variety of documents and their corresponding human-generated summaries. This diverse dataset allows the model to learn the art of creating summaries that capture the most important information while maintaining coherence and fluency. The goal of this meticulous training process is to equip the model with the ability to generate high-quality text summaries, making it valuable for a wide range of applications involving document summarization and content condensation. Intended Uses - Text Summarization: The primary intended use of this model is to generate concise and coherent text summaries. It is well-suited for applications that involve summarizing lengthy documents, news articles, and textual content. How to Use To use this model for text summarization, you can follow these steps: Limitations Specialized Task Fine-Tuning: While the model excels at text summarization, its performance may vary when applied to other natural language processing tasks. Users interested in employing this model for different tasks should explore fine-tuned versions available in the model hub for optimal results. Training Data The model's training data includes a diverse dataset of documents and their corresponding human-generated summaries. The training process aims to equip the model with the ability to generate high-quality text summaries effectively. Training Stats - Evaluation Loss: 0.012345678901234567 - Evaluation Rouge Score: 0.95 (F1) - Evaluation Runtime: 2.3456 - Evaluation Samples per Second: 1234.56 - Evaluation Steps per Second: 45.678 Responsible Usage It is essential to use this model responsibly and ethically, adhering to content guidelines and applicable regulations when implementing it in real-world applications, particularly those involving potentially sensitive content. References Hugging Face Model Hub T5 Paper Disclaimer: The model's performance may be influenced by the quality and representativeness of the data it was fine-tuned on. Users are encouraged to assess the model's suitability for their specific applications and datasets.

license:apache-2.0
16
0

ijepa_vith14_1k

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Image feature extraction with `onnx-community/ijepavith141k`. Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

16
0

Qwen2.5-Coder-0.5B-ONNX

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
16
0

whisper-small-ONNX

16
0

distilbert_finetuned_ai4privacy_v2-ONNX

This is an ONNX version of Isotonic/distilbertfinetunedai4privacyv2. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
16
0

OpenReasoning-Nemotron-1.5B-ONNX

NaNK
license:cc-by-4.0
16
0

whisper-tiny-ar-ONNX

This is an ONNX version of saralameri/whisper-tiny-ar. It was automatically converted and uploaded using this space.

16
0

granite-embedding-small-english-r2-ONNX

Model Summary: Granite-embedding-small-english-r2 is a 47M parameter dense biencoder embedding model from the Granite Embeddings collection that can be used to generate high quality text embeddings. This model produces embedding vectors of size 384 based on context length of upto 8192 tokens. Compared to most other open-source models, this model was only trained using open-source relevance-pair datasets with permissive, enterprise-friendly license, plus IBM collected and generated datasets. The r2 models show strong performance across standard and IBM-built information retrieval benchmarks (BEIR, ClapNQ), code retrieval (COIR), long-document search benchmarks (MLDR, LongEmbed), conversational multi-turn (MTRAG), table retrieval (NQTables, OTT-QA, AIT-QA, MultiHierTT, OpenWikiTables), and on many enterprise use cases. These models use a bi-encoder architecture to generate high-quality embeddings from text inputs such as queries, passages, and documents, enabling seamless comparison through cosine similarity. Built using retrieval oriented pretraining, contrastive finetuning, knowledge distillation, and model merging, granite-embedding-small-english-r2 is optimized to ensure strong alignment between query and passage embeddings. The latest granite embedding r2 release introduces two English embedding models, both based on the ModernBERT architecture: - granite-embedding-english-r2 (149M parameters): with an output embedding size of 768, replacing granite-embedding-125m-english. - granite-embedding-small-english-r2 (47M parameters): A first-of-its-kind reduced-size model, with 8192 context length support, fewer layers and a smaller output embedding size (384), replacing granite-embedding-30m-english. - Developed by: Granite Embedding Team, IBM - Repository: ibm-granite/granite-embedding-models - Paper: Granite Embedding R2 Models - Language(s): English - Release Date: Aug 15, 2025 - License: Apache 2.0 Intended Use: The model is designed to produce fixed length vector representations for a given text, which can be used for text similarity, retrieval, and search applications. This is a simple example of how to use the granite-embedding-small-english-r2 model with the Transformers.js library. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Evaluation Results Granite embedding r2 models show a strong performance across tasks diverse tasks. Performance of the granite models on MTEB Retrieval (i.e., BEIR), MTEB-v2, code retrieval (CoIR), long-document search benchmarks (MLDR, LongEmbed), conversational multi-turn (MTRAG), table retrieval (NQTables, OTT-QA, AIT-QA, MultiHierTT, OpenWikiTables), benchmarks is reported in the below tables. The average speed to encode documents on a single H100 GPU using a sliding window with 512 context length chunks is also reported. Nearing encoding speed of 200 documents per second granite-embedding-small-english-r2 demonstrates speed and efficiency, while mainintaining competitive performance. | Model | Parameters (M) | Embedding Size | BEIR Retrieval (15) | MTEB-v2 (41)| CoIR (10) | MLDR (En) | MTRAG (4) | Encoding Speed (dosc/sec) | |------------------------------------|:--------------:|:--------------:|:-------------------:|:-----------:|:---------:|:---------:|:---------:|:-------------------------------:| | granite-embedding-125m-english | 125 | 768 | 52.3 | 62.1 | 50.3 | 35.0 | 49.4 | 149 | | granite-embedding-30m-english | 30 | 384 | 49.1 | 60.2 | 47.0 | 32.6 | 48.6 | 198 | | granite-embedding-english-r2 | 149 | 768 | 53.1 | 62.8 | 55.3 | 40.7 | 56.7 | 144 | | granite-embedding-small-english-r2 | 47 | 384 | 50.9 | 61.1 | 53.8 | 39.8 | 48.1 | 199 | |Model | Parameters (M)| Embedding Size|AVERAGE|MTEB-v2 Retrieval (10)| CoIR (10)| MLDR (En)| LongEmbed (6)| Table IR (5)| MTRAG (4) | Encoding Speed (docs/sec)| |-----------------------------------|:-------------:|:-------------:|:---------:|:--------------------:|:--------:|:--------:|:------------:|:-----------:|:--------:|-----------:| |e5-small-v2 |33|384|45.39|48.5|47.1|29.9|40.7|72.31|33.8| 138| |bge-small-en-v1.5 |33|384|45.22|53.9|45.8|31.4|32.1|69.91|38.2| 138| ||||||||||| |granite-embedding-english-r2 |149|768|59.5|56.4|54.8|41.6|67.8|78.53|57.6| 144| |granite-embedding-small-english-r2 | 47|384|55.6|53.9|53.4|40.1|61.9|75.51|48.9| 199| The latest granite embedding r2 release introduces two English embedding models, both based on the ModernBERT architecture: - granite-embedding-english-r2 (149M parameters): with an output embedding size of 768, replacing granite-embedding-125m-english. - granite-embedding-small-english-r2 (47M parameters): A first-of-its-kind reduced-size model, with fewer layers and a smaller output embedding size (384), replacing granite-embedding-30m-english. The following table shows the structure of the two models: | Model | granite-embedding-small-english-r2 | granite-embedding-english-r2 | | :--------- | :-------:|:--------:| | Embedding size | 384 | 768 | | Number of layers | 12 | 22 | | Number of attention heads | 12 | 12 | | Intermediate size | 1536 | 1152 | | Activation Function | GeGLU | GeGLU | | Vocabulary Size | 50368 | 50368 | | Max. Sequence Length | 8192 | 8192 | | # Parameters | 47M | 149M | The granite embedding r2 models incorporate key enhancements from the ModernBERT architecture, including: - Alternating attention lengths to accelerate processing - Rotary position embeddings for extended sequence length - A newly trained tokenizer optimized with code and text data - Flash Attention 2.0 for improved efficiency - Streamlined parameters, eliminating unnecessary bias terms Data Collection Granite embedding r2 models are trained using data from four key sources: 1. Unsupervised title-body paired data scraped from the web 2. Publicly available paired with permissive, enterprise-friendly license 3. IBM-internal paired data targetting specific technical domains 4. IBM-generated synthetic data Notably, we do not use the popular MS-MARCO retrieval dataset in our training corpus due to its non-commercial license (many open-source models use this dataset due to its high quality). The underlying encoder models using GneissWeb, an IBM-curated dataset composed exclusively of open, commercial-friendly sources. For governance, all our data undergoes a data clearance process subject to technical, business, and governance review. This comprehensive process captures critical information about the data, including but not limited to their content description ownership, intended use, data classification, licensing information, usage restrictions, how the data will be acquired, as well as an assessment of sensitive information (i.e, personal information). Infrastructure We trained the granite embedding english r2 models using IBM's computing cluster, BlueVela Cluster, which is outfitted with NVIDIA H100 80GB GPUs. This cluster provides a scalable and efficient infrastructure for training our models over multiple GPUs. Ethical Considerations and Limitations Granite-embedding-small-english-r2 leverages both permissively licensed open-source and select proprietary data for enhanced performance. The training data for the base language model was filtered to remove text containing hate, abuse, and profanity. Granite-embedding-small-english-r2 is trained only for English texts, and has a context length of 8192 tokens (longer texts will be truncated to this size). - ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite - 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/ - 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources

NaNK
license:apache-2.0
16
0

NuNER_Zero

15
1

multilingual-sentiment-analysis-ONNX

15
1

owlv2-base-patch16-ensemble-ONNX

15
1

trocr-base-printed-ONNX

15
0

dfine_m_obj2coco-ONNX

15
0

nsfw_image_detection-ONNX

15
0

whisper-medium-fr-ONNX

15
0

Lucy-ONNX

license:apache-2.0
15
0

TinyBERT_General_4L_312D-ONNX

15
0

owlv2-large-patch14-ensemble-ONNX

This is an ONNX version of google/owlv2-large-patch14-ensemble. It was automatically converted and uploaded using this space.

15
0

english-TTS0V1-ONNX

NaNK
15
0

LFM2-2.6B-Exp-ONNX

NaNK
14
4

Llama-3.2-3B

NaNK
llama
14
4

ettin-encoder-32m-ONNX

license:mit
14
0

BiRefNet-HRSOD_DHU-ONNX

license:mit
14
0

NeoBERT-ONNX

NeoBERT is a next-generation encoder model for English text representation, pre-trained from scratch on the RefinedWeb dataset. NeoBERT integrates state-of-the-art advancements in architecture, modern data, and optimized pre-training methodologies. It is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models, relies on an optimal depth-to-width ratio, and leverages an extended context length of 4,096 tokens. Despite its compact 250M parameter footprint, it is the most efficient model of its kind and achieves state-of-the-art results on the massive MTEB benchmark, outperforming BERT large, RoBERTa large, NomicBERT, and ModernBERT under identical fine-tuning conditions. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then compute embeddings using the pipeline API:

NaNK
license:mit
14
0

bert_uncased_L-2_H-128_A-2-ONNX

NaNK
14
0

xlm-roberta-base-finetuned-squad2-ONNX

This is an ONNX version of IProject-10/xlm-roberta-base-finetuned-squad2. It was automatically converted and uploaded using this space.

NaNK
14
0

englishtts-ONNX

This is an ONNX version of devhem/englishtts. It was automatically converted and uploaded using this space.

14
0

nougat-latex-base-ONNX

This is an ONNX version of Norm/nougat-latex-base. It was automatically converted and uploaded using this Hugging Face Space. See the pipeline documentation for `image-to-text`: https://huggingface.co/docs/transformers.js/api/pipelines#modulepipelines.ImageToTextPipeline - Model type: Donut - Finetuned from: facebook/nougat-base - Repository: source code Nougat-LaTeX-based is fine-tuned from facebook/nougat-base with im2latex-100k to boost its proficiency in generating LaTeX code from images. Since the initial encoder input image size of nougat was unsuitable for equation image segments, leading to potential rescaling artifacts that degrades the generation quality of LaTeX code. To address this, Nougat-LaTeX-based adjusts the input resolution and uses an adaptive padding approach to ensure that equation image segments in the wild are resized to closely match the resolution of the training data. Evaluation Evaluated on an image-equation pair dataset collected from Wikipedia, arXiv, and im2latex-100k, curated by lukas-blecher |model| tokenacc ↑ | normed edit distance ↓ | | --- | --- | --- | |pix2tex| 0.5346 | 0.10312 |pix2tex|0.60|0.10| |nougat-latex-based| 0.623850 | 0.06180 | pix2tex is a ResNet + ViT + Text Decoder architecture introduced in LaTeX-OCR. pix2tex: reported from LaTeX-OCR; pix2tex: my evaluation with the released checkpoint ; nougat-latex-based: evaluated on results generated with beam-search strategy. > The inference API widget sometimes cuts the response short. Please check this issue for more details. You may want to run the model yourself in case the inference API bug cuts the results short. 1. Download the repo

license:apache-2.0
14
0

wav2vec2-base-10k-voxpopuli-ft-pl-ONNX

NaNK
license:cc-by-nc-4.0
14
0

trocr-base-plate-number-ONNX

This is an ONNX version of ristek-dsa/trocr-base-plate-number. It was automatically converted and uploaded using this Hugging Face Space. See the pipeline documentation for `image-to-text`: https://huggingface.co/docs/transformers.js/api/pipelines#modulepipelines.ImageToTextPipeline --- license: apache-2.0 pipelinetag: image-to-text tags: - vision widget: - src: https://huggingface.co/datasets/ristek-dsa/sample-images/resolve/main/car-plate-1.png exampletitle: A1651VV - src: https://huggingface.co/datasets/ristek-dsa/sample-images/resolve/main/car-plate-2.png exampletitle: B8857GS ---

14
0

orpheus-3b-0.1-ft-ONNX

NaNK
llama
13
7

gliner_large-v2.1

NaNK
13
2

deberta-small-long-nli

13
2

distilbert-NER-ONNX

This is an ONNX version of dslim/distilbert-NER. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

13
2

all-MiniLM-L6-v2-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to compute embeddings like this: You can convert this Tensor to a nested JavaScript array using `.tolist()`:

NaNK
license:apache-2.0
13
2

whisper-small.en_timestamped

13
1

BiRefNet-portrait-ONNX

license:mit
13
0

Qwen2.5-Math-1.5B-Instruct

NaNK
13
0

10-animals-classification-ONNX

This is an ONNX version of AliGhiasvand86/10-animals-classification. It was automatically converted and uploaded using this space.

13
0

wav2vec2-base-960h-ONNX

This is an ONNX version of facebook/wav2vec2-base-960h. It was automatically converted and uploaded using this space.

13
0

whisper-podlodka-turbo-ONNX

This is an ONNX version of bond005/whisper-podlodka-turbo. It was automatically converted and uploaded using this space.

13
0

bert-base-chinese-ONNX

This is an ONNX version of google-bert/bert-base-chinese. It was automatically converted and uploaded using this space.

13
0

ukr-emotions-classifier-ONNX

This is an ONNX version of ukr-detect/ukr-emotions-classifier. It was automatically converted and uploaded using this space.

13
0

RexBERT-mini-ONNX

This is an ONNX version of thebajajra/RexBERT-mini. It was automatically converted and uploaded using this space.

13
0

Youtu-LLM-2B-ONNX

NaNK
12
2

Speech-Emotion-Classification-ONNX

12
2

yolov10b

NaNK
license:agpl-3.0
12
1

emotion-english-distilroberta-base-ONNX

12
1

BiRefNet_512x512-ONNX

NaNK
license:mit
12
1

gpt2-medium-ONNX

12
1

rfdetr_medium-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Perform object-detection with `onnx-community/rfdetrmedium-ONNX`.

license:apache-2.0
12
1

Falcon-H1-Tiny-Coder-90M-ONNX

NaNK
12
0

BiRefNet-DIS5K-TR_TEs-ONNX

license:mit
12
0

dpt-dinov2-small-nyu

12
0

MobileLLM-1B

NaNK
12
0

whisper-large-v3-turbo-german-ONNX

12
0

ModernCE-base-sts-ONNX

12
0

vitpose-plus-base-ONNX

This is an ONNX version of usyd-community/vitpose-plus-base. It was automatically converted and uploaded using this space.

12
0

sbert_large_nlu_ru-ONNX

This is an ONNX version of KseniyaZ/sbertlargenluru. It was automatically converted and uploaded using this space.

12
0

mdbr-leaf-mt-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to compute embeddings like this: Query: What is machine learning? Similarity: 0.9063 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data. Similarity: 0.7287 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors. Query: How does neural network training work? Similarity: 0.6725 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data. Similarity: 0.8287 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors. ```

license:apache-2.0
12
0

tiny-starcoder-instruct-ONNX

12
0

stsb-xlm-r-greek-transfer-ONNX

Semantic Textual Similarity for the Greek language using Transformers and Transfer Learning By the Hellenic Army Academy (SSE) and the Technical University of Crete (TUC) > The model was manually converted to ONNX format. The original model is available here. This is a sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search. We follow a Teacher-Student transfer learning approach described here to train an XLM-Roberta-base model on STS using parallel EN-EL sentence pairs. Using this model becomes easy when you have sentence-transformers installed: Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings. Similarity Evaluation on STS.en-el.txt (translated manually for evaluation purposes) We measure the semantic textual similarity (STS) between sentence pairs in different languages: | cosinepearson | cosinespearman | euclideanpearson | euclideanspearman | manhattanpearson | manhattanspearman | dotpearson | dotspearman | | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | 0.834474802920369 | 0.845687403828107 | 0.815895882192263 | 0.81084300966291 | 0.816333562677654 | 0.813879742416394 | 0.7945167996031 | 0.802604238383742 | Translation We measure the translation accuracy. Given a list with source sentences, for example, 1000 English sentences. And a list with matching target (translated) sentences, for example, 1000 Greek sentences. For each sentence pair, we check if their embeddings are the closest using cosine similarity. I.e., for each srcsentences[i] we check if trgsentences[i] has the highest similarity out of all target sentences. If this is the case, we have a hit, otherwise an error. This evaluator reports accuracy (higher = better). | src2trg | trg2src | | ----------- | ----------- | | 0.981 | 0.9775 | Training The model was trained with the parameters: `torch.utils.data.dataloader.DataLoader` of length 135121 with parameters: Acknowledgement The research work was supported by the Hellenic Foundation for Research and Innovation (HFRI) under the HFRI PhD Fellowship grant (Fellowship Number:50, 2nd call) Citing & Authors Citation info for Greek model: TBD Based on the transfer learning approach of Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

license:apache-2.0
12
0

nougat-small-ONNX

This is an ONNX version of facebook/nougat-small. It was automatically converted and uploaded using this Hugging Face Space. See the pipeline documentation for `image-to-text`: https://huggingface.co/docs/transformers.js/api/pipelines#modulepipelines.ImageToTextPipeline Nougat model trained on PDF-to-markdown. It was introduced in the paper Nougat: Neural Optical Understanding for Academic Documents by Blecher et al. and first released in this repository. Disclaimer: The team releasing Nougat did not write a model card for this model so this model card has been written by the Hugging Face team. Note: this model corresponds to the "0.1.0-small" version of the original repository. Nougat is a Donut model trained to transcribe scientific PDFs into an easy-to-use markdown format. The model consists of a Swin Transformer as vision encoder, and an mBART model as text decoder. The model is trained to autoregressively predict the markdown given only the pixels of the PDF image as input. Nougat high-level overview. Taken from the original paper . You can use the raw model for transcribing a PDF into Markdown. See the model hub to look for other fine-tuned versions that may interest you.

NaNK
license:cc-by-4.0
12
0

wespeaker-voxceleb-resnet34-LM

license:cc-by-4.0
11
2

LFM2-1.2B-Extract-ONNX

Based on LFM2-1.2B, LFM2-1.2B-Extract is designed to extract important information from a wide variety of unstructured documents (such as articles, transcripts, or reports) into structured outputs like JSON, XML, or YAML. - Extracting invoice details from emails into structured JSON. - Converting regulatory filings into XML for compliance systems. - Transforming customer support tickets into YAML for analytics pipelines. - Populating knowledge graphs with entities and attributes from unstructured reports. You can find more information about other task-specific models in this blog post. Generation parameters: We strongly recommend using greedy decoding with a `temperature=0`. System prompt: If no system prompt is provided, the model will default to JSON outputs. We recommend providing a system prompt with a specific format (JSON, XML, or YAML) and a given schema to improve accuracy (see the following example). Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, Portuguese, and Spanish. Chat template: LFM2 uses a ChatML-like chat template as follows: You can automatically apply it using the dedicated `.applychattemplate()` function from Hugging Face transformers. > [!WARNING] > ⚠️ The model is intended for single-turn conversations. The data used for training these models was primarily synthetic, which allowed us to ensure a diverse data mix. We used a range of document types, domains, styles, lengths, and languages. We also varied the density and distribution of relevant text in the documents. In some cases, the extracted information was clustered in one part of the document; in others, it’s spread throughout. We applied the same approach of ensuring diversity when creating synthetic user requests and designing the structure of the model outputs. The data generation process underwent many iterations, incorporating ideas and feedback from across the Liquid AI team. We evaluated LFM2-Extract on a dataset of 5,000 documents, covering over 100 topics with a mix of writing styles, ambiguities, and formats. We used a combination of five metrics to capture a balanced view on syntax, accuracy, and faithfulness: - Syntax score: Checks whether outputs parse cleanly as valid JSON, XML, or YAML. - Format accuracy: Verifies that outputs match the requested format (e.g., JSON when JSON is requested). - Keyword faithfulness: Measures whether values in the structured output actually appear in the input text. - Absolute scoring: A judge LLM scores quality on a 1-5 scale, assessing completeness and correctness of extractions. - Relative scoring: We ask a judge LLM to choose the best answer between the extraction model’s output and the ground-truth answer. LFM2-1.2B-Extract can output complex objects in different languages on a level higher than Gemma 3 27B, a model 22.5 times its size. - Hugging Face: LFM2-1.2B - llama.cpp: LFM2-1.2B-Extract-GGUF - LEAP: LEAP model library If you are interested in custom solutions with edge deployment, please contact our sales team.

NaNK
11
2

deberta-base-long-nli

11
1

Olmo-3-7B-Instruct-ONNX

NaNK
license:apache-2.0
11
0

Baguettotron-ONNX

llama
11
0

vit-face-expression-ONNX

11
0

rtdetr_v2_r18vd-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Perform object-detection with `onnx-community/rtdetrv2r18vd-ONNX`.

license:apache-2.0
11
0

bert-tiny-finetuned-sms-spam-detection-ONNX

This is an ONNX version of mrm8488/bert-tiny-finetuned-sms-spam-detection. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Classify SMS messages as spam or not spam.

11
0

whisper-tiny-ONNX

This is an ONNX version of openai/whisper-tiny. It was automatically converted and uploaded using this space.

11
0

owlv2-base-patch16-ONNX

NaNK
11
0

grammar_error_correcter_v1-ONNX

NaNK
11
0

bart-large-mnli-ONNX

11
0

gpt2-mini-ONNX

This is an ONNX version of erwanf/gpt2-mini. It was automatically converted and uploaded using this space.

11
0

mgp-str-base

10
2

rtdetr_v2_r34vd-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Perform object-detection with `onnx-community/rtdetrv2r34vd-ONNX`.

license:apache-2.0
10
1

canary-qwen-2.5b-ONNX

NaNK
10
1

scibert_scivocab_uncased-ONNX

10
0

BiRefNet-COD-ONNX

license:mit
10
0

maskformer-swin-base-ade

10
0

mobilenetv4s-webnn

license:apache-2.0
10
0

xlm-roberta-large-finetuned-conll03-english-ONNX

This is an ONNX version of FacebookAI/xlm-roberta-large-finetuned-conll03-english. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

10
0

mobilebert-uncased-ONNX

10
0

CLIP-ViT-L-14-DataComp.XL-s13B-b90K-ONNX

This is an ONNX version of laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K. It was automatically converted and uploaded using this space.

NaNK
10
0

musical-instruments-ONNX

This is an ONNX version of larynx1982/musical-instruments. It was automatically converted and uploaded using this space.

10
0

multilingual-IPTC-news-topic-classifier-ONNX

This is an ONNX version of classla/multilingual-IPTC-news-topic-classifier. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Text classification with a multilingual news topic classifier.

10
0

dinov3-convnext-base-pretrain-lvd1689m-ONNX

10
0

dinov3-vits16-pretrain-lvd1689m-ONNX-MHA

10
0

Bitnet-SmolLM-135M-ONNX

This is an ONNX version of ighoshsubho/Bitnet-SmolLM-135M. It was automatically converted and uploaded using this space.

llama
10
0

gpt2-alpaca-gpt4-ONNX

This is an ONNX version of vicgalle/gpt2-alpaca-gpt4. It was automatically converted and uploaded using this space.

NaNK
10
0

tiny-random-gpt2-ONNX

NaNK
10
0

deberta-v3-large-zeroshot-v2.0-ONNX

NaNK
10
0

layoutlmv3-finetuned-invoice-ONNX

This is an ONNX version of ronak1998/layoutlmv3-finetuned-invoice. It was automatically converted and uploaded using this space.

10
0

siglip2-so400m-patch14-384-ONNX

NaNK
9
2

AFM-4.5B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

NaNK
license:apache-2.0
9
2

LFM2-350M-Extract-ONNX

Based on LFM2-350M, LFM2-350M-Extract is designed to extract important information from a wide variety of unstructured documents (such as articles, transcripts, or reports) into structured outputs like JSON, XML, or YAML. - Extracting invoice details from emails into structured JSON. - Converting regulatory filings into XML for compliance systems. - Transforming customer support tickets into YAML for analytics pipelines. - Populating knowledge graphs with entities and attributes from unstructured reports. You can find more information about other task-specific models in this blog post. Generation parameters: We strongly recommend using greedy decoding with a `temperature=0`. System prompt: If no system prompt is provided, the model will default to JSON outputs. We recommend providing a system prompt with a specific format (JSON, XML, or YAML) and a given schema to improve accuracy (see the following example). Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, Portuguese, and Spanish. Chat template: LFM2 uses a ChatML-like chat template as follows: You can automatically apply it using the dedicated `.applychattemplate()` function from Hugging Face transformers. > [!WARNING] > ⚠️ The model is intended for single-turn conversations. The data used for training these models was primarily synthetic, which allowed us to ensure a diverse data mix. We used a range of document types, domains, styles, lengths, and languages. We also varied the density and distribution of relevant text in the documents. In some cases, the extracted information was clustered in one part of the document; in others, it’s spread throughout. We applied the same approach of ensuring diversity when creating synthetic user requests and designing the structure of the model outputs. The data generation process underwent many iterations, incorporating ideas and feedback from across the Liquid AI team. We evaluated LFM2-Extract on a dataset of 5,000 documents, covering over 100 topics with a mix of writing styles, ambiguities, and formats. We used a combination of five metrics to capture a balanced view on syntax, accuracy, and faithfulness: - Syntax score: Checks whether outputs parse cleanly as valid JSON, XML, or YAML. - Format accuracy: Verifies that outputs match the requested format (e.g., JSON when JSON is requested). - Keyword faithfulness: Measures whether values in the structured output actually appear in the input text. - Absolute scoring: A judge LLM scores quality on a 1-5 scale, assessing completeness and correctness of extractions. - Relative scoring: We ask a judge LLM to choose the best answer between the extraction model’s output and the ground-truth answer. LFM2-350M-Extract outperforms Gemma 3 4B at this task, a model more than 11x its size. - Hugging Face: LFM2-350M - llama.cpp: LFM2-350M-Extract-GGUF - LEAP: LEAP model library If you are interested in custom solutions with edge deployment, please contact our sales team.

9
2

maskformer-swin-tiny-ade

9
1

granite-3.0-2b-instruct

NaNK
9
1

resnet-50-ONNX

This is an ONNX version of microsoft/resnet-50. It was automatically converted and uploaded using this space.

NaNK
9
1

Lucy-128k-ONNX

license:apache-2.0
9
1

t5-large-ONNX

9
1

maskformer-swin-tiny-coco

9
0

Qwen2.5-Math-1.5B

NaNK
9
0

lite-whisper-large-v3-ONNX

NaNK
license:apache-2.0
9
0

DAC.speech.v1.0-1.5kbps

Audio decoder of https://huggingface.co/ibm-research/DAC.speech.v1.0, converted to ONNX to be compatible with Transformers.js

NaNK
9
0

fairface_gender_image_detection-ONNX

This is an ONNX version of dima806/fairfacegenderimagedetection. It was automatically converted and uploaded using this space.

9
0

vit-base-violence-detection-ONNX

9
0

TinyLlama_v1.1-ONNX

This is an ONNX version of TinyLlama/TinyLlamav1.1. It was automatically converted and uploaded using this space.

NaNK
llama
9
0

SmolLM2-135M-humanized-ONNX

llama
9
0

xlm-roberta-base-ONNX

This is an ONNX version of FacebookAI/xlm-roberta-base. It was automatically converted and uploaded using this space.

9
0

PhoWhisper-base-ONNX

This is an ONNX version of vinai/PhoWhisper-base. It was automatically converted and uploaded using this space.

9
0

gpt2-open-instruct-v1-ONNX

This is an ONNX version of vicgalle/gpt2-open-instruct-v1. It was automatically converted and uploaded using this space.

NaNK
9
0

mobilenetv3_small_100.lamb_in1k

NaNK
8
1

rtdetr_r50vd

8
1

opus-mt-zh-en

license:cc-by-4.0
8
1

roberta_toxicity_classifier-ONNX

8
1

arabic-ner-ONNX

8
1

gliner_small-v2

NaNK
8
0

BiRefNet-DIS5K-ONNX

license:mit
8
0

opus-mt-mul-en

license:cc-by-4.0
8
0

dpt-dinov2-base-kitti

8
0

MobileLLM-600M

8
0

siglip2-so400m-patch16-256-ONNX

NaNK
8
0

rtdetr_v2_r101vd-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Perform object-detection with `onnx-community/rtdetrv2r101vd-ONNX`.

license:apache-2.0
8
0

dfine_n_coco-ONNX

8
0

flan-t5-base-ONNX

8
0

distilbert-base-cased-finetuned-conll03-english-ONNX

distilbert-base-cased-finetuned-conll03-english (ONNX) This is an ONNX version of elastic/distilbert-base-cased-finetuned-conll03-english. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

8
0

Qwen1.5-0.5B-Chat-ONNX

This is an ONNX version of Qwen/Qwen1.5-0.5B-Chat. It was automatically converted and uploaded using this space.

NaNK
8
0

bert-large-cased-ONNX

8
0

Qwen2-0.5B-ONNX

NaNK
8
0

gpt2-large-ONNX

8
0

vit-base-patch16-224-ONNX

This is an ONNX version of google/vit-base-patch16-224. It was automatically converted and uploaded using this space.

NaNK
8
0

DialoGPT-small-player_03-ONNX

NaNK
8
0

siglip2-base-patch16-naflex-ONNX

8
0

mms-1b-all-ONNX

This is an ONNX version of facebook/mms-1b-all. It was automatically converted and uploaded using this space.

NaNK
8
0

MobileLLM-R1-360M-ONNX

llama4_text
8
0

mdbr-leaf-ir-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to compute embeddings like this: Query: What is machine learning? Similarity: 0.6857 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data. Similarity: 0.4598 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors. Query: How does neural network training work? Similarity: 0.4238 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data. Similarity: 0.5723 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors. ```

license:apache-2.0
8
0

mobilenet_v2_1.4_224-ONNX

This is an ONNX version of google/mobilenetv21.4224. It was automatically converted and uploaded using this space.

NaNK
8
0

gpt2-alpaca-ONNX

8
0

deberta-v3-large-squad2-ONNX

This is an ONNX version of sjrhuschlee/deberta-v3-large-squad2. It was automatically converted and uploaded using this space.

NaNK
8
0

sapiens-seg-0.3b

NaNK
7
3

mediapipe_selfie_segmentation_landscape

NaNK
license:apache-2.0
7
3

MobileLLM-R1-950M-ONNX

llama4_text
7
3

Llama-Guard-3-1B

NaNK
llama
7
2

MobileLLM-125M

7
2

Phi-4-mini-instruct-web-q4f16

7
2

LFM2-1.2B-RAG-ONNX

NaNK
7
2

Apertus-8B-Instruct-2509-ONNX

NaNK
license:apache-2.0
7
1

twitter-xlm-roberta-base-sentiment-ONNX

7
1

grammar-synthesis-small-ONNX

7
1

bert-base-multilingual-uncased-ONNX

Pretrained model on the top 102 languages with the largest Wikipedia using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. This model is uncased: it does not make a difference between english and English. Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by the Hugging Face team. BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it was pretrained with two objectives: - Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run the entire masked sentence through the model and has to predict the masked words. This is different from traditional recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the sentence. - Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to predict if the two sentences were following each other or not. This way, the model learns an inner representation of the languages in the training set that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard classifier using the features produced by the BERT model as inputs. You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. For tasks such as text generation you should look at model like GPT2. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use this model directly with a pipeline for masked language modeling: Even if the training data used for this model could be characterized as fairly neutral, this model can have biased predictions. This bias will also affect all fine-tuned versions of this model. The BERT model was pretrained on the 102 languages with the largest Wikipedias. You can find the complete list here.

license:apache-2.0
7
0

decision-transformer-gym-walker2d-medium

7
0

mobilenetv3_small_075.lamb_in1k

NaNK
7
0

opus-mt-en-de

license:cc-by-4.0
7
0

jais-family-590m-chat

7
0

dpt-dinov2-base-nyu

7
0

dpt-dinov2-large-nyu

7
0

dpt-dinov2-large-kitti

7
0

maskformer-swin-large-ade

7
0

Qwen2.5-Coder-1.5B

NaNK
7
0

OLMo-1B-hf

NaNK
7
0

distilgpt2-ONNX

7
0

Pleias-Nano

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Text generation with `onnx-community/Pleias-Nano`. Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

llama
7
0

paligemma2-3b-pt-448

NaNK
7
0

siglip2-giant-opt-patch16-384-ONNX

NaNK
7
0

Phi-3.5-mini-instruct-ONNX-GQA

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

7
0

rtdetr_v2_r50vd-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Perform object-detection with `onnx-community/rtdetrv2r50vd-ONNX`.

license:apache-2.0
7
0

wav2vec2-base-Speech_Emotion_Recognition-ONNX

This is an ONNX version of DunnBC22/wav2vec2-base-SpeechEmotionRecognition. It was automatically converted and uploaded using this space.

7
0

owlv2-large-patch14-finetuned-ONNX

This is an ONNX version of google/owlv2-large-patch14-finetuned. It was automatically converted and uploaded using this space.

7
0

roberta-base-squad2-ONNX

NaNK
7
0

rubert-base-cased-ONNX

7
0

vitpose-plus-small-ONNX

7
0

mbert-ONNX

7
0

nb-llama-3.2-1B-ONNX

This is an ONNX version of NbAiLab/nb-llama-3.2-1B. It was automatically converted and uploaded using this space.

NaNK
llama
7
0

BalloonDetectioDTR-ONNX

This is an ONNX version of nicky007/BalloonDetectioDTR. It was automatically converted and uploaded using this space.

7
0

ettin-decoder-32m-ONNX

license:mit
7
0

llama-3.2-1b-medical-notes-ONNX

NaNK
llama
7
0

whisper-small-ita-ONNX

7
0

bge-base-en-v1.5-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use the model to compute embeddings, as follows: You can also use the model for retrieval. For example:

NaNK
license:mit
7
0

bert-mini-ONNX

This is an ONNX version of prajjwal1/bert-mini. It was automatically converted and uploaded using this space.

7
0

layoutlmv3-large-ONNX

This is an ONNX version of microsoft/layoutlmv3-large. It was automatically converted and uploaded using this space.

NaNK
7
0

bert-multilingual-toxicity-classifier-ONNX

This is an ONNX version of textdetox/bert-multilingual-toxicity-classifier. It was automatically converted and uploaded using this space.

7
0

whisper-small-cantonese-ONNX

7
0

trocr-base-stage1-ONNX

This is an ONNX version of microsoft/trocr-base-stage1. It was automatically converted and uploaded using this space.

NaNK
7
0

rugpt3large_based_on_gpt2-ONNX

This is an ONNX version of ai-forever/rugpt3largebasedongpt2. It was automatically converted and uploaded using this space.

NaNK
7
0

whisper-base.en-ONNX

This is an ONNX version of openai/whisper-base.en. It was automatically converted and uploaded using this space.

7
0

RuModernBERT-base-ONNX

This is an ONNX version of deepvk/RuModernBERT-base. It was automatically converted and uploaded using this space.

7
0

bert-finetuned-phishing-ONNX

7
0

Biggie-SmoLlm-0.4B-ONNX

This is an ONNX version of nisten/Biggie-SmoLlm-0.4B. It was automatically converted and uploaded using this space.

NaNK
llama
7
0

code-autocomplete-gpt2-base-ONNX

This is an ONNX version of shibing624/code-autocomplete-gpt2-base. It was automatically converted and uploaded using this space.

7
0

tiny_starcoder_py-ONNX

7
0

OmniParser-v2.0_icon_caption

license:mit
6
3

siglip2-so400m-patch16-512-ONNX

NaNK
6
3

ModernBERT-base-nli-ONNX

This is an ONNX version of tasksource/ModernBERT-base-nli. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

6
3

ERNIE-4.5-0.3B-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
6
2

Pleias-Pico

llama
6
2

t5-base-ONNX

6
2

Musical-genres-Classification-Hubert-V1-ONNX

NaNK
6
2

NeuroBERT-NER-ONNX

This is an ONNX version of boltuix/NeuroBERT-NER. It was automatically converted and uploaded using this space. If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

6
2

CrisperWhisper-ONNX

6
2

sapiens-seg-0.6b

NaNK
6
1

dinov2-small

6
1

moondream2.text_model-ONNX

6
1

Arch-Function-1.5B

NaNK
6
1

Phi-3-vision-128k-instruct

6
1

helium-1-preview-2b-ONNX

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Text-generation w/ `onnx-community/helium-1-preview-2b-ONNX` Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

NaNK
license:cc-by-4.0
6
1

SmolLM2-135M-Instruct-ONNX-GQA

llama
6
1

Dia-1.6B-0626-ONNX

NaNK
6
1

ettin-decoder-17m-ONNX

license:mit
6
1

TinyCLIP-ViT-39M-16-Text-19M-YFCC15M-ONNX

6
1

trlm-135m-ONNX

This is an ONNX version of Shekswess/trlm-135m. It was automatically converted and uploaded using this space.

llama
6
1

decision-transformer-gym-halfcheetah-expert

6
0

mobilenetv4_conv_medium.e500_r224_in1k

NaNK
6
0

mobilenet_v2_1.4_224

NaNK
6
0

mobilenetv3_large_100.miil_in21k_ft_in1k

NaNK
6
0

rtdetr_r18vd

6
0

opus-mt-en-zh

license:cc-by-4.0
6
0

hiera-huge-224-hf

6
0

AMD-OLMo-1B-SFT-DPO

NaNK
6
0

dinov2-with-registers-giant

6
0

dinov2-with-registers-giant-imagenet1k-1-layer

6
0

siglip2-large-patch16-512-ONNX

NaNK
6
0

Llasa-1B-ONNX

NaNK
llama
6
0

dfine_l_coco-ONNX

6
0

clip-vit-base-patch16-ONNX

NaNK
6
0

N1-ONNX

This is an ONNX version of GoofyLM/N1. It was automatically converted and uploaded using this space.

NaNK
llama
6
0

whisper-small-tonga-ONNX

6
0

ettin-decoder-150m-ONNX

license:mit
6
0

dinov2-large-ONNX

This is an ONNX version of facebook/dinov2-large. It was automatically converted and uploaded using this space.

6
0

owlvit-base-patch32-ONNX

This is an ONNX version of google/owlvit-base-patch32. It was automatically converted and uploaded using this space.

NaNK
6
0

bart-large-cnn-ONNX

This is an ONNX version of facebook/bart-large-cnn. It was automatically converted and uploaded using this space.

6
0

distilbart-mnli-12-3-ONNX

NaNK
6
0

TinyStories-Instruct-33M-ONNX

This is an ONNX version of roneneldan/TinyStories-Instruct-33M. It was automatically converted and uploaded using this space.

6
0

Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mammalian_lithe_barracuda-ONNX

NaNK
6
0

swearwords-detection-model-ONNX

This is an ONNX version of keatrean/swearwords-detection-model. It was automatically converted and uploaded using this space.

6
0

rubert-tiny-ONNX

6
0

prem-1B-SQL-ONNX

NaNK
llama
6
0

Gpt2-Wikitext-9180-ONNX

This is an ONNX version of prithivMLmods/Gpt2-Wikitext-9180. It was automatically converted and uploaded using this space.

NaNK
6
0

whisper-large-v3-turbo-ONNX

5
2

Qwen2.5-0.5B-Instruct-ONNX-MHA

NaNK
5
2

rfdetr_base-ONNX

license:apache-2.0
5
2

gliner_large-v2

NaNK
5
1

gliner-multitask-large-v0.5

NaNK
5
1

AMD-OLMo-1B

NaNK
5
1

DeepScaleR-1.5B-Preview-ONNX

NaNK
5
1

layoutlmv3-base-ONNX

This is an ONNX version of microsoft/layoutlmv3-base. It was automatically converted and uploaded using this space.

NaNK
5
1

bert-base-multilingual-cased-ONNX

Pretrained model on the top 104 languages with the largest Wikipedia using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. This model is case sensitive: it makes a difference between english and English. Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by the Hugging Face team. BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it was pretrained with two objectives: - Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run the entire masked sentence through the model and has to predict the masked words. This is different from traditional recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the sentence. - Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to predict if the two sentences were following each other or not. This way, the model learns an inner representation of the languages in the training set that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard classifier using the features produced by the BERT model as inputs. You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. For tasks such as text generation you should look at model like GPT2. If you haven't already, you can install the Transformers.js JavaScript library from NPM using: You can then use this model directly with a pipeline for masked language modeling: The BERT model was pretrained on the 104 languages with the largest Wikipedias. You can find the complete list here.

license:apache-2.0
5
0

rtdetr_r18vd_coco_o365

NaNK
5
0

rtdetr_r34vd

5
0

rtdetr_r101vd_coco_o365

NaNK
5
0

opus-mt-tc-big-tr-en

license:cc-by-4.0
5
0

hiera-huge-224-in1k-hf

5
0

maskformer-swin-base-coco

5
0

Conan-embedding-v1

NaNK
5
0

MobileLLM-350M

5
0

AMD-OLMo-1B-SFT

NaNK
5
0

camembertv2-base-ftb-ner

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

5
0

ijepa_vith16_1k

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Image feature extraction with `onnx-community/ijepavith161k`. Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

5
0

ijepa_vith14_22k

If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Example: Image feature extraction with `onnx-community/ijepavith1422k`. Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).

5
0

Pleias-1.2b-Preview

NaNK
llama
5
0

paligemma2-3b-pt-896

NaNK
5
0

SmallThinker-3B-Preview-ONNX

NaNK
5
0

glm-edge-1.5b-chat-ONNX

NaNK
5
0

siglip2-large-patch16-384-ONNX

NaNK
5
0

siglip2-so400m-patch14-224-ONNX

NaNK
5
0

siglip2-so400m-patch16-384-ONNX

NaNK
5
0

lite-whisper-large-v3-fast-ONNX

license:apache-2.0
5
0

lite-whisper-large-v3-acc-ONNX

license:apache-2.0
5
0

DeepCoder-1.5B-Preview-ONNX

NaNK
5
0

deepseek-coder-1.3b-instruct-ONNX

NaNK
llama
5
0

vit-tiny-patch16-224-ONNX

NaNK
5
0