OpenVINO
bge-base-en-v1.5-fp16-ov
Model creator: BAAI
Original model: bge-base-en-v1.5

Description
This is the bge-base-en-v1.5 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.3.0 and higher
- Optimum Intel 1.25.2 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the MIT license. More details can be found in bge-base-en-v1.5.

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
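As a minimal sketch of the Optimum Intel usage the card describes, the snippet below loads the IR for feature extraction and returns normalized sentence embeddings. The model id, the CLS-token pooling, and the L2 normalization follow the upstream bge-base-en-v1.5 conventions; treat all three as assumptions to verify against the repository you actually download (`pip install optimum[openvino]` is assumed).

```python
import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """L2-normalize each row so cosine similarity becomes a plain dot product."""
    return vectors / np.linalg.norm(vectors, axis=-1, keepdims=True)


def embed(texts, model_id: str = "OpenVINO/bge-base-en-v1.5-fp16-ov") -> np.ndarray:
    # Imported lazily so the helper above stays usable without OpenVINO installed.
    from optimum.intel import OVModelForFeatureExtraction
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForFeatureExtraction.from_pretrained(model_id)
    batch = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    outputs = model(**batch)
    # BGE-style models take the [CLS] (first-token) vector as the sentence embedding.
    cls = outputs.last_hidden_state[:, 0].detach().numpy()
    return normalize(cls)
```

`embed(["query", "passage"])` then yields one unit-length vector per input, ready for cosine-similarity search.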
Llama-3.1-8B-Instruct-FastDraft-150M-int8-ov
Qwen2.5-1.5B-Instruct-int4-ov
Qwen3-Embedding-0.6B-int8-ov
TinyLlama-1.1B-Chat-v1.0-int8-ov
distil-whisper-large-v3-int8-ov
whisper-large-v3-fp16-ov
Qwen3-Reranker-0.6B-fp16-ov
Model creator: Qwen
Original model: Qwen3-Reranker-0.6B

Description
This is the Qwen3-Reranker-0.6B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.4.0 and higher
- Optimum Intel 1.26.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-Reranker-0.6B.
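A hedged sketch of how such a reranker can be driven through Optimum Intel: Qwen3 rerankers are causal LMs that judge query–document relevance, and upstream documentation scores pairs via a softmax over the "yes"/"no" logits of the last token. The prompt template below is a simplified stand-in, and the exact template, token ids, and scoring scheme must be checked against the original model card (`pip install optimum[openvino]` assumed).

```python
import math


def yes_no_score(yes_logit: float, no_logit: float) -> float:
    """Softmax over the two judgment logits; returns P(relevant)."""
    m = max(yes_logit, no_logit)
    ey = math.exp(yes_logit - m)
    en = math.exp(no_logit - m)
    return ey / (ey + en)


def rerank(query: str, document: str,
           model_id: str = "OpenVINO/Qwen3-Reranker-0.6B-fp16-ov") -> float:
    # Lazy imports keep the scoring helper testable without OpenVINO installed.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForCausalLM.from_pretrained(model_id)
    # Hypothetical minimal prompt; the upstream card defines the exact template.
    prompt = f"Query: {query}\nDocument: {document}\nRelevant (yes/no):"
    inputs = tokenizer(prompt, return_tensors="pt")
    logits = model(**inputs).logits[0, -1]
    yes_id = tokenizer.convert_tokens_to_ids("yes")
    no_id = tokenizer.convert_tokens_to_ids("no")
    return yes_no_score(float(logits[yes_id]), float(logits[no_id]))
```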
DeepSeek-R1-Distill-Qwen-1.5B-int4-ov
whisper-base-int8-ov
DeepSeek-R1-Distill-Qwen-7B-int4-ov
Phi-3.5-mini-instruct-int4-ov
whisper-large-v3-int8-ov
Qwen3-4B-int4-ov
whisper-tiny-fp16-ov
Qwen3-8B-int4-ov
Mistral-7B-Instruct-v0.2-int4-ov
TinyLlama-1.1B-Chat-v1.0-int4-ov
whisper-medium-int8-ov
Phi-3-mini-4k-instruct-int4-ov
Mistral-7B-Instruct-v0.3-int4-cw-ov
Model creator: Mistral AI
Original model: Mistral-7B-Instruct-v0.3

This is the Mistral-7B-Instruct-v0.3 model converted to the OpenVINO™ IR (Intermediate Representation) format.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
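The GenAI usage the card refers to can be sketched as below: the OpenVINO GenAI `LLMPipeline` loads the downloaded IR folder and targets the NPU device (assumes `pip install openvino-genai` and a local copy of the model; the model directory path is yours to supply). The formatting helper only illustrates the Mistral-instruct prompt shape; in practice the pipeline applies the model's own chat template.

```python
def mistral_prompt(user_message: str) -> str:
    """Mistral-instruct chat shape, shown for illustration only."""
    return f"[INST] {user_message} [/INST]"


def generate(model_dir: str, prompt: str, device: str = "NPU") -> str:
    # Imported lazily so the formatting helper works without openvino-genai installed.
    import openvino_genai as ov_genai

    pipe = ov_genai.LLMPipeline(model_dir, device)  # "CPU" / "GPU" also work
    return pipe.generate(prompt, max_new_tokens=128)
```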
Qwen3-0.6B-fp16-ov
Model creator: Qwen
Original model: Qwen3-0.6B

Description
This is the Qwen3-0.6B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-0.6B.
gpt-oss-20b-int4-ov
bge-base-en-v1.5-int8-ov
Qwen3-0.6B-int4-ov
Model creator: Qwen
Original model: Qwen3-0.6B

Description
This is the Qwen3-0.6B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-0.6B.
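A minimal sketch of the Optimum Intel path the card mentions, assuming `pip install optimum[openvino]` and taking the model id from this card's name: `OVModelForCausalLM` runs the INT4 IR through the familiar `transformers` generate API.

```python
def chat(prompt: str, model_id: str = "OpenVINO/Qwen3-0.6B-int4-ov") -> str:
    # Imported lazily; requires optimum[openvino] at call time.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForCausalLM.from_pretrained(model_id)
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```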
bge-reranker-base-int8-ov
Qwen3-8B-int4-cw-ov
Phi-4-mini-instruct-int4-ov
Phi-3.5-vision-instruct-int4-ov
Model creator: Microsoft
Original model: Phi-3.5-vision-instruct

This is the microsoft/Phi-3.5-vision-instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 using Activation-aware Weight Quantization (AWQ) by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:
- mode: INT4_ASYM
- ratio: 1.0
- group_size: 128
- awq: True
- dataset: contextual
- num_samples: 32

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.0.0 and higher
- Optimum Intel 1.21.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

The original model is distributed under the MIT license. More details can be found in the original model card.
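The compression settings above map directly onto an `nncf.compress_weights` call. The sketch below shows that mapping under stated assumptions: `ov_model` is a hypothetical, already-loaded `openvino.Model`, `calibration_items` stands in for the "contextual" calibration data (upstream used 32 samples), and `pip install nncf openvino` is assumed.

```python
def compress_int4_awq(ov_model, calibration_items):
    # Imported lazily; requires nncf at call time.
    import nncf

    return nncf.compress_weights(
        ov_model,
        mode=nncf.CompressWeightsMode.INT4_ASYM,   # mode: INT4_ASYM
        ratio=1.0,                                  # ratio: 1.0
        group_size=128,                             # group_size: 128
        awq=True,                                   # awq: True
        dataset=nncf.Dataset(calibration_items),    # upstream: contextual, 32 samples
    )
```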
Qwen3-Coder-30B-A3B-Instruct-int4-ov
whisper-tiny-int8-ov
whisper-base-fp16-ov
Phi-3.5-mini-instruct-int4-cw-ov
Model creator: microsoft
Original model: Phi-3.5-mini-instruct

Description
This is the Phi-3.5-mini-instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations.

The original model is distributed under the MIT license. More details can be found in the original model card.
Qwen3-30B-A3B-Instruct-2507-int4-ov
whisper-medium-fp16-ov
whisper-small-fp16-ov
Qwen2.5-7B-Instruct-int4-ov
Phi-3-mini-4k-instruct-int4-cw-ov
neural-chat-7b-v3-3-int4-ov
whisper-small-int8-ov
Mistral-7B-Instruct-v0.2-int4-cw-ov
Model creator: Mistral AI
Original model: Mistral-7B-Instruct-v0.2

This is the Mistral-7B-Instruct-v0.2 model converted to the OpenVINO™ IR (Intermediate Representation) format.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
InternVL2-1B-int8-ov
Model creator: OpenGVLab
Original model: InternVL2-1B

This is the OpenGVLab/InternVL2-1B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:
- mode: INT8_ASYM

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.0.0 and higher
- Optimum Intel 1.21.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

The original model is distributed under the MIT license. More details can be found in the original model card.
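For the GenAI route the card mentions, a vision-language model like this one runs through OpenVINO GenAI's `VLMPipeline`. A minimal sketch, assuming `pip install openvino-genai pillow numpy` and a local copy of the IR folder (the directory and image paths are yours to supply):

```python
def describe_image(model_dir: str, image_path: str,
                   prompt: str = "Describe this image.") -> str:
    # Imported lazily; requires openvino-genai, pillow, and numpy at call time.
    import numpy as np
    import openvino as ov
    import openvino_genai as ov_genai
    from PIL import Image

    # VLMPipeline consumes the image as an OpenVINO tensor of raw RGB pixels.
    image = ov.Tensor(np.array(Image.open(image_path).convert("RGB")))
    pipe = ov_genai.VLMPipeline(model_dir, "CPU")
    return pipe.generate(prompt, image=image, max_new_tokens=100)
```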
open_llama_7b_v2-int4-ov
phi-2-fp16-ov
whisper-large-v3-int4-ov
Phi-3-mini-4k-instruct-fp16-ov
TinyLlama-1.1B-Chat-v1.0-fp16-ov
Qwen3-Embedding-0.6B-fp16-ov
Model creator: Qwen
Original model: Qwen3-Embedding-0.6B

Description
This is the Qwen3-Embedding-0.6B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.4.0 and higher
- Optimum Intel 1.26.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-Embedding-0.6B.
Qwen3-8B-int8-ov
DeepSeek-R1-Distill-Qwen-1.5B-int4-cw-ov
Model creator: DeepSeek
Original model: DeepSeek-R1-Distill-Qwen-1.5B

This is the DeepSeek-R1-Distill-Qwen-1.5B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations.

The original model is distributed under the MIT license. More details can be found in the original model card.
Phi-3-mini-FastDraft-50M-int8-ov
Qwen3-4B-fp16-ov
bert-base-uncased-sst2-unstructured80-int8-ov
Qwen3-1.7B-int4-ov
Model creator: Qwen
Original model: Qwen3-1.7B

Description
This is the Qwen3-1.7B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-1.7B.
Phi-3.5-vision-instruct-fp16-ov
Model creator: Microsoft
Original model: Phi-3.5-vision-instruct

This is the microsoft/Phi-3.5-vision-instruct model converted to the OpenVINO™ IR (Intermediate Representation) format.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.0.0 and higher
- Optimum Intel 1.21.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

The original model is distributed under the MIT license. More details can be found in the original model card.
mistral-7b-instruct-v0.1-int8-ov
pythia-12b-fp16-ov
phi-2-int8-ov
Qwen2.5-7B-Instruct-fp16-ov
DeepSeek-R1-Distill-Qwen-7B-int4-cw-ov
Model creator: DeepSeek
Original model: DeepSeek-R1-Distill-Qwen-7B

This is the DeepSeek-R1-Distill-Qwen-7B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations.

The original model is distributed under the MIT license. More details can be found in the original model card.
Qwen3-4B-int8-ov
gemma-2b-it-int4-ov
Model creator: google
Original model: gemma-2b-it

Description
This is the gemma-2b-it model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2024.5.0 and higher
- Optimum Intel 1.21.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to the OpenVINO Large Language Model Inference Guide.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations.

The original model is distributed under the Gemma license. More details can be found in the original model card.
DeepSeek-R1-Distill-Qwen-14B-int4-ov
mistral-7b-instruct-v0.1-int4-ov
whisper-tiny-int4-ov
InternVL2-2B-int4-ov
open_llama_3b_v2-int4-ov
Qwen2.5-1.5B-Instruct-int8-ov
InternVL2-1B-int4-ov
Qwen3-1.7B-fp16-ov
Model creator: Qwen
Original model: Qwen3-1.7B

Description
This is the Qwen3-1.7B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-1.7B.
Qwen3-14B-int8-ov
gpt-j-6b-int4-cw-ov
Model creator: EleutherAI
Original model: gpt-j-6b

Description
This is the gpt-j-6b model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

> [!NOTE]
> The model is optimized for inference on NPU using these instructions.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Intel® NPU Driver - Windows 32.0.100.4023 for Intel® Core™ Ultra processors and higher

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
persimmon-8b-chat-fp16-ov
Mixtral-8x7B-Instruct-v0.1-int8-ov
Phi-3-mini-4k-instruct-int8-ov
Phi-3-mini-128k-instruct-int8-ov
Phi-3.5-vision-instruct-int8-ov
Qwen3-Embedding-0.6B-int4-cw-ov
open_llama_3b_v2-int8-ov
distil-whisper-large-v3-fp16-ov
whisper-medium-int4-ov
DeepSeek-R1-Distill-Qwen-7B-fp16-ov
Qwen3-14B-int4-ov
Qwen2.5-7B-Instruct-int8-ov
bge-reranker-base-fp16-ov
Phi-4-mini-instruct-int8-ov
Qwen2.5-Coder-1.5B-Instruct-int4-ov
Model creator: Qwen
Original model: Qwen2.5-Coder-1.5B-Instruct

Description
This is the Qwen2.5-Coder-1.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Optimum Intel 1.25.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen2.5-Coder-1.5B-Instruct.
gemma-7b-it-int8-ov
pixtral-12b-fp16-ov
Model creator: mistral-community
Original model: mistral-community/pixtral-12b

This is the mistral-community/pixtral-12b model converted to the OpenVINO™ IR (Intermediate Representation) format.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2.0 and higher
- Optimum Intel 1.26.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
distil-whisper-large-v2-fp16-ov
distil-whisper-large-v3-int4-ov
open_llama_7b_v2-fp16-ov
Qwen3-14B-fp16-ov
gemma-2b-it-int8-ov
Model creator: google
Original model: gemma-2b-it

Description
This is the gemma-2b-it model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2024.5.0 and higher
- Optimum Intel 1.21.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to the OpenVINO Large Language Model Inference Guide.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations.

The original model is distributed under the Gemma license. More details can be found in the original model card.
Qwen2.5-Coder-14B-Instruct-int8-ov
phi-2-int4-ov
Qwen3-pruned-6L-from-0.6B-int8-ov
This is a pruned model originating from Qwen/Qwen3-0.6B. The model was built to accompany Qwen/Qwen3-8B and to be used as a draft model in the context of speculative decoding. The pruning was performed by applying the findings from recent layer-wise pruning research (see one of the relevant publications), followed by accuracy-recovery fine-tuning over synthetic data generated by the target model, Qwen/Qwen3-8B.

Qwen3-pruned-6L-from-0.6B-int8-ov is a model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:
- mode: INT8_ASYM

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.2 and higher
- Optimum Intel 1.25.3 and higher

1. Install packages required for using OpenVINO GenAI with speculative decoding:

3. Run model inference using speculative decoding and specify the pipeline parameters:

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

The model is distributed under the Intel Research Use License Agreement. The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-0.6B.
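The draft-plus-target setup described above can be sketched with OpenVINO GenAI's speculative-decoding support: the draft model is wrapped with `draft_model(...)` and handed to the target `LLMPipeline`. Assumes `pip install openvino-genai` and local copies of both IR folders (paths and the `num_assistant_tokens` value are illustrative, not prescribed by this card).

```python
def speculative_generate(target_dir: str, draft_dir: str, prompt: str) -> str:
    # Imported lazily; requires openvino-genai at call time.
    import openvino_genai as ov_genai

    draft = ov_genai.draft_model(draft_dir, "CPU")
    pipe = ov_genai.LLMPipeline(target_dir, "CPU", draft_model=draft)
    config = ov_genai.GenerationConfig()
    config.num_assistant_tokens = 5  # draft tokens proposed per step (tunable)
    config.max_new_tokens = 128
    return pipe.generate(prompt, config)
```

The target model verifies each batch of draft-proposed tokens in a single pass, so accepted tokens cost one target forward instead of one per token.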
gemma-7b-it-int4-ov
Model creator: Google
Original model: gemma-7b-it

Description
This is the gemma-7b-it model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2024.4.0 and higher
- Optimum Intel 1.20.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to the OpenVINO Large Language Model Inference Guide.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples.

Limitations
Check the original model card for limitations.

The original model is distributed under the Gemma license. More details can be found in the original model card.
Qwen2-1.5B-int8-ov
Qwen3-8B-fp16-ov
whisper-base-int4-ov
Qwen3-1.7B-int8-ov
Model creator: Qwen
Original model: Qwen3-1.7B

Description
This is the Qwen3-1.7B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-1.7B.
distil-whisper-large-v2-int8-ov
zephyr-7b-beta-int4-ov
Qwen3-0.6B-int8-ov
Model creator: Qwen
Original model: Qwen3-0.6B

Description
This is the Qwen3-0.6B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the Apache License, Version 2.0. More details can be found in Qwen3-0.6B.
phi-4-int4-ov
Model creator: microsoft
Original model: phi-4

Description
This is the phi-4 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF.

Weight compression was performed using `nncf.compress_weights` with the following parameters:

For more information on quantization, check the OpenVINO model optimization guide.

The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2025.1.0 and higher
- Optimum Intel 1.24.0 and higher

1. Install packages required for using Optimum Intel integration with the OpenVINO backend:

For more examples and possible optimizations, refer to Inference with Optimum Intel.

1. Install packages required for using OpenVINO GenAI.

More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. You can find more detailed usage examples in OpenVINO Notebooks.

The original model is distributed under the MIT license. More details can be found in phi-4.
falcon-7b-instruct-int4-cw-ov
whisper-small.en-int8-ov
distil-whisper-large-v2-int4-ov
whisper-small-int4-ov
DeepSeek-R1-Distill-Qwen-1.5B-int8-ov
DeepSeek-R1-Distill-Qwen-14B-fp16-ov
Phi-4-reasoning-fp16-ov
Phi-4-reasoning-fp16-ov Model creator: microsoft Original model: Phi-4-reasoning Description This is Phi-4-reasoning model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under MIT license. More details can be found in Phi-4-reasoning. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Phi-4-reasoning-int8-ov
Phi-4-reasoning-int8-ov Model creator: microsoft Original model: Phi-4-reasoning Description This is Phi-4-reasoning model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under MIT license. More details can be found in Phi-4-reasoning. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Phi-4-reasoning-int4-ov
Phi-4-reasoning-int4-ov Model creator: microsoft Original model: Phi-4-reasoning Description This is Phi-4-reasoning model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.1.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under MIT license. More details can be found in Phi-4-reasoning. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
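The cards reference NNCF weight compression but do not list the parameters used, so the call below is a sketch of the general `nncf.compress_weights` pattern rather than the exact recipe behind these IRs; the `mode`, `ratio`, and `group_size` values are illustrative assumptions:

```python
# Sketch of INT4 weight compression with NNCF (pip install nncf).
# The actual parameters used for these published models are not listed in the
# cards; mode/ratio/group_size below are illustrative, not Intel's settings.
def compress_int4(ov_model):
    import nncf

    return nncf.compress_weights(
        ov_model,
        mode=nncf.CompressWeightsMode.INT4_ASYM,  # 4-bit asymmetric weights
        ratio=1.0,       # fraction of eligible layers compressed to INT4
        group_size=128,  # channel group size sharing one scale/zero-point
    )
```

The input is an already-converted `openvino.Model`; the remaining layers (those outside `ratio`) stay in a higher-precision backup format.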
Phi-3-medium-4k-instruct-int4-ov
mistral-7b-instruct-v0.1-fp16-ov
dolly-v2-3b-int4-ov
neural-chat-7b-v1-1-fp16-ov
Phi-3-medium-4k-instruct-fp16-ov
gemma-2-9b-it-int4-ov
InternVL2-1B-fp16-ov
Model creator: OpenGVLab Original model: InternVL2-1B This is OpenGVLab/InternVL2-1B model converted to the OpenVINO™ IR (Intermediate Representation) format. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.0.0 and higher Optimum Intel 1.21.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under MIT license. More details can be found in original model card.
DeepSeek-R1-Distill-Qwen-1.5B-fp16-ov
DeepSeek-R1-Distill-Qwen-7B-int8-ov
Qwen2-7B-Instruct-int4-ov
phi-4-int8-ov
phi-4-int8-ov Model creator: microsoft Original model: phi-4 Description This is phi-4 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.1.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under MIT license. More details can be found in phi-4. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-7B-Instruct-int4-ov
Phi-3-mini-128k-instruct-int4-ov
Phi-4-mini-FastDraft-120M-int8-ov
Qwen2.5-14B-Instruct-fp16-ov
Qwen2.5-14B-Instruct-fp16-ov Model creator: Qwen Original model: Qwen2.5-14B-Instruct Description This is Qwen2.5-14B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.1.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: - LLM - RAG text generation - Convert models from ModelScope to OpenVINO The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-14B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
falcon-7b-instruct-fp16-ov
RedPajama-INCITE-7B-Chat-fp16-ov
Phi-4-mini-instruct-fp16-ov
Qwen2-0.5B-Instruct-int4-ov
Qwen2.5-Coder-7B-Instruct-int8-ov
Qwen2.5-Coder-7B-Instruct-int8-ov Model creator: Qwen Original model: Qwen2.5-Coder-7B-Instruct Description This is Qwen2.5-Coder-7B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-7B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
mixtral-8x7b-instruct-v0.1-int4-ov
DeepSeek-R1-Distill-Qwen-14B-int8-ov
falcon-7b-instruct-int8-ov
gpt-j-6b-int8-ov
pythia-2.8b-fp16-ov
RedPajama-INCITE-Instruct-3B-v1-fp16-ov
RedPajama-INCITE-Instruct-3B-v1-int4-ov
Phi-3.5-mini-instruct-fp16-ov
Qwen2.5-1.5B-Instruct-fp16-ov
Qwen2.5-1.5B-Instruct-fp16-ov Model creator: Qwen Original model: Qwen2.5-1.5B-Instruct Description This is Qwen2.5-1.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.1.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-1.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
phi-4-fp16-ov
phi-4-fp16-ov Model creator: microsoft Original model: phi-4 Description This is phi-4 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.1.0 and higher Optimum Intel 1.24.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under MIT license. More details can be found in phi-4. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-0.5B-Instruct-int8-ov
Qwen2.5-Coder-0.5B-Instruct-int8-ov Model creator: Qwen Original model: Qwen2.5-Coder-0.5B-Instruct Description This is Qwen2.5-Coder-0.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-0.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-1.5B-Instruct-int8-ov
Qwen2.5-Coder-1.5B-Instruct-int8-ov Model creator: Qwen Original model: Qwen2.5-Coder-1.5B-Instruct Description This is Qwen2.5-Coder-1.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-1.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-14B-Instruct-int4-ov
whisper-medium.en-int8-ov
whisper-base.en-int8-ov
codegen25-7b-multi-fp16-ov
starcoder2-15b-int4-ov
Mistral-7B-Instruct-v0.2-fp16-ov
pythia-1b-int4-ov
gemma-2-9b-it-fp16-ov
Phi-3.5-mini-instruct-int8-ov
Qwen2-VL-7B-Instruct-int8-ov
Model creator: Qwen Original model: Qwen/Qwen2-VL-7B-Instruct This is Qwen/Qwen2-VL-7B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
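For vision-language cards like this one, the OpenVINO GenAI route uses `VLMPipeline` rather than `LLMPipeline`. A sketch, assuming `pip install openvino-genai pillow numpy` and a locally downloaded IR directory (directory and file names are assumptions):

```python
# Minimal VLM sketch with OpenVINO GenAI.
# Assumes: pip install openvino-genai pillow numpy; MODEL_DIR is a local copy
# of the IR model (directory name is an assumption).
MODEL_DIR = "Qwen2-VL-7B-Instruct-int8-ov"

def describe(image_path: str, prompt: str = "Describe this image.",
             device: str = "CPU") -> str:
    import numpy as np
    import openvino as ov
    import openvino_genai
    from PIL import Image

    pipe = openvino_genai.VLMPipeline(MODEL_DIR, device)
    # The image is passed as an openvino.Tensor of HWC uint8 pixel data.
    image = ov.Tensor(np.array(Image.open(image_path).convert("RGB")))
    return pipe.generate(prompt, image=image, max_new_tokens=100)
```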
Qwen2-VL-7B-Instruct-fp16-ov
Model creator: Qwen Original model: Qwen/Qwen2-VL-7B-Instruct This is Qwen/Qwen2-VL-7B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-0.5B-Instruct-int4-ov
Qwen2.5-Coder-0.5B-Instruct-int4-ov Model creator: Qwen Original model: Qwen2.5-Coder-0.5B-Instruct Description This is Qwen2.5-Coder-0.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-0.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-14B-Instruct-fp16-ov
codegen2-3_7B_P-int8-ov
llava-v1.6-mistral-7b-hf-int4-ov
Model creator: llava-hf Original model: llava-v1.6-mistral-7b-hf This is llava-hf/llava-v1.6-mistral-7b-hf model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
whisper-medium.en-int4-ov
whisper-medium.en-fp16-ov
mpt-7b-int4-ov
gpt-neox-20b-int8-ov
Phi-3-mini-128k-instruct-fp16-ov
RedPajama-INCITE-7B-Instruct-fp16-ov
RedPajama-INCITE-7B-Instruct-int4-ov
neural-chat-7b-v1-1-int4-ov
gemma-7b-it-fp16-ov
Qwen2-1.5B-fp16-ov
Mistral-7B-Instruct-v0.3-int4-ov
Mistral-7B-Instruct-v0.3-int4-ov Model creator: Mistral AI Original model: Mistral-7B-Instruct-v0.3 This is Mistral-7B-Instruct-v0.3 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
pixtral-12b-int8-ov
Model creator: mistral-community Original model: mistral-community/pixtral-12b This is mistral-community/pixtral-12b model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-1.5B-Instruct-fp16-ov
Qwen2.5-Coder-1.5B-Instruct-fp16-ov Model creator: Qwen Original model: Qwen2.5-Coder-1.5B-Instruct Description This is Qwen2.5-Coder-1.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-1.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-3B-Instruct-fp16-ov
Qwen2.5-Coder-3B-Instruct-fp16-ov Model creator: Qwen Original model: Qwen2.5-Coder-3B-Instruct Description This is Qwen2.5-Coder-3B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-3B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-3B-Instruct-int8-ov
Qwen2.5-Coder-3B-Instruct-int8-ov Model creator: Qwen Original model: Qwen2.5-Coder-3B-Instruct Description This is Qwen2.5-Coder-3B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-3B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-3B-Instruct-int4-ov
Qwen2.5-Coder-7B-Instruct-fp16-ov
Phi-3-mini-FastDraft-50M-int8-sym-ov
whisper-small.en-fp16-ov
starcoder2-7b-int4-ov
starcoder2-7b-fp16-ov
starcoder2-15b-int8-ov
starcoder2-15b-fp16-ov
mpt-7b-fp16-ov
gpt-neox-20b-fp16-ov
open_llama_7b_v2-int8-ov
gpt-j-6b-int4-ov
RedPajama-INCITE-7B-Chat-int4-ov
open_llama_3b_v2-fp16-ov
persimmon-8b-chat-int4-ov
persimmon-8b-chat-int8-ov
gemma-7b-int4-ov
gemma-7b-int8-ov
Qwen2.5-14B-Instruct-int4-ov
Qwen2-0.5B-int8-ov
Qwen2-0.5B-Instruct-fp16-ov
distil-small.en-fp16-ov
Mistral-7B-Instruct-v0.3-int8-ov
Mistral-7B-Instruct-v0.3-int8-ov Model creator: Mistral AI Original model: Mistral-7B-Instruct-v0.3 This is Mistral-7B-Instruct-v0.3 model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2.5-Coder-0.5B-Instruct-fp16-ov
Qwen2.5-Coder-0.5B-Instruct-fp16-ov Model creator: Qwen Original model: Qwen2.5-Coder-0.5B-Instruct Description This is Qwen2.5-Coder-0.5B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.25.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples You can find more detailed usage examples in OpenVINO Notebooks: The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen2.5-Coder-0.5B-Instruct. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Phi-3-mini-4k-instruct-int4-gq-ov
whisper-tiny.en-int8-ov
Qwen3-Reranker-0.6B-seq-cls-fp16-ov
Qwen3-Reranker-0.6B-seq-cls-fp16-ov Model creator: tomaarsen Original model: Qwen3-Reranker-0.6B-seq-cls Description This is Qwen3-Reranker-0.6B-seq-cls model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to FP16. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.4.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the Inference with Optimum Intel. The original model is distributed under Apache License Version 2.0 license. More details can be found in Qwen3-Reranker-0.6B-seq-cls. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
dolly-v2-3b-fp16-ov
dolly-v2-3b-int8-ov
codegen2-3_7B_P-int4-ov
codegen2-3_7B_P-fp16-ov
zephyr-7b-beta-fp16-ov
neural-chat-7b-v3-3-fp16-ov
mpt-7b-int8-ov
dolly-v2-7b-fp16-ov
falcon-7b-instruct-int4-ov
gpt-j-6b-fp16-ov
pythia-2.8b-int8-ov
pythia-6.9b-int8-ov
pythia-6.9b-int4-ov
pythia-6.9b-fp16-ov
neural-chat-7b-v1-1-int8-ov
Phi-3-medium-4k-instruct-int8-ov
gemma-7b-fp16-ov
gemma-7b-fp16-ov Model creator: Google Original model: gemma-7b The provided OpenVINO™ IR model is compatible with: OpenVINO version 2024.4.0 and higher Optimum Intel 1.20.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: For more examples and possible optimizations, refer to the OpenVINO Large Language Model Inference Guide. 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples Check the original model card for limitations. The original model is distributed under the Gemma license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
codegen-6B-multi-int4-ov
codegen-6B-multi-fp16-ov
Qwen2.5-14B-Instruct-int8-ov
Qwen2-0.5B-int4-ov
Qwen2-1.5B-Instruct-fp16-ov
llava-v1.6-mistral-7b-hf-fp16-ov
Model creator: llava-hf Original model: llava-v1.6-mistral-7b-hf This is llava-hf/llava-v1.6-mistral-7b-hf model converted to the OpenVINO™ IR (Intermediate Representation) format. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
pixtral-12b-int4-ov
Model creator: mistral-community Original model: mistral-community/pixtral-12b This is mistral-community/pixtral-12b model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: The original model is distributed under apache-2.0 license. More details can be found in original model card. Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Qwen2-VL-7B-Instruct-int4-ov
Model creator: Qwen Original model: Qwen/Qwen2-VL-7B-Instruct This is Qwen/Qwen2-VL-7B-Instruct model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
neural-chat-7b-v3-3-int8-ov
Mistral-7B-Instruct-v0.2-int8-ov
Phi-3.5-mini-instruct-int4-gq-ov
DeepSeek-R1-Distill-Qwen-7B-nf4-ov
whisper-tiny.en-int4-ov
whisper-small.en-int4-ov
whisper-base.en-fp16-ov
zephyr-7b-beta-int8-ov
notus-7b-v1-fp16-ov
notus-7b-v1-int8-ov
starcoder2-7b-int8-ov
RedPajama-INCITE-Chat-3B-v1-fp16-ov
RedPajama-INCITE-Chat-3B-v1-int8-ov
RedPajama-INCITE-7B-Instruct-int8-ov
RedPajama-INCITE-7B-Chat-int8-ov
dolly-v2-12b-int8-ov
notus-7b-v1-int4-ov
pythia-12b-int8-ov
pythia-2.8b-int4-ov
gemma-2-9b-it-int8-ov
gemma-2b-it-fp16-ov
codegen-6B-multi-int8-ov
bloomz-3b-fp16-ov
bloomz-3b-int8-ov
bloomz-3b-int4-ov
InternVL2-2B-fp16-ov
Qwen2-0.5B-fp16-ov
Qwen2-1.5B-int4-ov
Qwen2-1.5B-Instruct-int4-ov
Qwen2-7B-Instruct-fp16-ov
Qwen2-7B-Instruct-int8-ov
distil-medium.en-fp16-ov
distil-small.en-int8-ov
distil-small.en-int8-ov Model creator: Distil-whisper Original model: distil-small.en Description This is distil-small.en model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.23.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations. The original model is distributed under the MIT license. More details can be found in the original model card.
distil-small.en-int4-ov
distil-small.en-int4-ov Model creator: Distil-whisper Original model: distil-small.en Description This is distil-small.en model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.23.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations. The original model is distributed under the MIT license. More details can be found in the original model card.
distil-medium.en-int8-ov
distil-medium.en-int8-ov Model creator: Distil-whisper Original model: distil-medium.en Description This is distil-medium.en model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.23.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations. The original model is distributed under the MIT license. More details can be found in the original model card.
distil-medium.en-int4-ov
distil-medium.en-int4-ov Model creator: Distil-whisper Original model: distil-medium.en Description This is distil-medium.en model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.23.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. Check the original model card for limitations. The original model is distributed under the MIT license. More details can be found in the original model card.
Mistral-7B-Instruct-v0.3-fp16-ov
Mistral-7B-Instruct-v0.3-fp16-ov Model creator: Mistral AI Original model: Mistral-7B-Instruct-v0.3 This is Mistral-7B-Instruct-v0.3 model converted to the OpenVINO™ IR (Intermediate Representation) format. The provided OpenVINO™ IR model is compatible with: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
InternVL2-4B-int4-ov
InternVL2-4B-fp16-ov
InternVL2-4B-int8-ov
InternVL2-8B-int4-ov
Model creator: OpenGVLab Original model: InternVL2-8B This is OpenGVLab/InternVL2-8B model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 using Activation Aware Quantization (AWQ) by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: mode: INT4_ASYM, ratio: 1.0, group_size: 128, awq: True, dataset: contextual, num_samples: 32. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. The original model is distributed under the MIT license. More details can be found in the original model card.
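The compression settings listed above map directly onto keyword arguments of NNCF's `nncf.compress_weights` API. A minimal sketch of those settings as a parameter dictionary; the real call additionally needs a loaded `ov.Model` and the 32-sample contextual calibration dataset, both omitted here:

```python
# The card's AWQ INT4 settings, expressed as nncf.compress_weights keywords.
# In real code, "mode" would be the enum nncf.CompressWeightsMode.INT4_ASYM
# and the dict would be splatted into nncf.compress_weights(model, **kwargs).
compression_kwargs = {
    "mode": "INT4_ASYM",  # asymmetric 4-bit weight quantization
    "ratio": 1.0,         # compress 100% of eligible weight tensors
    "group_size": 128,    # quantization scales shared per 128-element group
    "awq": True,          # Activation Aware Quantization correction
}
```

With `ratio` below 1.0, the least quantization-sensitive layers get INT4 while the rest stay at a higher precision; this card compresses everything.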
llava-v1.6-mistral-7b-hf-int8-ov
Model creator: llava-hf Original model: llava-v1.6-mistral-7b-hf This is llava-hf/llava-v1.6-mistral-7b-hf model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.2.0 and higher Optimum Intel 1.26.0 and higher 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.
stable-diffusion-v1-5-int8-ov
stable-diffusion-v1-5-fp16-ov
FLUX.1-schnell-int4-ov
Model creator: Black Forest Labs Original model: black-forest-labs/FLUX.1-schnell This is black-forest-labs/FLUX.1-schnell model converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT4 by NNCF. Weight compression was performed using `nncf.compress_weights` with the following parameters: For more information on quantization, check the OpenVINO model optimization guide. The provided OpenVINO™ IR model is compatible with: OpenVINO version 2025.0.0 and higher Optimum Intel 1.22.0 and higher 1. Install packages required for using Optimum Intel integration with the OpenVINO backend: 1. Install packages required for using OpenVINO GenAI. More GenAI usage examples can be found in OpenVINO GenAI library docs and samples. The original model is distributed under the Apache 2.0 license. More details can be found in the original model card.