# DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact

A compact GPTQ Int4/Int8-mix quantization of DeepSeek-R1-0528, by QuantTrio.

- **License:** MIT
- **Type:** Language model
- **Paper:** arXiv:2501.12948 (DeepSeek-R1)
## Code Examples
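A minimal sketch of loading this checkpoint with vLLM's offline Python API. The tensor-parallel size, context length, and sampling settings below are illustrative assumptions, not values from this card; adjust them to your hardware, and apply the gptq_marlin.py patch described below if you are on vllm==0.9.0.

```python
# Hedged sketch: load the quantized checkpoint with vLLM's offline API.
# tensor_parallel_size / max_model_len are assumptions; tune for your GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",
    tensor_parallel_size=8,   # number of GPUs to shard across (assumption)
    max_model_len=32768,      # illustrative context length
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Explain GPTQ quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```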
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the MoE module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:

```text
vllm/model_executor/layers/quantization/gptq_marlin.py
```
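The attached patch itself is not reproduced in this card. As a rough, self-contained illustration of the idea only: resolving a per-layer quantization setting amounts to matching a module's prefix against per-layer overrides in the quantization config before constructing the MoE quant method. Every name in the sketch below (the config layout, the pattern matching, the helper signature) is a hypothetical stand-in, not vLLM code.

```python
# Toy illustration of per-layer quant-config resolution for MoE modules.
# The config layout and helper signature are hypothetical stand-ins; the
# real fix lives in vLLM's gptq_marlin.py and is not reproduced here.
from fnmatch import fnmatch

def get_moe_quant_method(quant_config: dict, prefix: str) -> dict:
    """Return the effective quant settings for the MoE layer at `prefix`."""
    effective = dict(quant_config["default"])          # global GPTQ settings
    for pattern, override in quant_config.get("dynamic", {}).items():
        if fnmatch(prefix, pattern):                   # per-layer override wins
            effective.update(override)
    return effective

quant_config = {
    "default": {"bits": 4, "group_size": 128},                # Int4 base
    "dynamic": {"model.layers.*.mlp.experts*": {"bits": 8}},  # Int8 experts
}
print(get_moe_quant_method(quant_config, "model.layers.3.mlp.experts"))
# -> {'bits': 8, 'group_size': 128}
```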
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:【💡Notes on New VLLM Versions💡】textvllm
</div>
<div style="
background: rgba(255, 0, 200, 0.15);
padding: 16px;
border-radius: 6px;
border: 1px solid rgba(255, 0, 200, 0.3);
margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】
At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.
Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:text
</div>
### 【Model List】
| FILE SIZE | LAST UPDATED |
|-----------|--------------|
| `414GB`   | `2025-06-01` |
### 【Model Download】
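A minimal download sketch using `huggingface_hub`; the repo id below is assumed from the card title, and the local directory is yours to choose:

```python
from huggingface_hub import snapshot_download

# Repo id assumed from the card title; adjust if the hosting repo differs.
snapshot_download(
    repo_id="QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",
    local_dir="./DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",
)
```

At roughly 414GB (per the table above), plan for a resumable transfer and sufficient free disk space before starting.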