# DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact

by QuantTrio · license: mit · language model · paper: arXiv:2501.12948
## Quick Summary

A compact GPTQ-quantized build of DeepSeek-R1-0528 that mixes Int4 and Int8 weights, published by QuantTrio and intended for serving with vLLM.


## Code Examples
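
As a minimal, hedged starting point, the model can be loaded through vLLM's offline API. The repo id below is assumed from the card title, the parallelism and context settings are placeholders to adjust for your hardware, and on `vllm==0.9.0` the patch described further down may also be required.

```python
# Minimal vLLM offline-inference sketch (illustrative settings, not a tested recipe).
from vllm import LLM, SamplingParams

llm = LLM(
    model="QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",  # assumed repo id
    tensor_parallel_size=8,   # placeholder: a DeepSeek-R1-class model spans many GPUs
    max_model_len=8192,       # placeholder context window
    trust_remote_code=True,
)

outputs = llm.generate(
    ["Explain, in one paragraph, what Int4/Int8 mixed GPTQ quantization is."],
    SamplingParams(temperature=0.6, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```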

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">

### 【💡Notes on New VLLM Versions💡】

```text
vllm
```

</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">

### 【💡 Patch for gptq_marlin.py💡】

At present, `vllm==0.9.0` lacks support for per-layer quantization configurations for the `moe` module, which leads to errors when loading the model. I have implemented a simple fix by adding a `get_moe_quant_method` function to the `gptq_marlin.py` file.

Until the PR is merged, please replace the `gptq_marlin.py` file in your installation with the attached version, placing it at:

</div>
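
One way to locate `gptq_marlin.py` inside your vLLM installation is to ask Python for the module's file path, e.g. `python -c "import vllm.model_executor.layers.quantization.gptq_marlin as m; print(m.__file__)"`; the printed file is the one to replace.

To make the note above concrete, here is a rough, self-contained sketch of the *idea* behind a per-layer `get_moe_quant_method` lookup: clone the global quant config, apply any override registered for the layer's prefix (this is how Int8 experts can live inside an otherwise Int4 model), and return the result. Every name in it is hypothetical; it is not vLLM's actual implementation.

```python
# Illustrative only: per-layer quantization-config selection for MoE modules.
# All names here are hypothetical; vLLM's real gptq_marlin.py differs.
from copy import deepcopy
from dataclasses import dataclass, field

@dataclass
class QuantConfig:
    weight_bits: int = 4
    group_size: int = 128
    dynamic: dict = field(default_factory=dict)  # per-layer overrides keyed by module prefix

def get_moe_quant_method(config: QuantConfig, prefix: str) -> QuantConfig:
    """Return the quant config for the module at `prefix`, honoring any override."""
    cfg = deepcopy(config)                     # never mutate the shared global config
    override = config.dynamic.get(prefix, {})  # {} -> fall back to the global settings
    cfg.weight_bits = override.get("bits", cfg.weight_bits)
    cfg.group_size = override.get("group_size", cfg.group_size)
    return cfg

# Example: experts in layer 3 are kept at 8-bit inside an otherwise Int4 model.
base = QuantConfig(dynamic={"model.layers.3.mlp.experts": {"bits": 8}})
print(get_moe_quant_method(base, "model.layers.3.mlp.experts").weight_bits)  # 8
print(get_moe_quant_method(base, "model.layers.0.self_attn").weight_bits)    # 4
```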


Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configurations for the moe module, which will lead to errors when loading the model.
I have implemented a simple fix by adding the get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:
【💡Notes on New VLLM Versions💡】textvllm
</div>

<div style="
    background: rgba(255, 0, 200, 0.15);
    padding: 16px;
    border-radius: 6px;
    border: 1px solid rgba(255, 0, 200, 0.3);
    margin: 16px 0;
">
### 【💡 Patch for gptq_marlin.py💡】

At present, vllm==0.9.0 lacks support for per-layer quantization configuration of the MoE module, which leads to errors when loading this model.
I have implemented a simple fix by adding a get_moe_quant_method function to the gptq_marlin.py file.

Until the PR is merged, please replace the gptq_marlin.py file in your installation with the attached version, placing it at:

```text
vllm/model_executor/layers/quantization/gptq_marlin.py
```
</div>
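
For readers who want the gist of the fix without digging into the attached file, here is a minimal, self-contained sketch of the idea behind a get_moe_quant_method hook. This is an illustration only, not the attached patch and not vLLM's actual API; every class and field name below is hypothetical.

```python
# Sketch only: per-layer quantization resolution for MoE sublayers.
# None of these names come from vLLM; they just model the behavior the
# patch adds, i.e. letting an MoE sublayer pick up its own quantization
# scheme instead of inheriting the global one.
from dataclasses import dataclass, field


@dataclass
class QuantScheme:
    bits: int
    group_size: int


@dataclass
class GPTQConfigSketch:
    default: QuantScheme
    # Maps a layer-name prefix (e.g. "model.layers.3.mlp.experts") to an override.
    per_layer_overrides: dict[str, QuantScheme] = field(default_factory=dict)

    def get_moe_quant_method(self, layer_prefix: str) -> QuantScheme:
        """Return the scheme for an MoE sublayer, honoring per-layer overrides.

        Longest-prefix match, so a narrow override (a single expert block)
        beats a broad one (the whole MoE module).
        """
        best, best_len = self.default, -1
        for prefix, scheme in self.per_layer_overrides.items():
            if layer_prefix.startswith(prefix) and len(prefix) > best_len:
                best, best_len = scheme, len(prefix)
        return best


if __name__ == "__main__":
    cfg = GPTQConfigSketch(
        default=QuantScheme(bits=4, group_size=128),
        per_layer_overrides={
            # Int4/Int8 mix: keep these sublayers at 8 bits, everything else at 4.
            "model.layers.3.mlp.shared_experts": QuantScheme(bits=8, group_size=128),
        },
    )
    print(cfg.get_moe_quant_method("model.layers.3.mlp.shared_experts.gate_proj"))  # 8-bit override
    print(cfg.get_moe_quant_method("model.layers.3.mlp.experts.0.up_proj"))         # 4-bit default
```

If you are unsure where your installation keeps the file to replace, printing the module's path works with any install layout (assuming vllm is importable):

```python
import vllm.model_executor.layers.quantization.gptq_marlin as gm
print(gm.__file__)
```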


### 【Model List】

| FILE SIZE | LATEST UPDATE TIME |
|-----------|--------------------|
| `414GB`   | `2025-06-01`       |



### 【Model Download】
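
A minimal sketch of fetching the weights with huggingface_hub, assuming the repository id QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact (taken from this card's title). Adjust local_dir to taste; per the table above, the full download is roughly 414GB, so check your disk space first.

```python
# Stand-in download sketch; the repo id is inferred from the card title.
# Requires: pip install huggingface_hub
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",
    local_dir="./DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact",
)
```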
