GLM-4.7-NVFP4
License: MIT
Author: Salyut1
Type: Language Model (category: Other)
Size: 2B params
Downloads: 122
Status: New, early-stage
Edge AI: Mobile, Laptop, Server (5GB+ RAM)
Quick Summary
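GLM-4.7-NVFP4 is, going by its name, an NVFP4-quantized build of the GLM-4.7 language model. The code example on this page patches vLLM's glm4_moe.py weight loader so that k_scale/v_scale parameters missing from the model are skipped.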
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 2GB+ RAM
Code Examples
Patching the vLLM model file (python, vllm)

import os
import re

# Path to the vLLM model file
path = '/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/glm4_moe.py'
target_str = 'param = params_dict[name]'

if os.path.exists(path):
    with open(path, 'r') as f:
        lines = f.readlines()

    new_lines = []
    patched = False
    for line in lines:
        # If the previous line is already the injected guard, leave this one alone,
        # so running the script twice does not insert the guard twice
        already_guarded = bool(new_lines) and "'k_scale' in name" in new_lines[-1]
        # We look for the parameter loading line
        if target_str in line and not already_guarded:
            # Reuse the target line's indentation for the injected guard
            whitespace = re.match(r'^(\s*)', line).group(1)
            # Inject logic: if k_scale/v_scale is requested but missing, skip it
            payload = f"{whitespace}if ('k_scale' in name or 'v_scale' in name) and name not in params_dict: continue\n"
            new_lines.append(payload)
            new_lines.append(line)
            patched = True
        else:
            new_lines.append(line)

    if patched:
        with open(path, 'w') as f:
            f.writelines(new_lines)
        print(f"Successfully patched {path}")
    else:
        print("File already patched or target not found.")
else:
    print(f"File not found: {path}")
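To try the patch logic without touching an installed vLLM, the same line rewriting can be sketched as a pure function over a string. This is only an illustration: the helper name `inject_guard` and the sample loader text are hypothetical and are not taken from vLLM's actual source.

```python
import re

TARGET = "param = params_dict[name]"
GUARD_MARK = "'k_scale' in name"


def inject_guard(source: str) -> tuple[str, bool]:
    """Return (patched_source, changed). Hypothetical helper, not part of vLLM."""
    out = []
    changed = False
    for line in source.splitlines(keepends=True):
        # Treat the line as already patched if the guard sits directly above it
        already_guarded = bool(out) and GUARD_MARK in out[-1]
        if TARGET in line and not already_guarded:
            # Match the target line's indentation so the guard stays in the same block
            indent = re.match(r"[ \t]*", line).group(0)
            out.append(
                f"{indent}if ('k_scale' in name or 'v_scale' in name) "
                "and name not in params_dict: continue\n"
            )
            changed = True
        out.append(line)
    return "".join(out), changed


# Made-up stand-in for a weight-loading loop; the real file differs.
sample = (
    "for name, loaded_weight in weights:\n"
    "    param = params_dict[name]\n"
    "    param.data.copy_(loaded_weight)\n"
)

patched, changed = inject_guard(sample)
repatched, changed_again = inject_guard(patched)
print(patched)
print(changed, changed_again)
```

Running the function a second time on its own output leaves the text unchanged, which is the property the on-disk script relies on to be safe to re-run.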