GLM-4.7-NVFP4

Name: GLM-4.7-NVFP4
Author: Salyut1

License: MIT

Language Model

OTHER

2B params

New

122 downloads

Early-stage

Edge AI:

Mobile

Laptop

Server

5GB+ RAM

Quick Summary

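Judging by its name, this is an NVFP4-quantized variant of GLM-4.7, published by Salyut1 under the MIT license. The listing's code example patches vLLM's glm4_moe.py weight loader so that k_scale and v_scale entries missing from the parameter dictionary are skipped instead of being looked up.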

Device Compatibility

Mobile: 4-6GB RAM

Laptop: 16GB RAM

Server: GPU

Minimum Recommended: 2GB+ RAM

Code Examples

Patching the vLLM model file (Python, vLLM)

import os
import re

# Path to the vLLM model file (adjust for your Python version and install location)
path = '/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/glm4_moe.py'

# Line in the weight loader that looks up each parameter by name
target_str = 'param = params_dict[name]'
# Condition of the injected guard; also used to detect a previous run of this script
guard_marker = "('k_scale' in name or 'v_scale' in name) and name not in params_dict"

if os.path.exists(path):
    with open(path, 'r') as f:
        lines = f.readlines()

    new_lines = []
    patched = False

    for line in lines:
        # A target line is already guarded if the line just before it is the guard
        already_guarded = bool(new_lines) and guard_marker in new_lines[-1]
        if target_str in line and not already_guarded:
            # Reuse the target line's indentation so the guard sits in the same block
            whitespace = re.match(r'^(\s*)', line).group(1)

            # Inject logic: if asking for k_scale/v_scale and it's missing, skip
            payload = f"{whitespace}if {guard_marker}: continue\n"

            new_lines.append(payload)
            new_lines.append(line)
            patched = True
        else:
            new_lines.append(line)

    if patched:
        with open(path, 'w') as f:
            f.writelines(new_lines)
        print(f"Successfully patched {path}")
    else:
        print("File already patched or target not found.")
else:
    print(f"File not found: {path}")
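The script above rewrites the installed file in place, so it can help to try the same guard insertion on a string first. The following is a minimal sketch under stated assumptions: `insert_scale_guard` is a hypothetical helper name, and `sample` is a made-up stand-in for the loader loop, not code copied from vLLM. It checks the preceding line for the guard, so running it on already patched text adds nothing.

```python
import re

# Line the patch looks for, and the guard condition it inserts before it.
TARGET = "param = params_dict[name]"
GUARD = "('k_scale' in name or 'v_scale' in name) and name not in params_dict"


def insert_scale_guard(source: str) -> tuple[str, int]:
    """Return (patched_source, number_of_guards_inserted)."""
    out = []
    inserted = 0
    for line in source.splitlines(keepends=True):
        # Skip lines whose immediate predecessor is already the guard.
        guarded = bool(out) and GUARD in out[-1]
        if TARGET in line and not guarded:
            # Match the target line's indentation (spaces or tabs only).
            indent = re.match(r"[ \t]*", line).group(0)
            out.append(f"{indent}if {GUARD}: continue\n")
            inserted += 1
        out.append(line)
    return "".join(out), inserted


# Hypothetical stand-in for the loader loop (an assumption, not vLLM source).
sample = (
    "for name, loaded_weight in weights:\n"
    "    param = params_dict[name]\n"
)

patched, count = insert_scale_guard(sample)
print(patched)

# A second pass over the patched text should insert no further guards.
again, count_again = insert_scale_guard(patched)
print(count, count_again)
```

Keeping the transformation separate from file I/O also makes it straightforward to write the result to a copy of glm4_moe.py and diff it against the original before replacing the installed file.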
