GLM-5-NVFP4
by nvidia · License: MIT
Language Model · 5B params · 139.6K downloads
Production-ready
Edge AI: Mobile · Laptop · Server (12GB+ RAM)
Quick Summary
An NVFP4 (4-bit floating-point) quantized build of GLM-5 published by NVIDIA, intended for efficient inference on memory-constrained hardware.
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 5GB+ RAM
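Rough weight-memory arithmetic behind these figures (a sketch only; KV cache, activations, and runtime overhead come on top of the weight footprint):

```python
# Rough weight-memory estimate for a 4-bit (NVFP4) quantized model.
# Assumption: weights dominate; KV cache and activations add overhead on top.
params = 5e9            # 5B parameters, per the model card
bits_per_param = 4      # NVFP4 stores weights in 4-bit floating point
weight_bytes = params * bits_per_param / 8
print(f"{weight_bytes / 1e9:.1f} GB for weights")  # -> 2.5 GB for weights
```

This is why the quantized model fits in the 4-6GB mobile tier above, where a 16-bit copy of the same weights would need roughly 10GB.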
Code Examples

Usage (vLLM):

```bash
vllm serve nvidia/GLM-5-NVFP4 \
  --tensor-parallel-size 8 \
  --trust-remote-code \
  --enable-auto-tool-choice \
  --tool-call-parser glm47 \
  --reasoning-parser glm45 \
  --enable-chunked-prefill \
  --max-num-batched-tokens 131072 \
  --gpu-memory-utilization 0.80
```
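The vLLM command above starts an OpenAI-compatible HTTP server. A minimal client sketch, assuming vLLM's default port 8000 (the request payload is built and printed here; the actual POST is left commented out so the snippet runs without a live server):

```python
# Minimal OpenAI-compatible chat request for the locally served model.
# Assumption: the server listens on localhost:8000 (vLLM's default port).
import json
import urllib.request

payload = {
    "model": "nvidia/GLM-5-NVFP4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```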
Usage (SGLang):

```bash
python3 -m sglang.launch_server \
  --model nvidia/GLM-5-NVFP4 \
  --tensor-parallel-size 8 \
  --quantization modelopt_fp4 \
  --tool-call-parser glm47 \
  --reasoning-parser glm45 \
  --trust-remote-code \
  --chunked-prefill-size 131072 \
  --mem-fraction-static 0.80
```

Deploy This Model
Production-ready deployment in minutes
Together.ai: instant API access to this model. Production-ready inference API; start free, scale to millions.
Replicate: one-click model deployment. Run models in the cloud with a simple API; no DevOps required.
Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.