GLM-4.5-GGUF
305
6
355B
Q8
license:mit
by AesSedai
Language Model
OTHER
355B params
New
305 downloads
Early-stage
Edge AI:
Mobile
Laptop
Server
11GB+ RAM
Quick Summary
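GGUF quantizations of GLM-4.5, published by AesSedai under the MIT license. Each quant is built from a per-tensor recipe, such as the IQ2_KT mix listed under Code Examples.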
Device Compatibility
Mobile
4-6GB RAM
Laptop
16GB RAM
Server
GPU
Minimum Recommended
5GB+ RAM
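The RAM figures above can be sanity-checked against the one size this page actually states: the IQ2_KT quant is listed at 109.269 GiB and 2.619 bits per weight (BPW). The sketch below is plain arithmetic on those two numbers. It assumes the file size is dominated by weights (metadata is ignored), and the helper names are hypothetical, chosen only for illustration.

```python
GIB = 1024 ** 3  # bytes per GiB


def implied_weight_count(size_gib: float, bits_per_weight: float) -> float:
    """Estimate how many weights a file holds from its size and average BPW."""
    total_bits = size_gib * GIB * 8
    return total_bits / bits_per_weight


def size_gib_at(weights: float, bits_per_weight: float) -> float:
    """Estimate the file size in GiB for a given weight count and average BPW."""
    return weights * bits_per_weight / 8 / GIB


# Figures quoted for the IQ2_KT quant on this page.
iq2_kt_size_gib = 109.269
iq2_kt_bpw = 2.619

weights = implied_weight_count(iq2_kt_size_gib, iq2_kt_bpw)
print(f"implied weights: {weights / 1e9:.0f} billion")
print(f"weights alone need at least {iq2_kt_size_gib:.0f} GiB of RAM and/or VRAM")
```

Under these assumptions the estimate comes out at roughly 358 billion weights, so the IQ2_KT file alone needs on the order of 109 GiB of combined memory before any context cache is allocated.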
Code Examples
IQ2_KT: 109.269 GiB (2.619 BPW): Lost the results somewhere, oops.
# 93 Repeating Layers [0-92]
# Attention
blk\..*\.attn_q.*=iq4_k
blk\..*\.attn_k.*=iq6_k
blk\..*\.attn_v.*=iq6_k
blk\..*\.attn_output.*=iq5_ks
# First 3 Dense Layers [0-2]
blk\..*\.ffn_down\.weight=iq4_ks
blk\..*\.ffn_(gate|up)\.weight=iq3_ks
# Shared Expert Layers [3-92]
blk\..*\.ffn_down_shexp\.weight=iq6_k
blk\..*\.ffn_(gate|up)_shexp\.weight=iq6_k
# Routed Experts Layers [3-92]
blk\..*\.ffn_down_exps\.weight=iq3_kt
blk\..*\.ffn_(gate|up)_exps\.weight=iq2_kt
# NextN MTP Layer [92]
blk\..*\.nextn\.embed_tokens\.weight=iq4_k
blk\..*\.nextn\.shared_head_head\.weight=iq6_k
blk\..*\.nextn\.eh_proj\.weight=iq6_k
# Non-Repeating Layers
token_embd\.weight=iq4_k
output\.weight=iq6_k
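Each line of the recipe above pairs a regular expression over tensor names with a quantization type. The sketch below shows one way to read it: parse the rules, then look up the type for a given tensor. It assumes the first matching rule wins and that patterns are searched within the name rather than anchored; the actual quantizer's matching behavior is not stated on this page. The sample tensor names follow the usual GGUF naming pattern and are illustrative only, and `parse_recipe`, `quant_for`, and `to_single_line` are hypothetical helpers.

```python
import re

RECIPE = r"""
# Attention
blk\..*\.attn_q.*=iq4_k
blk\..*\.attn_k.*=iq6_k
blk\..*\.attn_v.*=iq6_k
blk\..*\.attn_output.*=iq5_ks
# First 3 Dense Layers [0-2]
blk\..*\.ffn_down\.weight=iq4_ks
blk\..*\.ffn_(gate|up)\.weight=iq3_ks
# Shared Expert Layers [3-92]
blk\..*\.ffn_down_shexp\.weight=iq6_k
blk\..*\.ffn_(gate|up)_shexp\.weight=iq6_k
# Routed Experts Layers [3-92]
blk\..*\.ffn_down_exps\.weight=iq3_kt
blk\..*\.ffn_(gate|up)_exps\.weight=iq2_kt
# NextN MTP Layer [92]
blk\..*\.nextn\.embed_tokens\.weight=iq4_k
blk\..*\.nextn\.shared_head_head\.weight=iq6_k
blk\..*\.nextn\.eh_proj\.weight=iq6_k
# Non-Repeating Layers
token_embd\.weight=iq4_k
output\.weight=iq6_k
"""


def parse_recipe(text):
    """Return (compiled_pattern, quant_type) pairs, skipping comments and blanks."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, quant = line.rsplit("=", 1)
        rules.append((re.compile(pattern), quant))
    return rules


def quant_for(tensor_name, rules):
    """Return the quant type of the first rule that matches (assumed semantics)."""
    for pattern, quant in rules:
        if pattern.search(tensor_name):
            return quant
    return None  # no rule applies; a real tool would fall back to its default


def to_single_line(rules):
    """Join the rules into one comma-separated string, comments removed."""
    return ",".join(f"{pattern.pattern}={quant}" for pattern, quant in rules)


rules = parse_recipe(RECIPE)
for name in (
    "blk.0.ffn_down.weight",        # dense feed-forward tensor
    "blk.10.ffn_gate_exps.weight",  # routed-expert tensor
    "blk.5.attn_output.weight",     # attention output projection
    "output.weight",                # non-repeating output tensor
):
    print(name, "->", quant_for(name, rules))
```

The rules rely on tensor-name suffixes rather than layer indices to separate dense, shared-expert, and routed-expert weights: `ffn_down\.weight` cannot match `ffn_down_exps.weight`, so the `[0-2]` and `[3-92]` ranges in the comments describe where those tensors occur, not what the patterns filter on. Most of the file size comes from the routed experts, which is why they get the smallest types (`iq2_kt` and `iq3_kt`).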