GigaChat3-10B-A1.8B-GGUF

by ubergarm · Language Model · 10B params · ik_llama.cpp
Quick Summary

GGUF quantizations of GigaChat3-10B-A1.8B, a 10B-parameter mixture-of-experts language model with roughly 1.8B active parameters per token, prepared by ubergarm for ik_llama.cpp.

Device Compatibility

| Device | Requirement |
|--------|-------------|
| Mobile | 4-6GB RAM   |
| Laptop | 16GB RAM    |
| Server | GPU         |

Minimum recommended: 10GB+ RAM
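As a rough sanity check, on-disk size can be estimated as parameter count times bits per weight. A sketch using figures from this card (the 10B parameter count is rounded, so the result only approximates the 7.598 GiB listed below for the IQ5_K quant):

```shell
# Estimate quantized file size: params x bits-per-weight / 8 bytes, in GiB.
# 10.0e9 is the rounded parameter count from this card; 6.115 BPW is the IQ5_K figure.
awk 'BEGIN {
  params = 10.0e9
  bpw    = 6.115
  bytes  = params * bpw / 8
  printf "%.3f GiB\n", bytes / (1024 ^ 3)
}'
# -> 7.119 GiB (close to the listed 7.598 GiB; the gap comes from the rounded count)
```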

Code Examples

Q8_0

```bash
#!/usr/bin/env bash

./build/bin/llama-quantize \
    --pure \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-BF16.gguf \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-Q8_0.gguf \
    Q8_0 \
    128
```
IQ5_K 7.598 GiB (6.115 BPW)

```bash
#!/usr/bin/env bash

custom="
## Attention [0-25] (GPU)
blk\..*\.attn.*\.weight=q8_0

## First Single Dense Layer [0] (GPU)
blk\..*\.ffn_down\.weight=q8_0
blk\..*\.ffn_(gate|up)\.weight=q8_0

## Shared Expert [1-25] (GPU)
blk\..*\.ffn_down_shexp\.weight=q8_0
blk\..*\.ffn_(gate|up)_shexp\.weight=q8_0

## Routed Experts [1-25] (CPU)
blk\..*\.ffn_down_exps\.weight=iq6_k
blk\..*\.ffn_(gate|up)_exps\.weight=iq5_k

token_embd\.weight=iq6_k
output\.weight=iq6_k
"

custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)

./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/imatrix-GigaChat3-10B-A1.8B-BF16.dat \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-BF16.gguf \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-IQ5_K.gguf \
    IQ5_K \
    64
```
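The grep/sed pipeline in the recipe strips the `##` comment lines and joins the remaining rules into the comma-separated string `--custom-q` expects. A minimal standalone sketch of the same transformation, using two toy rules (GNU sed assumed for the `-z` null-data mode):

```shell
# Toy recipe: two rules plus a comment line, mirroring the structure above
recipe="
## a comment line that grep -v '^#' removes
token_embd.weight=iq6_k

output.weight=iq6_k
"

# Drop comments, collapse runs of newlines into commas, trim stray commas
flat=$(echo "$recipe" | grep -v '^#' | sed -Ez 's:\n+:,:g;s:,$::;s:^,::')
echo "$flat"
# -> token_embd.weight=iq6_k,output.weight=iq6_k
```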
Quick Start

```bash
# Example running on mainline llama.cpp CPU-only
# (model path matches the IQ5_K quant produced above)
model=/mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-IQ5_K.gguf

./build/bin/llama-server \
    --model "$model" \
    --alias ubergarm/GigaChat3-10B-A1.8B-GGUF \
    --ctx-size 32768 \
    --parallel 1 \
    --threads 8 \
    --host 127.0.0.1 \
    --port 8080 \
    --no-mmap

# For full offload onto GPU, add -ngl 99 and set --threads 1
```
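Once running, llama-server exposes an OpenAI-compatible HTTP API on the host/port set above. A hedged smoke test, assuming llama.cpp's default endpoint paths (`/health`, `/v1/chat/completions`); the prompt is arbitrary:

```shell
# Only send a request if the server is actually listening
if curl -sf http://127.0.0.1:8080/health > /dev/null; then
  curl -s http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ubergarm/GigaChat3-10B-A1.8B-GGUF",
         "messages": [{"role": "user", "content": "Hello"}]}'
fi
```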
