GigaChat3-10B-A1.8B-GGUF

by ubergarm · Language Model · 10B params · ik_llama.cpp
Quick Summary

GGUF quantizations of GigaChat3-10B-A1.8B, a 10B-parameter mixture-of-experts language model with roughly 1.8B active parameters per token, prepared by ubergarm for ik_llama.cpp.

Device Compatibility

| Device | Requirement |
|--------|-------------|
| Mobile | 4-6GB RAM   |
| Laptop | 16GB RAM    |
| Server | GPU         |

Minimum recommended: 10GB+ RAM
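As a rough sanity check, on-disk size can be estimated as parameter count times bits per weight. A sketch using figures from this card (the 10B parameter count is rounded, so the result only approximates the 7.598 GiB listed below for the IQ5_K quant):

```shell
# Estimate quantized file size: params x bits-per-weight / 8 bytes, in GiB.
# 10.0e9 is the rounded parameter count from this card; 6.115 BPW is the IQ5_K figure.
awk 'BEGIN {
  params = 10.0e9
  bpw    = 6.115
  bytes  = params * bpw / 8
  printf "%.3f GiB\n", bytes / (1024 ^ 3)
}'
# -> 7.119 GiB (close to the listed 7.598 GiB; the gap comes from the rounded count)
```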

Code Examples

Q8_0

```bash
#!/usr/bin/env bash

./build/bin/llama-quantize \
    --pure \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-BF16.gguf \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-Q8_0.gguf \
    Q8_0 \
    128
```
IQ5_K 7.598 GiB (6.115 BPW)

```bash
#!/usr/bin/env bash

custom="
## Attention [0-25] (GPU)
blk\..*\.attn.*\.weight=q8_0

## First Single Dense Layer [0] (GPU)
blk\..*\.ffn_down\.weight=q8_0
blk\..*\.ffn_(gate|up)\.weight=q8_0

## Shared Expert [1-25] (GPU)
blk\..*\.ffn_down_shexp\.weight=q8_0
blk\..*\.ffn_(gate|up)_shexp\.weight=q8_0

## Routed Experts [1-25] (CPU)
blk\..*\.ffn_down_exps\.weight=iq6_k
blk\..*\.ffn_(gate|up)_exps\.weight=iq5_k

token_embd\.weight=iq6_k
output\.weight=iq6_k
"

custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)

./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/imatrix-GigaChat3-10B-A1.8B-BF16.dat \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-BF16.gguf \
    /mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-IQ5_K.gguf \
    IQ5_K \
    64
```
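The grep/sed pipeline in the recipe strips the `##` comment lines and joins the remaining rules into the comma-separated string `--custom-q` expects. A minimal standalone sketch of the same transformation, using two toy rules (GNU sed assumed for the `-z` null-data mode):

```shell
# Toy recipe: two rules plus a comment line, mirroring the structure above
recipe="
## a comment line that grep -v '^#' removes
token_embd.weight=iq6_k

output.weight=iq6_k
"

# Drop comments, collapse runs of newlines into commas, trim stray commas
flat=$(echo "$recipe" | grep -v '^#' | sed -Ez 's:\n+:,:g;s:,$::;s:^,::')
echo "$flat"
# -> token_embd.weight=iq6_k,output.weight=iq6_k
```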
Quick Start

```bash
# Example running on mainline llama.cpp CPU-only
# (model path matches the IQ5_K quant produced above)
model=/mnt/data/models/ubergarm/GigaChat3-10B-A1.8B-GGUF/GigaChat3-10B-A1.8B-IQ5_K.gguf

./build/bin/llama-server \
    --model "$model" \
    --alias ubergarm/GigaChat3-10B-A1.8B-GGUF \
    --ctx-size 32768 \
    --parallel 1 \
    --threads 8 \
    --host 127.0.0.1 \
    --port 8080 \
    --no-mmap

# For full offload onto GPU, add -ngl 99 and set --threads 1
```
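Once running, llama-server exposes an OpenAI-compatible HTTP API on the host/port set above. A hedged smoke test, assuming llama.cpp's default endpoint paths (`/health`, `/v1/chat/completions`); the prompt is arbitrary:

```shell
# Only send a request if the server is actually listening
if curl -sf http://127.0.0.1:8080/health > /dev/null; then
  curl -s http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ubergarm/GigaChat3-10B-A1.8B-GGUF",
         "messages": [{"role": "user", "content": "Hello"}]}'
fi
```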
