LongCat-Flash-Lite-FP8

1
license:mit
by
meituan-longcat
Language Model
OTHER
New
0 downloads
Early-stage
Edge AI:
Mobile
Laptop
Server
Unknown
Mobile
Laptop
Server
Quick Summary

AI model with specialized capabilities.

Code Examples

bash
pip install -U transformers==4.57.6 accelerate==1.10.0
bash
cd sgl-kernel
python3 -m uv build --wheel --color=always --no-build-isolation \
        -Ccmake.define.SGL_KERNEL_ENABLE_SM90A=1 \
        -Ccmake.define.CMAKE_POLICY_VERSION_MINIMUM=3.5 \
        -Cbuild-dir=build .
pip3 install dist/sgl_kernel-0.3.21-cp310-abi3-linux_x86_64.whl --force-reinstall
python
python3 -m sglang.launch_server \
    --model meituan-longcat/LongCat-Flash-Lite \
    --port 8080 \
    --host 0.0.0.0 \
    --mem-fraction-static 0.9 \
    --max-running-requests 64 \
    --trust-remote-code \
    --skip-server-warmup \
    --attention-backend flashinfer \
    --ep 8 \
    --tp 8 \
    --disable-cuda-graph

Deploy This Model

Production-ready deployment in minutes

Together.ai

Instant API access to this model

Fastest API

Production-ready inference API. Start free, scale to millions.

Try Free API

Replicate

One-click model deployment

Easiest Setup

Run models in the cloud with simple API. No DevOps required.

Deploy Now

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.