LongCat-Flash-Lite-FP8

Name: LongCat-Flash-Lite-FP8
Author: meituan-longcat

license:mit

meituan-longcat

Language Model

OTHER

New

0 downloads

Early-stage

Try on Hugging Face Add to Compare

Edge AI:

Mobile

Laptop

Server

Unknown

Mobile

Laptop

Server

Quick Summary

AI model with specialized capabilities.

Code Examples

bash

pip install -U transformers==4.57.6 accelerate==1.10.0

bash

cd sgl-kernel
python3 -m uv build --wheel --color=always --no-build-isolation \
        -Ccmake.define.SGL_KERNEL_ENABLE_SM90A=1 \
        -Ccmake.define.CMAKE_POLICY_VERSION_MINIMUM=3.5 \
        -Cbuild-dir=build .
pip3 install dist/sgl_kernel-0.3.21-cp310-abi3-linux_x86_64.whl --force-reinstall

python

python3 -m sglang.launch_server \
    --model meituan-longcat/LongCat-Flash-Lite \
    --port 8080 \
    --host 0.0.0.0 \
    --mem-fraction-static 0.9 \
    --max-running-requests 64 \
    --trust-remote-code \
    --skip-server-warmup \
    --attention-backend flashinfer \
    --ep 8 \
    --tp 8 \
    --disable-cuda-graph

Deploy This Model

Production-ready deployment in minutes

Together.ai

Instant API access to this model

Fastest API

Production-ready inference API. Start free, scale to millions.

Try Free API

Replicate

One-click model deployment

Easiest Setup

Run models in the cloud with simple API. No DevOps required.

Deploy Now

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.