HyperCLOVAX-SEED-Vision-Instruct-3B
by naver-hyperclovax

Language Model · OTHER license · 3.0B params · 57.9K downloads · 212 likes · Rating: Fair · Community-tested
Edge AI: Mobile · Laptop · Server (7GB+ RAM)
Quick Summary

HyperCLOVAX-SEED-Vision-Instruct-3B is a 3.0B-parameter instruction-tuned vision-language model from naver-hyperclovax. It accepts image and video inputs alongside text, and is small enough to target edge devices.

Device Compatibility

- Mobile: 4-6GB RAM
- Laptop: 16GB RAM
- Server: GPU
- Minimum recommended: 3GB+ RAM
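The RAM figures above line up with a back-of-the-envelope estimate: at inference time the weights dominate memory, with KV cache and activations adding overhead on top. A rough sketch (the per-dtype byte counts are standard; the model size comes from the listing, everything else is illustrative):

```python
# Rough inference-memory estimate for a 3.0B-parameter model.
# Weights alone; KV cache and activations add overhead on top,
# which is why the listing suggests 7GB+ RAM for float16.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """GiB needed just to hold the weights."""
    return n_params * bytes_per_param / 1024**3

N_PARAMS = 3.0e9

for dtype, nbytes in [("float16", 2), ("int8", 1), ("int4 (packed)", 0.5)]:
    gb = weight_memory_gb(N_PARAMS, nbytes)
    print(f"{dtype:>14}: ~{gb:.1f} GiB for weights alone")
```

At float16 the weights alone are ~5.6 GiB (hence 7GB+ with overhead), while a 4-bit quantization brings them near ~1.4 GiB, consistent with the 4-6GB mobile tier.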

Code Examples

# clone vLLM first; the editable install below runs from inside the source tree
git clone https://github.com/vllm-project/vllm.git
cd vllm
pyenv virtualenv 3.10.2 .vllm
pyenv activate .vllm
sudo apt-get install -y kmod
pip install --upgrade setuptools wheel pip
pip install setuptools_scm

# install latest commit (e.g. v0.9.0)
VLLM_USE_PRECOMPILED=1 pip install -e .[serve] --cache-dir=/mnt/tmp
pip install -U pynvml
pip install timm av decord

# or install previous commit (e.g. v0.8.4)
pip install -r ./requirements/build.txt
pip install -r ./requirements/common.txt
pip install -r ./requirements/cuda.txt
pip install flash_attn==2.7.4.post1
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
export VLLM_COMMIT=dc1b4a6f1300003ae27f033afbdff5e2683721ce
export VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
VLLM_USE_PRECOMPILED=1 pip install -e .[serve] --cache-dir=/mnt/tmp
pip install -U pynvml
pip install timm av decord
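Either install path can be sanity-checked before launching the server. A small sketch that reports which of the packages installed above are importable (it only checks imports, not GPU setup; the package list mirrors the pip commands):

```python
import importlib.util

def check(pkgs):
    """Return {package: importable?} without actually importing anything heavy."""
    return {p: importlib.util.find_spec(p) is not None for p in pkgs}

# Packages from the install steps: timm/av/decord back image and video IO,
# pynvml is used for GPU introspection.
status = check(["vllm", "timm", "av", "decord", "pynvml"])
for pkg, ok in status.items():
    print(f"{pkg}: {'OK' if ok else 'MISSING'}")
```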

# Then launch the OpenAI-compatible API server.
# Set these placeholders for your environment first:
MODEL=your/model/path
PORT=8000
ALLOWED_LOCAL_MEDIA_PATH=/path/to/local/media
DOWNLOAD_DIR=/path/to/download/cache
export ATTENTION_BACKEND=FLASH_ATTN_VLLM_V1
VLLM_USE_V1=1 VLLM_ATTENTION_BACKEND=${ATTENTION_BACKEND} CUDA_VISIBLE_DEVICES=0,1 python -m vllm.entrypoints.openai.api_server \
    --seed 20250525 \
    --port ${PORT} \
    --allowed-local-media-path $ALLOWED_LOCAL_MEDIA_PATH \
    --max-model-len 8192 \
    --max-num-batched-tokens 8192 \
    --max-num-seqs 128 \
    --max-parallel-loading-workers 128 \
    --limit-mm-per-prompt.image="32" \
    --limit-mm-per-prompt.video="32" \
    --max-num-frames 256 \
    --tensor-parallel-size 1 \
    --data-parallel-size 1 \
    --model ${MODEL} \
    --dtype float16 \
    --trust-remote-code \
    --chat-template-content-format "openai" \
    --download-dir $DOWNLOAD_DIR
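Once the server is up, it exposes the standard OpenAI-compatible chat-completions endpoint. A minimal client sketch, assuming the placeholder port and model path from the launch command above (the image URL is purely illustrative):

```python
import json
from urllib import request

def build_chat_payload(model: str, image_url: str, question: str) -> dict:
    """OpenAI-style chat payload carrying one image plus a text question."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": question},
            ],
        }],
        "max_tokens": 256,
    }

def ask(base_url: str, payload: dict) -> str:
    """POST to the vLLM server and return the model's reply text."""
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    payload = build_chat_payload(
        "your/model/path",              # same path passed to --model
        "https://example.com/cat.jpg",  # placeholder image URL
        "Describe this image.",
    )
    # Requires the server launched above, e.g.:
    # print(ask("http://localhost:8000", payload))
```

Video inputs follow the same content-part pattern, subject to the `--limit-mm-per-prompt` caps set at launch.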

Deploy This Model

Production-ready deployment in minutes:

- Together.ai (fastest API) - instant API access to this model. Production-ready inference API; start free, scale to millions.
- Replicate (easiest setup) - one-click model deployment. Run models in the cloud with a simple API; no DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.