HyperCLOVAX-SEED-Vision-Instruct-3B
by naver-hyperclovax
Language Model · 3B params · 58K downloads · Community-tested
Edge AI: runs on mobile, laptop, or server (7GB+ RAM recommended)
Quick Summary
HyperCLOVAX-SEED-Vision-Instruct-3B is a 3B-parameter instruction-tuned vision-language model from NAVER that accepts text, image, and video inputs.
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 3GB+ RAM
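The RAM tiers above roughly track the raw weight size. A back-of-envelope estimate, assuming fp16/bf16 weights (2 bytes per parameter):

```python
# Rough memory estimate for a 3B-parameter model (weights only;
# excludes KV cache, activations, and runtime overhead).
params = 3.0e9          # 3B parameters
bytes_per_param = 2     # fp16/bf16
weights_gib = params * bytes_per_param / 1024**3
print(f"{weights_gib:.1f} GiB")  # ~5.6 GiB, consistent with the 7GB+ RAM guidance
```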
Code Examples
pyenv virtualenv 3.10.2 .vllm
pyenv activate .vllm
sudo apt-get install -y kmod
pip install --upgrade setuptools wheel pip
pip install setuptools_scm
# the editable install below expects the vLLM source tree
git clone https://github.com/vllm-project/vllm.git
cd vllm
# install latest commit (e.g. v0.9.0)
VLLM_USE_PRECOMPILED=1 pip install -e .[serve] --cache-dir=/mnt/tmp
pip install -U pynvml
pip install timm av decord
# or install previous commit (e.g. v0.8.4)
pip install -r ./requirements/build.txt
pip install -r ./requirements/common.txt
pip install -r ./requirements/cuda.txt
pip install flash_attn==2.7.4.post1
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
export VLLM_COMMIT=dc1b4a6f1300003ae27f033afbdff5e2683721ce
export VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
git checkout ${VLLM_COMMIT}  # match the source tree to the precompiled wheel
VLLM_USE_PRECOMPILED=1 pip install -e .[serve] --cache-dir=/mnt/tmp
pip install -U pynvml
pip install timm av decord
# Then launch api
MODEL=your/model/path
PORT=8000  # any free port; referenced below
ALLOWED_LOCAL_MEDIA_PATH=/path/to/media  # placeholder: directory the server may read local media from
export ATTENTION_BACKEND=FLASH_ATTN_VLLM_V1
VLLM_USE_V1=1 VLLM_ATTENTION_BACKEND=${ATTENTION_BACKEND} CUDA_VISIBLE_DEVICES=0,1 python -m vllm.entrypoints.openai.api_server \
--seed 20250525 \
--port ${PORT} \
--allowed-local-media-path $ALLOWED_LOCAL_MEDIA_PATH \
--max-model-len 8192 \
--max-num-batched-tokens 8192 \
--max-num-seqs 128 \
--max-parallel-loading-workers 128 \
--limit-mm-per-prompt.image="32" \
--limit-mm-per-prompt.video="32" \
--max-num-frames 256 \
--tensor-parallel-size 1 \
--data-parallel-size 1 \
--model ${MODEL} \
--dtype float16 \
--trust-remote-code \
--chat-template-content-format "openai" \
--download-dir $DOWNLOAD_DIR
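Once the server is up, it exposes an OpenAI-compatible /v1/chat/completions endpoint. A minimal request-payload sketch — the model path, image URL, and port are placeholders, not values from this card:

```python
import json

# Hypothetical placeholders: substitute your served model path and a real image URL.
payload = {
    "model": "your/model/path",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }],
    "max_tokens": 256,
}
# POST this JSON to http://localhost:<PORT>/v1/chat/completions
print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at the local base URL) can send the same structure.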