Solar-Open-100B-NotaMoEQuant-Int4
1
—
by
nota-ai
Language Model
OTHER
100B params
New
0 downloads
Early-stage
Edge AI:
Mobile
Laptop
Server
224GB+ RAM
Mobile
Laptop
Server
Quick Summary
AI model with specialized capabilities.
Device Compatibility
Mobile
4-6GB RAM
Laptop
16GB RAM
Server
GPU
Minimum Recommended
94GB+ RAM
Code Examples
Inferencebash
pip install -U transformers kernels torch accelerate auto-round==0.8.0vLLMbashvllm
VLLM_PRECOMPILED_WHEEL_LOCATION="https://github.com/vllm-project/vllm/releases/download/v0.12.0/vllm-0.12.0-cp38-abi3-manylinux_2_31_x86_64.whl" \
VLLM_USE_PRECOMPILED=1 \
uv pip install git+https://github.com/UpstageAI/[email protected]bashvllm
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
vllm serve nota-ai/Solar-Open-100B-NotaMoEQuant-Int4 \
--trust-remote-code \
--enable-auto-tool-choice \
--tool-call-parser solar_open \
--reasoning-parser solar_open \
--logits-processors vllm.model_executor.models.parallel_tool_call_logits_processor:ParallelToolCallLogitsProcessor \
--logits-processors vllm.model_executor.models.solar_open_logits_processor:SolarOpenTemplateLogitsProcessor \
--tensor-parallel-size 2 \
--max-num-seqs 64 \
--gpu-memory-utilization 0.8Deploy This Model
Production-ready deployment in minutes
Together.ai
Instant API access to this model
Production-ready inference API. Start free, scale to millions.
Try Free APIReplicate
One-click model deployment
Run models in the cloud with simple API. No DevOps required.
Deploy NowDisclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.