# smolvla-jetbot

License: apache-2.0 · by shraavb
## Quick Summary

A vision-language-action (VLA) model for JetBot robot control: a frozen SmolVLM-500M-Instruct backbone with a small trainable action head that maps a camera image and a natural-language instruction to left/right motor speeds.

## Code Examples

### Architecture

```text
Base Model: HuggingFaceTB/SmolVLM-500M-Instruct (Frozen)
├── Vision Encoder: SigLIP-400M
├── Language Model: SmolLM-360M
└── Hidden Size: 960

Action Head (Trainable, ~123K parameters):
├── Linear(960 → 128)
├── ReLU + Dropout(0.1)
├── Linear(128 → 2)
└── Tanh → outputs in [-1, 1]

Output: [left_motor_speed, right_motor_speed]
```
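The action head above can be sketched in PyTorch as follows. This is a minimal reconstruction from the layer list; the actual module and file names in the training code may differ:

```python
import torch
import torch.nn as nn

class ActionHead(nn.Module):
    """Maps a 960-dim SmolVLM hidden state to two motor speeds in [-1, 1]."""
    def __init__(self, hidden_size: int = 960, action_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, 128),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(128, action_dim),
            nn.Tanh(),  # squashes outputs into [-1, 1]
        )

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        return self.net(hidden_state)

head = ActionHead()
actions = head(torch.randn(1, 960))  # shape (1, 2): [left, right]
```

Note that the parameter count works out to 960·128 + 128 + 128·2 + 2 = 123,266, matching the "~123K parameters" figure above.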
### Training Command

```bash
python -m server.vla_server.fine_tuning.train_smolvla \
    --data-dir dataset_vla \
    --output-dir models/smolvla_jetbot \
    --epochs 20 \
    --batch-size 2 \
    --lr 5e-5
```
### Installation

```bash
pip install transformers torch pillow
```
### How to Use

```python
from transformers import AutoProcessor, AutoModelForVision2Seq
import torch
from PIL import Image

# Load model and processor
processor = AutoProcessor.from_pretrained("HuggingFaceTB/SmolVLM-500M-Instruct")
base_model = AutoModelForVision2Seq.from_pretrained("HuggingFaceTB/SmolVLM-500M-Instruct")
base_model.eval()

# Load fine-tuned action head (saved as a pickled nn.Module,
# so weights_only must be disabled on recent torch versions)
action_head = torch.load("path/to/action_head.pt", weights_only=False)
action_head.eval()

# Prepare inputs
image = Image.open("camera_image.jpg")
instruction = "go forward"

inputs = processor(
    images=image,
    text=f"<image>\n{instruction}",
    return_tensors="pt"
)

# Run the frozen backbone and take the last token's hidden state
with torch.no_grad():
    outputs = base_model(**inputs, output_hidden_states=True)
    hidden_states = outputs.hidden_states[-1][:, -1, :]  # shape (1, 960)

    # Map the hidden state to motor commands in [-1, 1]
    actions = action_head(hidden_states)
    left_speed, right_speed = actions[0].tolist()

print(f"Left motor: {left_speed:.3f}, Right motor: {right_speed:.3f}")
```
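The Tanh head yields speeds in [-1, 1]; on a physical JetBot you would typically clamp and rescale them to a safe range before driving the motors. A minimal sketch, where the `max_speed` cap and the `jetbot` calls are assumptions rather than part of this model card:

```python
def to_motor_command(action: float, max_speed: float = 0.3) -> float:
    """Clamp a normalized action to [-1, 1], then scale by max_speed."""
    return max(-1.0, min(1.0, action)) * max_speed

left_cmd = to_motor_command(0.8)    # ≈ 0.24
right_cmd = to_motor_command(-1.5)  # clamped to -1.0, then scaled to -0.3

# On the JetBot itself (requires the NVIDIA jetbot package):
# from jetbot import Robot
# robot = Robot()
# robot.set_motors(left_cmd, right_cmd)
```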
### Running the VLA Server

```bash
# Start the inference server
python -m server.vla_server.server \
    --model-type smolvla \
    --fine-tuned \
    --model models/smolvla_jetbot/best
```

The server accepts ZMQ requests carrying an image and an instruction.
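A client request might be assembled as below. The model card does not document the server's wire format, so the JSON-with-base64-image payload and the port are only illustrative assumptions:

```python
import base64
import json

def build_request(image_bytes: bytes, instruction: str) -> bytes:
    """Hypothetical request encoding: JSON payload with a base64 image."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "instruction": instruction,
    }
    return json.dumps(payload).encode("utf-8")

request = build_request(b"\x89PNG...", "go forward")

# Sending it with pyzmq would look like (not run here):
# import zmq
# socket = zmq.Context().socket(zmq.REQ)
# socket.connect("tcp://jetbot-server:5555")
# socket.send(request)
# reply = socket.recv()
```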
