lerobot

23 models

smolvla_base

SmolVLA: A vision-language-action model for affordable and efficient robotics. This model has 450M parameters in total. You can use it inside the LeRobot library. Before proceeding to the next steps, y...

imitation-learning
16,742
327

pi0fast-base

imitation-learning
4,357
16

pi05_libero_finetuned

imitation-learning
4,003
8

xvla-base

imitation-learning
3,111
19

pi05_base

These weights come directly from the PyTorch conversion script of openpi and their `pi05base` model. π₀.₅ is a Vision-Language-Action model with open-world generalization, from Physical Intelligence. The LeRobot implementation is adapted from their open-source OpenPI repository.

π₀.₅ represents a significant evolution from π₀, developed by Physical Intelligence to address a central challenge in robotics: open-world generalization. While robots can perform impressive tasks in controlled environments, π₀.₅ is designed to generalize to entirely new environments and situations never seen during training. As Physical Intelligence explains, the fundamental challenge isn't agility or dexterity but generalization: the ability to correctly perform tasks in new settings with new objects. Consider a robot cleaning different homes: each home has different objects in different places. Generalization must occur at multiple levels:

- Physical Level: understanding how to pick up a spoon (by the handle) or a plate (by the edge), even with unseen objects in cluttered environments
- Semantic Level: understanding task semantics, such as where to put clothes and shoes (the laundry hamper, not the bed) and which tools are appropriate for cleaning spills
- Environmental Level: adapting to "messy" real-world environments like homes, grocery stores, offices, and hospitals

The breakthrough innovation in π₀.₅ is co-training on heterogeneous data sources. The model learns from:

1. Multimodal Web Data: image captioning, visual question answering, object detection
2. Verbal Instructions: humans coaching robots through complex tasks step by step
3. Subtask Commands: high-level semantic behavior labels (e.g., "pick up the pillow" for an unmade bed)
4. Cross-Embodiment Robot Data: data from various robot platforms with different capabilities
5. Multi-Environment Data: static robots deployed across many different homes
6. Mobile Manipulation Data: ~400 hours of mobile robot demonstrations

This diverse training mixture creates a "curriculum" that enables generalization across physical, visual, and semantic levels simultaneously. The base π₀.₅ model can be finetuned on your own dataset with the standard LeRobot training script. If you use this model, please cite the original OpenPI work. This model follows the same license as the original OpenPI repository.
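As a sketch of such a finetuning command: the entry point and flag names below follow recent LeRobot CLI conventions (`python -m lerobot.scripts.train` with draccus-style `--policy.path` / `--dataset.repo_id` overrides) but may differ in your installed version, and the dataset id is a placeholder.

```shell
# Finetune the base pi0.5 checkpoint on a LeRobot-format dataset.
# NOTE: flag names follow recent LeRobot releases and may differ in yours;
# <hf_user>/<dataset> is a placeholder for your own dataset repo id.
python -m lerobot.scripts.train \
  --policy.path=lerobot/pi05_base \
  --dataset.repo_id=<hf_user>/<dataset> \
  --batch_size=32 \
  --steps=30000 \
  --output_dir=outputs/train/pi05_finetune \
  --job_name=pi05_finetune
```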

imitation-learning
2,968
46

diffusion_pusht

license:apache-2.0
2,188
48

pi0_libero_finetuned

This model, which comes from the PyTorch conversion script of openpi and their `pi0libero` model, has been finetuned on the Libero dataset. π₀ is a Vision-Language-Action model for general robot control, from Physical Intelligence. The LeRobot implementation is adapted from their open-source OpenPI repository.

π₀ represents a breakthrough in robotics as the first general-purpose robot foundation model developed by Physical Intelligence. Unlike traditional robots that are narrow specialists programmed for repetitive motions, π₀ is designed to be a generalist policy that can understand visual inputs, interpret natural-language instructions, and control a variety of different robots across diverse tasks.

- Flow Matching: augments a pre-trained VLM with continuous action outputs via flow matching (a variant of diffusion models)
- Cross-Embodiment Training: trained on data from 8 distinct robot platforms, including UR5e, Bimanual UR5e, Franka, Bimanual Trossen, Bimanual ARX, Mobile Trossen, and Mobile Fibocom
- Internet-Scale Pre-training: inherits semantic knowledge from a pre-trained 3B-parameter Vision-Language Model
- High-Frequency Control: outputs motor commands at up to 50 Hz for real-time dexterous manipulation

For training π₀, you can use the standard LeRobot training script with the appropriate configuration. If you use this model, please cite the original OpenPI work. This model follows the same license as the original OpenPI repository.
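The flow-matching idea mentioned above (train the policy head to predict a velocity that carries a noise sample to the action chunk) can be illustrated with a toy NumPy sketch. This is an illustration of the general technique only, not the OpenPI implementation; sign and direction conventions vary between papers.

```python
import numpy as np

def flow_matching_target(actions, noise, t):
    """Linear (rectified-flow style) interpolation: x_t goes from the
    action chunk at t=0 to pure noise at t=1; the network is trained to
    regress the constant velocity d x_t / d t along that path."""
    x_t = (1.0 - t) * actions + t * noise
    velocity = noise - actions  # regression target for the policy head
    return x_t, velocity

# Toy "action chunk": 4 timesteps of a 1-DoF action.
rng = np.random.default_rng(0)
actions = np.array([0.1, 0.2, 0.3, 0.4])
noise = rng.standard_normal(4)
x_mid, v = flow_matching_target(actions, noise, t=0.5)  # halfway point
```

At inference time the learned velocity field is integrated in the reverse direction (noise toward actions) over a few steps, which is what lets a policy of this family emit continuous action chunks at high control rates.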

imitation-learning
1,422
4

pi0_base

imitation-learning
1,261
13

xvla-widowx

imitation-learning
779
1

xvla-libero

imitation-learning
673
2

eagle2hg-processor-groot-n1p5

license:apache-2.0
585
2

pi05_libero_base

imitation-learning
551
6

pi0_libero_base

These weights come directly from the PyTorch conversion script of openpi and their `pi0libero` model. π₀ is a Vision-Language-Action model for general robot control, from Physical Intelligence. The LeRobot implementation is adapted from their open-source OpenPI repository.

π₀ represents a breakthrough in robotics as the first general-purpose robot foundation model developed by Physical Intelligence. Unlike traditional robots that are narrow specialists programmed for repetitive motions, π₀ is designed to be a generalist policy that can understand visual inputs, interpret natural-language instructions, and control a variety of different robots across diverse tasks.

- Flow Matching: augments a pre-trained VLM with continuous action outputs via flow matching (a variant of diffusion models)
- Cross-Embodiment Training: trained on data from 8 distinct robot platforms, including UR5e, Bimanual UR5e, Franka, Bimanual Trossen, Bimanual ARX, Mobile Trossen, and Mobile Fibocom
- Internet-Scale Pre-training: inherits semantic knowledge from a pre-trained 3B-parameter Vision-Language Model
- High-Frequency Control: outputs motor commands at up to 50 Hz for real-time dexterous manipulation

For training π₀, you can use the standard LeRobot training script with the appropriate configuration. If you use this model, please cite the original OpenPI work. This model follows the same license as the original OpenPI repository.
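A plausible invocation of that training script, assuming recent LeRobot CLI conventions; verify the entry point and flag names against your installed version, and note that the dataset id and step count are placeholders.

```shell
# Train pi0 starting from the Libero-converted base weights.
# NOTE: entry point and flags follow recent LeRobot releases; verify locally.
python -m lerobot.scripts.train \
  --policy.path=lerobot/pi0_libero_base \
  --dataset.repo_id=<hf_user>/<dataset> \
  --batch_size=32 \
  --steps=100000 \
  --output_dir=outputs/train/pi0_libero
```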

imitation-learning
484
7

act_aloha_sim_transfer_cube_human

license:apache-2.0
281
30

pi0fast-libero

imitation-learning
264
5

pi05_libero_finetuned_quantiles

π₀.₅ (Pi05) Policy Finetuned with Quantile Normalization

This model, which comes from the PyTorch conversion script of openpi and their `pi05libero` model, has been finetuned for 6k steps on 8x H100 GPUs. π₀.₅ is a Vision-Language-Action model with open-world generalization, from Physical Intelligence. The LeRobot implementation is adapted from their open-source OpenPI repository.

π₀.₅ represents a significant evolution from π₀, developed by Physical Intelligence to address a central challenge in robotics: open-world generalization. While robots can perform impressive tasks in controlled environments, π₀.₅ is designed to generalize to entirely new environments and situations never seen during training. For more details, see the Physical Intelligence π₀.₅ blog post.

This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs. For a complete walkthrough, see the training guide. Below is the short version of how to train and run inference/eval:
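A sketch of that short version: both commands assume recent LeRobot CLI conventions (`lerobot.scripts.train` / `lerobot.scripts.eval` with draccus-style `--policy.path` overrides), and the dataset id, checkpoint path, and env config are placeholders, so verify the exact flags against your installed version.

```shell
# Train (the card states this checkpoint used 6k steps on 8x H100 GPUs).
# NOTE: flag names are assumptions based on recent LeRobot releases.
python -m lerobot.scripts.train \
  --policy.path=lerobot/pi05_libero_finetuned_quantiles \
  --dataset.repo_id=<hf_user>/<dataset> \
  --batch_size=32 \
  --steps=6000 \
  --output_dir=outputs/train/pi05_quantiles

# Evaluate a saved checkpoint in simulation (placeholders throughout).
python -m lerobot.scripts.eval \
  --policy.path=outputs/train/pi05_quantiles/checkpoints/last/pretrained_model \
  --env.type=<env>
```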

imitation-learning
158
2

act_aloha_sim_insertion_human

license:apache-2.0
137
7

xvla-folding

imitation-learning
128
10

unitree-g1-mujoco

80
7

vqbet_pusht

license:apache-2.0
52
4

xvla-agibot-world

imitation-learning
31
3

xvla-google-robot

imitation-learning
22
5

diffusion_pusht_keypoints

license:apache-2.0
7
1