* feat(policies): Initial setup to push policies to hub with tags and model card

* feat: add dataset that is used to train

* Add model template summary

* fix: Update link model_card template

* fix: remove print

* fix: change import name

* fix: add model summary in template

* fix: minor text

* fix: comments Lucain

* fix: feedback steven

* fix: restructure push to hub

* fix: remove unneeded changes

* fix: import

* fix: import 2

* Add MANIFEST.in

* fix: feedback pr

* Fix tests

* tests: Add smolvla end-to-end test

* Fix: smolvla test

* fix test name

* fix policy tests

* Add push to hub false policy tests

* Do push to hub cleaner

* fix(ci): add push_to_hub false in tests

---------

Co-authored-by: Steven Palma <steven.palma@huggingface.co>

2025-06-26 14:36:16 +02:00

3.9 KiB



# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{{ card_data }}

Model Card for {{ model_name | default("Model ID", true) }}

{% if model_name == "smolvla" %}
SmolVLA is a compact, efficient vision-language-action model that achieves competitive performance at reduced computational costs and can be deployed on consumer-grade hardware.
{% elif model_name == "act" %}
Action Chunking with Transformers (ACT) is an imitation-learning method that predicts short action chunks instead of single steps. It learns from teleoperated data and often achieves high success rates.
{% elif model_name == "tdmpc" %}
TD-MPC combines model-free and model-based approaches to improve sample efficiency and performance in continuous control tasks by using a learned latent dynamics model and terminal value function.
{% elif model_name == "diffusion" %}
Diffusion Policy treats visuomotor control as a generative diffusion process, producing smooth, multi-step action trajectories that excel at contact-rich manipulation.
{% elif model_name == "vqbet" %}
VQ-BET combines vector-quantised action tokens with Behaviour Transformers to discretise control and achieve data-efficient imitation across diverse skills.
{% elif model_name == "pi0" %}
Pi0 is a generalist vision-language-action transformer that converts multimodal observations and text instructions into robot actions for zero-shot task transfer.
{% elif model_name == "pi0fast" %}
Pi0-Fast is a variant of Pi0 that uses a new tokenization method called FAST, which enables training of an autoregressive vision-language-action policy for high-frequency robotic tasks with improved performance and reduced training time.
{% elif model_name == "sac" %}
Soft Actor-Critic (SAC) is an entropy-regularised actor-critic algorithm offering stable, sample-efficient learning in continuous-control environments.
{% elif model_name == "reward_classifier" %}
A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
{% else %}
Model type not recognized; please update this template.
{% endif %}
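The conditional chain above amounts to a lookup from policy type to a short summary, with a fallback for unknown types. A minimal Python sketch of the same logic follows. `POLICY_SUMMARIES`, `FALLBACK`, and `summary_for` are hypothetical names used only for illustration (they are not LeRobot code), and the summaries are abbreviated.

```python
# Hypothetical mirror of the template's if/elif chain; not part of LeRobot.
# Summaries are shortened versions of the template text.
POLICY_SUMMARIES = {
    "smolvla": "SmolVLA is a compact, efficient vision-language-action model.",
    "act": "Action Chunking with Transformers (ACT) predicts short action chunks.",
    "tdmpc": "TD-MPC combines model-free and model-based approaches.",
    "diffusion": "Diffusion Policy treats visuomotor control as a diffusion process.",
    "vqbet": "VQ-BET combines vector-quantised action tokens with Behaviour Transformers.",
    "pi0": "Pi0 is a generalist vision-language-action transformer.",
    "pi0fast": "Pi0-Fast is a variant of Pi0 that uses FAST tokenization.",
    "sac": "Soft Actor-Critic (SAC) is an entropy-regularised actor-critic algorithm.",
    "reward_classifier": "A reward classifier scores observations for task success.",
}

# Same role as the template's {% else %} branch.
FALLBACK = "Model type not recognized; please update this template."


def summary_for(model_name: str) -> str:
    """Return the summary for a known policy type, else the fallback text."""
    return POLICY_SUMMARIES.get(model_name, FALLBACK)


print(summary_for("act"))
print(summary_for("my_new_policy"))
```

A table like this keeps the text for each policy in one place, at the cost of moving it out of the template itself.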

This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.

How to Get Started with the Model

For a complete walkthrough, see the training guide. Below is a short version of how to train the policy and run inference or evaluation:

Train from scratch

python lerobot/scripts/train.py \
  --dataset.repo_id=${HF_USER}/<dataset> \
  --policy.type=act \
  --output_dir=outputs/train/<desired_policy_repo_id> \
  --job_name=lerobot_training \
  --policy.device=cuda \
  --policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
  --wandb.enable=true

Training writes checkpoints to outputs/train/<desired_policy_repo_id>/checkpoints/.
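In a shell, every continued line of a multi-line command must end with a backslash; otherwise the remaining flags are run as a separate command. One way to sidestep that when launching training from a script is to assemble the flags as a list. The sketch below uses only the flags shown above. `build_train_command` and its parameters are hypothetical names, and the user, dataset, and repository values are made up.

```python
import shlex


def build_train_command(hf_user, dataset, policy_repo, policy_type="act"):
    """Assemble the training command shown above as an argument list.

    Hypothetical helper for illustration; it only formats strings and
    does not start training.
    """
    return [
        "python",
        "lerobot/scripts/train.py",
        f"--dataset.repo_id={hf_user}/{dataset}",
        f"--policy.type={policy_type}",
        f"--output_dir=outputs/train/{policy_repo}",
        "--job_name=lerobot_training",
        "--policy.device=cuda",
        f"--policy.repo_id={hf_user}/{policy_repo}",
        "--wandb.enable=true",
    ]


# Example values are invented for the demonstration.
cmd = build_train_command("alice", "pick_cube", "act_pick_cube")
print(shlex.join(cmd))  # single-line form, safe to paste into a shell
```

The list can be passed to `subprocess.run(cmd, check=True)` to launch the run without any line-continuation characters.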

Evaluate the policy/run inference

python -m lerobot.record \
  --robot.type=so100_follower \
  --dataset.repo_id=${HF_USER}/eval_<dataset> \
  --policy.path=${HF_USER}/<desired_policy_repo_id> \
  --episodes=10

Prefix the dataset repo with eval_ and supply --policy.path pointing to a local or hub checkpoint.
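The naming rule above can be captured in a small helper. This is only a sketch: `eval_dataset_repo_id` is a hypothetical name, not a LeRobot function, and the example values are made up.

```python
def eval_dataset_repo_id(hf_user: str, dataset: str) -> str:
    """Build the evaluation dataset repo id as 'user/eval_dataset'.

    The 'eval_' prefix is added only when it is missing, so calling the
    helper on an already-prefixed name leaves it unchanged.
    """
    name = dataset if dataset.startswith("eval_") else f"eval_{dataset}"
    return f"{hf_user}/{name}"


print(eval_dataset_repo_id("alice", "pick_cube"))
```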

Model Details

License: {{ license | default("[More Information Needed]", true) }}
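Both `model_name` and `license` are rendered through `default(..., true)`. In Jinja2, passing `true` as the second argument makes the filter fall back for any falsy value (an empty string or `None`, for instance), not only for undefined variables. A rough Python equivalent is sketched below; `default_if_falsy` is a hypothetical name chosen for illustration.

```python
def default_if_falsy(value, fallback):
    """Approximate Jinja2's `value | default(fallback, true)`.

    Returns the fallback whenever the value is falsy (None, "", 0, empty
    containers), otherwise returns the value unchanged.
    """
    return value if value else fallback


placeholder = "[More Information Needed]"
print(default_if_falsy("", placeholder))            # falls back to the placeholder
print(default_if_falsy("apache-2.0", placeholder))  # keeps the given license
```

Without the `true` argument, an empty string would be rendered as-is and the card would show a blank license field.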
