* feat(policies): Initial setup to push policies to hub with tags and model card * feat: add dataset that is used to train * Add model template summary * fix: Update link model_card template * fix: remove print * fix: change import name * fix: add model summary in template * fix: minor text * fix: comments Lucain * fix: feedback steven * fix: restructure push to hub * fix: remove unneeded changes * fix: import * fix: import 2 * Add MANIFEST.in * fix: feedback pr * Fix tests * tests: Add smolvla end-to-end test * Fix: smolvla test * fix test name * fix policy tests * Add push to hub false policy tests * Do push to hub cleaner * fix(ci): add push_to_hub false in tests --------- Co-authored-by: Steven Palma <steven.palma@huggingface.co>
3.9 KiB
|
||
|---|---|---|
Model Card for {{ model_name | default("Model ID", true) }}
{% if model_name == "smolvla" %} SmolVLA is a compact, efficient vision-language-action model that achieves competitive performance at reduced computational costs and can be deployed on consumer-grade hardware. {% elif model_name == "act" %} Action Chunking with Transformers (ACT) is an imitation-learning method that predicts short action chunks instead of single steps. It learns from teleoperated data and often achieves high success rates. {% elif model_name == "tdmpc" %} TD-MPC combines model-free and model-based approaches to improve sample efficiency and performance in continuous control tasks by using a learned latent dynamics model and terminal value function. {% elif model_name == "diffusion" %} Diffusion Policy treats visuomotor control as a generative diffusion process, producing smooth, multi-step action trajectories that excel at contact-rich manipulation. {% elif model_name == "vqbet" %} VQ-BET combines vector-quantised action tokens with Behaviour Transformers to discretise control and achieve data-efficient imitation across diverse skills. {% elif model_name == "pi0" %} Pi0 is a generalist vision-language-action transformer that converts multimodal observations and text instructions into robot actions for zero-shot task transfer. {% elif model_name == "pi0fast" %} Pi0-Fast is a variant of Pi0 that uses a new tokenization method called FAST, which enables training of an autoregressive vision-language-action policy for high-frequency robotic tasks with improved performance and reduced training time. {% elif model_name == "sac" %} Soft Actor-Critic (SAC) is an entropy-regularised actor-critic algorithm offering stable, sample-efficient learning in continuous-control environments. {% elif model_name == "reward_classifier" %} A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable. {% else %} Model type not recognized — please update this template. {% endif %}
This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.
How to Get Started with the Model
For a complete walkthrough, see the training guide. Below is the short version on how to train and run inference/eval:
Train from scratch
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/<dataset> \
--policy.type=act \
--output_dir=outputs/train/<desired_policy_repo_id> \
--job_name=lerobot_training \
--policy.device=cuda \
--policy.repo_id=${HF_USER}/<desired_policy_repo_id>
--wandb.enable=true
Writes checkpoints to outputs/train/<desired_policy_repo_id>/checkpoints/.
Evaluate the policy/run inference
python -m lerobot.record \
--robot.type=so100_follower \
--dataset.repo_id=<hf_user>/eval_<dataset> \
--policy.path=<hf_user>/<desired_policy_repo_id> \
--episodes=10
Prefix the dataset repo with eval_ and supply --policy.path pointing to a local or hub checkpoint.
Model Details
- License: {{ license | default("[More Information Needed]", true) }}