Model Card for {{ model_name | default("Model ID", true) }}
{% if model_name == "smolvla" %}
SmolVLA is a compact, efficient vision-language-action model that achieves competitive performance at reduced computational costs and can be deployed on consumer-grade hardware.
{% elif model_name == "act" %}
Action Chunking with Transformers (ACT) is an imitation-learning method that predicts short action chunks instead of single steps. It learns from teleoperated data and often achieves high success rates.
{% elif model_name == "tdmpc" %}
TD-MPC combines model-free and model-based approaches to improve sample efficiency and performance in continuous control tasks by using a learned latent dynamics model and terminal value function.
{% elif model_name == "diffusion" %}
Diffusion Policy treats visuomotor control as a generative diffusion process, producing smooth, multi-step action trajectories that excel at contact-rich manipulation.
{% elif model_name == "vqbet" %}
VQ-BET combines vector-quantised action tokens with Behaviour Transformers to discretise control and achieve data-efficient imitation across diverse skills.
{% elif model_name == "pi0" %}
Pi0 is a generalist vision-language-action transformer that converts multimodal observations and text instructions into robot actions for zero-shot task transfer.
{% elif model_name == "pi0fast" %}
Pi0-Fast is a variant of Pi0 that uses a new tokenization method called FAST, which enables training of an autoregressive vision-language-action policy for high-frequency robotic tasks with improved performance and reduced training time.
{% elif model_name == "sac" %}
Soft Actor-Critic (SAC) is an entropy-regularised actor-critic algorithm offering stable, sample-efficient learning in continuous-control environments.
{% elif model_name == "reward_classifier" %}
A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
{% else %}
Model type not recognized. Please update this template.
{% endif %}
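The selection logic above can be mirrored in plain Python, which may help when checking a policy name before rendering the card. This is only a sketch: `KNOWN_MODELS`, `card_title`, and `is_recognized` are hypothetical names, not part of LeRobot, and the title helper imitates the boolean mode of Jinja's `default("Model ID", true)` filter, which also replaces empty values.

```python
# Illustrative stand-in for the template's branch selection; not LeRobot code.
KNOWN_MODELS = {
    "smolvla", "act", "tdmpc", "diffusion", "vqbet",
    "pi0", "pi0fast", "sac", "reward_classifier",
}


def card_title(model_name=None):
    # Boolean mode of Jinja's `default` filter also replaces falsy values such as "".
    return f"Model Card for {model_name or 'Model ID'}"


def is_recognized(model_name):
    # Names outside this set fall through to the template's else branch.
    return model_name in KNOWN_MODELS
```

A name that fails `is_recognized` would render the "Model type not recognized" message, so it is a hint that the template needs a new branch.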
This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.
How to Get Started with the Model
For a complete walkthrough, see the training guide. The short version of how to train a policy and run inference or evaluation is below:
Train from scratch
python -m lerobot.scripts.train \
  --dataset.repo_id=${HF_USER}/<dataset> \
  --policy.type=act \
  --output_dir=outputs/train/<desired_policy_repo_id> \
  --job_name=lerobot_training \
  --policy.device=cuda \
  --policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
  --wandb.enable=true
Training writes checkpoints to outputs/train/<desired_policy_repo_id>/checkpoints/.
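To see what a run has saved so far, the checkpoint directory can be listed with a few lines of Python. This is a minimal sketch under stated assumptions: `checkpoints_dir` and `list_checkpoints` are hypothetical helpers, the default root matches the `--output_dir` used above, and nothing is assumed about how individual checkpoint folders are named.

```python
from pathlib import Path


def checkpoints_dir(policy_repo_id, root="outputs/train"):
    # Follows the layout stated above: <root>/<policy_repo_id>/checkpoints
    return Path(root) / policy_repo_id / "checkpoints"


def list_checkpoints(policy_repo_id, root="outputs/train"):
    # Returns the sorted names of checkpoint subdirectories, or [] if none exist yet.
    directory = checkpoints_dir(policy_repo_id, root)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_dir())
```

An empty list usually means training has not reached its first save step or that `--output_dir` pointed somewhere else.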
Evaluate the policy/run inference
python -m lerobot.record \
  --robot.type=so100_follower \
  --dataset.repo_id=${HF_USER}/eval_<dataset> \
  --policy.path=${HF_USER}/<desired_policy_repo_id> \
  --episodes=10
Prefix the dataset repo name with eval_ and pass --policy.path pointing to a local or Hub checkpoint.
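The naming rule can be kept in one place when evaluation runs are scripted. The sketch below assembles the same command as an argument list; `eval_dataset_repo` and `record_command` are hypothetical helpers, and only the flags already shown above are used.

```python
def eval_dataset_repo(hf_user, dataset):
    # Evaluation datasets carry the eval_ prefix on the dataset name.
    return f"{hf_user}/eval_{dataset}"


def record_command(hf_user, dataset, policy_path,
                   robot_type="so100_follower", episodes=10):
    # Builds the argument list for the evaluation command shown above.
    return [
        "python", "-m", "lerobot.record",
        f"--robot.type={robot_type}",
        f"--dataset.repo_id={eval_dataset_repo(hf_user, dataset)}",
        f"--policy.path={policy_path}",
        f"--episodes={episodes}",
    ]
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues when the policy path is a local directory.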
Model Details
- License: {{ license | default("[More Information Needed]", true) }}