# Text-to-3D: Generate 3D Assets from Text
Create **physically plausible 3D assets** from **text descriptions**, supporting a wide range of geometry, style, and material details.

---
## ⚡ Command-Line Usage
**Basic CLI (recommended)**

Uses a text-to-image model based on Stable Diffusion 3.5 Medium and accepts English prompts only. Before use, you must accept the [model license (click "Accept")](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium).
```bash
text3d-cli \
--prompts "small bronze figurine of a lion" "A globe with wooden base" "wooden table with embroidery" \
--n_image_retry 1 \
--n_asset_retry 1 \
--n_pipe_retry 1 \
--seed_img 0 \
--output_root outputs/textto3d
```
- `--n_image_retry`: Number of retries per prompt for text-to-image generation
- `--n_asset_retry`: Number of retries for image-to-3D asset generation
- `--n_pipe_retry`: Number of end-to-end pipeline retries when the 3D asset fails the quality check
- `--seed_img`: Optional random seed for text-to-image generation (the example above passes `0`)
- `--output_root`: Directory where the generated assets are saved

For large-scale 3D asset generation, set `--n_image_retry=4`, `--n_asset_retry=3`, and `--n_pipe_retry=2`. This is slower but gives better results through automatic checking and retries. For more diverse results, omit `--seed_img`.
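
For batch runs, the command can also be assembled programmatically. The following is a minimal Python sketch, not part of EmbodiedGen: `build_text3d_command` is a hypothetical helper that uses only the flags documented above, defaults to the large-scale retry settings, and leaves out `--seed_img` unless a value is passed.

```python
import shlex


def build_text3d_command(
    prompts,
    output_root,
    n_image_retry=4,
    n_asset_retry=3,
    n_pipe_retry=2,
    seed_img=None,
):
    """Build the argument list for `text3d-cli` (hypothetical helper)."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    cmd = ["text3d-cli", "--prompts", *prompts]
    cmd += ["--n_image_retry", str(n_image_retry)]
    cmd += ["--n_asset_retry", str(n_asset_retry)]
    cmd += ["--n_pipe_retry", str(n_pipe_retry)]
    # Only pass --seed_img when a value was explicitly requested.
    if seed_img is not None:
        cmd += ["--seed_img", str(seed_img)]
    cmd += ["--output_root", str(output_root)]
    return cmd


if __name__ == "__main__":
    cmd = build_text3d_command(
        ["small bronze figurine of a lion", "A globe with wooden base"],
        "outputs/textto3d",
    )
    # Print the equivalent shell command for inspection.
    print(shlex.join(cmd))
    # To actually run it (requires EmbodiedGen to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or quotes.
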

You will get one generated 3D asset for each prompt:

- "small bronze figurine of a lion"
- "A globe with wooden base"
- "wooden table with embroidery"
---
**Kolors Model CLI (supports Chinese and English prompts)**
```bash
# The third prompt is Chinese for "orange electric hand drill with wear details".
bash embodied_gen/scripts/textto3d.sh \
--prompts "small bronze figurine of a lion" "A globe with wooden base and latitude and longitude lines" "橙色电动手钻，有磨损细节" \
--output_root outputs/textto3d_k
```
> Models with more permissive licenses can be found in `embodied_gen/models/image_comm_model.py`.
The generated results are organized as follows:
```sh
outputs/textto3d
├── asset3d
│   ├── sample3d_xx
│   │   └── result
│   │       ├── mesh
│   │       │   ├── material_0.png
│   │       │   ├── material.mtl
│   │       │   ├── sample3d_xx_collision.obj
│   │       │   ├── sample3d_xx.glb
│   │       │   ├── sample3d_xx_gs.ply
│   │       │   └── sample3d_xx.obj
│   │       ├── sample3d_xx.urdf
│   │       └── video.mp4
└── images
    ├── sample3d_xx.png
    └── sample3d_xx_raw.png
```
- `mesh/`: 3D geometry and texture files for the asset, including the visual mesh, the collision mesh, and the 3DGS representation
- `*.urdf`: Simulator-ready URDF that references the collision and visual meshes
- `video.mp4`: Preview video of the generated 3D asset
- `images/sample3d_xx.png`: Foreground-extracted image used for the image-to-3D step
- `images/sample3d_xx_raw.png`: Original image generated by the text-to-image step
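
After a large run it can be useful to check that every asset folder is complete. The sketch below is a hypothetical helper, not part of EmbodiedGen. It assumes the directory layout shown above, with the asset name (`sample3d_xx` in the listing) passed in as a parameter, and reports which of the expected files are missing.

```python
from pathlib import Path


def expected_files(output_root, name):
    """Map a short label to each path expected for one asset (layout assumed from the listing above)."""
    root = Path(output_root)
    result = root / "asset3d" / name / "result"
    mesh = result / "mesh"
    return {
        "urdf": result / f"{name}.urdf",
        "video": result / "video.mp4",
        "visual_obj": mesh / f"{name}.obj",
        "visual_glb": mesh / f"{name}.glb",
        "collision_obj": mesh / f"{name}_collision.obj",
        "gaussian_ply": mesh / f"{name}_gs.ply",
        "material_mtl": mesh / "material.mtl",
        "texture_png": mesh / "material_0.png",
        "image": root / "images" / f"{name}.png",
        "raw_image": root / "images" / f"{name}_raw.png",
    }


def missing_files(output_root, name):
    """Return the labels of expected files that do not exist on disk."""
    files = expected_files(output_root, name)
    return [label for label, path in files.items() if not path.is_file()]


if __name__ == "__main__":
    root = Path("outputs/textto3d")
    asset_dir = root / "asset3d"
    # Every sub-directory of asset3d/ is treated as one generated asset.
    names = sorted(p.name for p in asset_dir.iterdir() if p.is_dir()) if asset_dir.is_dir() else []
    for name in names:
        print(name, "missing:", missing_files(root, name) or "nothing")
```

An asset that reports missing files most likely did not finish generation and can be regenerated with the same prompt.
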
---
!!! tip "Getting Started"
    - You can also try Text-to-3D instantly online via our [Hugging Face Space](https://huggingface.co/spaces/HorizonRobotics/EmbodiedGen-Text-to-3D), with no installation required.
    - Explore the [Assets Gallery](https://huggingface.co/spaces/HorizonRobotics/EmbodiedGen-Gallery-Explorer) of sim-ready assets generated by EmbodiedGen.
    - For instructions on using the generated assets in any simulator, see the [Any Simulators Tutorial](any_simulators.md).
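
Before importing an asset into a simulator, it can help to see which mesh files its URDF references. The following sketch uses only the Python standard library and assumes the generated file follows the standard URDF schema (`<link>` containing `<visual>` and `<collision>`, each with `<geometry><mesh filename="..."/>`). The function name and the example XML, including its relative `mesh/` paths, are illustrative assumptions rather than actual EmbodiedGen output.

```python
import xml.etree.ElementTree as ET


def list_urdf_meshes(urdf_text):
    """Return {'visual': [...], 'collision': [...]} mesh filenames found in URDF XML."""
    root = ET.fromstring(urdf_text.strip())
    meshes = {"visual": [], "collision": []}
    for kind in meshes:
        # Standard URDF nesting: robot/link/<visual|collision>/geometry/mesh
        for mesh in root.findall(f"./link/{kind}/geometry/mesh"):
            filename = mesh.get("filename")
            if filename:
                meshes[kind].append(filename)
    return meshes


if __name__ == "__main__":
    # Hand-written example; a generated URDF may contain more elements.
    example = """
    <robot name="sample3d_xx">
      <link name="base">
        <visual><geometry><mesh filename="mesh/sample3d_xx.obj"/></geometry></visual>
        <collision><geometry><mesh filename="mesh/sample3d_xx_collision.obj"/></geometry></collision>
      </link>
    </robot>
    """
    print(list_urdf_meshes(example))
```

To inspect a real asset, read the file first, for example `list_urdf_meshes(Path(urdf_path).read_text())` with `urdf_path` pointing at the generated `.urdf` file.
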