chore(docs): Update README. (#19)

chore(docs): Update README
Xinjie 2025-07-04 16:30:22 +08:00 committed by GitHub
parent e8de0e44df
commit dd3720f880
GPG Key ID: B5690EEEBB952194
5 changed files with 42 additions and 25 deletions


@ -1,4 +1,4 @@
-# EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence
+# *EmbodiedGen*: Towards a Generative 3D World Engine for Embodied Intelligence
 [![🌐 Project Page](https://img.shields.io/badge/🌐-Project_Page-blue)](https://horizonrobotics.github.io/robot_lab/embodied_gen/index.html)
 [![📄 arXiv](https://img.shields.io/badge/📄-arXiv-b31b1b)](https://arxiv.org/abs/2506.10600)
@ -8,8 +8,8 @@
 [![🤗 Hugging Face](https://img.shields.io/badge/🤗-Texture_Gen_Demo-blue)](https://huggingface.co/spaces/HorizonRobotics/EmbodiedGen-Texture-Gen)
-**EmbodiedGen** is a toolkit to generate diverse and interactive 3D worlds composed of generative 3D assets with plausible physics, leveraging generative AI to address the challenges of generalization in embodied intelligence related research. EmbodiedGen composed of six key modules: `Image-to-3D`, `Text-to-3D`, `Texture Generation`, `Articulated Object Generation`, `Scene Generation` and `Layout Generation`.
+> ***EmbodiedGen*** is a generative engine to create diverse and interactive 3D worlds composed of high-quality 3D assets(mesh & 3DGS) with plausible physics, leveraging generative AI to address the challenges of generalization in embodied intelligence related research.
+> It composed of six key modules: `Image-to-3D`, `Text-to-3D`, `Texture Generation`, `Articulated Object Generation`, `Scene Generation` and `Layout Generation`.
 <img src="apps/assets/overall.jpg" alt="Overall Framework" width="700"/>
@ -21,7 +21,7 @@
 - [🎨 Texture Generation](#texture-generation)
 - [🌍 3D Scene Generation](#3d-scene-generation)
 - [⚙️ Articulated Object Generation](#articulated-object-generation)
-- [🏞️ Layout Generation](#layout-generation)
+- [🏞️ Layout(Interactive 3D Worlds) Generation](#layout-generation)
 ## 🚀 Quick Start
@ -36,7 +36,7 @@ conda activate embodiedgen
 bash install.sh
 ```
-### 🟢 Setup GPT Agent
+### Setup GPT Agent
 Update the API key in file: `embodied_gen/utils/gpt_config.yaml`.
@ -50,13 +50,13 @ You can choose between two backends for the GPT agent:
 <h2 id="image-to-3d">🖼️ Image-to-3D</h2>
-[![🤗 Hugging Face](https://img.shields.io/badge/🤗-Image_to_3D_Demo-blue)](https://huggingface.co/spaces/HorizonRobotics/EmbodiedGen-Image-to-3D) Generate physically plausible 3D asset from input image.
+[![🤗 Hugging Face](https://img.shields.io/badge/🤗-Image_to_3D_Demo-blue)](https://huggingface.co/spaces/HorizonRobotics/EmbodiedGen-Image-to-3D) Generate physically plausible 3D asset URDF from single input image, offering high-quality support for digital twin systems.
 <img src="apps/assets/image_to_3d.jpg" alt="Image to 3D" width="900">
-### Service
+### ☁️ Service
-Run the image-to-3D generation service locally. The first run will download required models.
+Run the image-to-3D generation service locally.
+Models downloaded automatically on first run, please be patient.
 ```sh
 # Run in foreground
 python apps/image_to_3d.py
@ -64,13 +64,12 @@ python apps/image_to_3d.py
 CUDA_VISIBLE_DEVICES=0 nohup python apps/image_to_3d.py > /dev/null 2>&1 &
 ```
-### API
+### ⚡ API
-Generate a 3D model from an image using the command-line API.
+Generate physically plausible 3D assets from image input via the command-line API.
-Models will be downloaded automatically, please wait for the first run.
 ```sh
 python3 embodied_gen/scripts/imageto3d.py \
 --image_path apps/assets/example_image/sample_04.jpg apps/assets/example_image/sample_19.jpg \
---output_root outputs/imageto3d/
+--output_root outputs/imageto3d
 # See result(.urdf/mesh.obj/mesh.glb/gs.ply) in ${output_root}/sample_xx/result
 ```
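A minimal sketch for checking what the command above produced, assuming only what the README comment states: results land in `${output_root}/sample_xx/result` and include `.urdf`, `.obj`, `.glb` and `.ply` files. The helper name `collect_results` and the recursive search are illustrative assumptions, not part of EmbodiedGen.

```python
from pathlib import Path

# Suffixes taken from the README comment (.urdf/mesh.obj/mesh.glb/gs.ply).
# The exact layout inside each result folder is an assumption, so the
# search below is recursive.
RESULT_SUFFIXES = {".urdf", ".obj", ".glb", ".ply"}


def collect_results(output_root: str) -> dict[str, list[Path]]:
    """Map each sample folder name to the result files under <sample>/result."""
    root = Path(output_root)
    found: dict[str, list[Path]] = {}
    for result_dir in sorted(root.glob("*/result")):
        files = sorted(
            p
            for p in result_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in RESULT_SUFFIXES
        )
        found[result_dir.parent.name] = files
    return found


if __name__ == "__main__":
    # Prints nothing if the output folder does not exist yet.
    for sample, files in collect_results("outputs/imageto3d").items():
        print(sample, [str(f) for f in files])
```

Run it from the repository root after the generation command finishes; a sample with an empty list would suggest that run did not complete.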
@ -84,19 +83,21 @ python3 embodied_gen/scripts/imageto3d.py \
 <img src="apps/assets/text_to_3d.jpg" alt="Text to 3D" width="900">
-### Service
+### ☁️ Service
-Run the text-to-3D generation service locally.
+Deploy the text-to-3D generation service locally.
+Text-to-image based on the Kolors model, supporting Chinese and English prompts.
+Models downloaded automatically on first run, see `download_kolors_weights`, please be patient.
 ```sh
 python apps/text_to_3d.py
 ```
 ### API
-Models will be downloaded automatically, see `download_kolors_weights`.
+Text-to-image based on the Kolors model.
 ```sh
 bash embodied_gen/scripts/textto3d.sh \
 --prompts "small bronze figurine of a lion" "A globe with wooden base and latitude and longitude lines" "橙色电动手钻,有磨损细节" \
---output_root outputs/textto3d/
+--output_root outputs/textto3d
 ```
 ---
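A hedged sketch of driving the same script from Python, useful when prompts come from a list rather than being typed by hand. It uses only the two flags shown in the README (`--prompts`, `--output_root`); the helper name `build_textto3d_cmd` is hypothetical. Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or non-ASCII text.

```python
import shlex
import subprocess


def build_textto3d_cmd(prompts: list[str], output_root: str) -> list[str]:
    """Build the argument list for textto3d.sh (flags as shown in the README)."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    return [
        "bash",
        "embodied_gen/scripts/textto3d.sh",
        "--prompts",
        *prompts,  # each prompt stays one argument, no manual quoting needed
        "--output_root",
        output_root,
    ]


if __name__ == "__main__":
    cmd = build_textto3d_cmd(
        ["small bronze figurine of a lion", "橙色电动手钻,有磨损细节"],
        "outputs/textto3d",
    )
    # Show the equivalent shell command for inspection.
    print(shlex.join(cmd))
    # Uncomment to actually run it from the repository root:
    # subprocess.run(cmd, check=True)
```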
@ -109,15 +110,14 @ bash embodied_gen/scripts/textto3d.sh \
 <img src="apps/assets/texture_gen.jpg" alt="Texture Gen" width="900">
-### Service
+### ☁️ Service
 Run the texture generation service locally.
+Models downloaded automatically on first run, see `download_kolors_weights`, `geo_cond_mv`.
 ```sh
 python apps/texture_edit.py
 ```
-### API
+### ⚡ API
-Models will be downloaded automatically, see `download_kolors_weights`, `geo_cond_mv`.
 ```sh
 bash embodied_gen/scripts/texture_gen.sh \
 --mesh_path "apps/assets/example_texture/meshes/robot_text.obj" \
@ -147,7 +147,25 @@ bash embodied_gen/scripts/texture_gen.sh \
 ---
-<h2 id="layout-generation">🏞️ Layout Generation</h2>
+<h2 id="layout-generation">🏞️ Layout(Interactive 3D Worlds) Generation</h2>
+### 💬 Generate Layout from task description
+🚧 *Coming Soon*
+<table>
+<tr>
+<td><img src="apps/assets/layout1.gif" alt="layout1" width="300"/></td>
+<td><img src="apps/assets/layout2.gif" alt="layout2" width="300"/></td>
+</tr>
+<tr>
+<td><img src="apps/assets/layout3.gif" alt="layout3" width="300"/></td>
+<td><img src="apps/assets/layout4.gif" alt="layout4" width="300"/></td>
+</tr>
+</table>
+### 🖼️ Generate Layout from image
 🚧 *Coming Soon*
@ -176,7 +194,6 @@ If you use EmbodiedGen in your research or projects, please cite:
EmbodiedGen builds upon the following amazing projects and models: EmbodiedGen builds upon the following amazing projects and models:
🌟 [Trellis](https://github.com/microsoft/TRELLIS) | 🌟 [Hunyuan-Delight](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) | 🌟 [Segment Anything](https://github.com/facebookresearch/segment-anything) | 🌟 [Rembg](https://github.com/danielgatis/rembg) | 🌟 [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) | 🌟 [Stable Diffusion x4](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler) | 🌟 [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) | 🌟 [Kolors](https://github.com/Kwai-Kolors/Kolors) | 🌟 [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 🌟 [Aesthetic Score](http://captions.christoph-schuhmann.de/aesthetic_viz_laion_sac+logos+ava1-l14-linearMSE-en-2.37B.html) | 🌟 [Pano2Room](https://github.com/TrickyGo/Pano2Room) | 🌟 [Diffusion360](https://github.com/ArcherFMY/SD-T2I-360PanoImage) | 🌟 [Kaolin](https://github.com/NVIDIAGameWorks/kaolin) | 🌟 [diffusers](https://github.com/huggingface/diffusers) | 🌟 [gsplat](https://github.com/nerfstudio-project/gsplat) | 🌟 [QWEN2.5VL](https://github.com/QwenLM/Qwen2.5-VL) | 🌟 [GPT4o](https://platform.openai.com/docs/models/gpt-4o) 🌟 [Trellis](https://github.com/microsoft/TRELLIS) | 🌟 [Hunyuan-Delight](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) | 🌟 [Segment Anything](https://github.com/facebookresearch/segment-anything) | 🌟 [Rembg](https://github.com/danielgatis/rembg) | 🌟 [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) | 🌟 [Stable Diffusion x4](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler) | 🌟 [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) | 🌟 [Kolors](https://github.com/Kwai-Kolors/Kolors) | 🌟 [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 🌟 [Aesthetic Score](http://captions.christoph-schuhmann.de/aesthetic_viz_laion_sac+logos+ava1-l14-linearMSE-en-2.37B.html) | 🌟 [Pano2Room](https://github.com/TrickyGo/Pano2Room) | 🌟 [Diffusion360](https://github.com/ArcherFMY/SD-T2I-360PanoImage) | 🌟 
[Kaolin](https://github.com/NVIDIAGameWorks/kaolin) | 🌟 [diffusers](https://github.com/huggingface/diffusers) | 🌟 [gsplat](https://github.com/nerfstudio-project/gsplat) | 🌟 [QWEN2.5VL](https://github.com/QwenLM/Qwen2.5-VL) | 🌟 [GPT4o](https://platform.openai.com/docs/models/gpt-4o)
--- ---
## ⚖️ License ## ⚖️ License

BIN apps/assets/layout1.gif (normal file, binary not shown, 684 KiB)
BIN apps/assets/layout2.gif (normal file, binary not shown, 713 KiB)
BIN apps/assets/layout3.gif (normal file, binary not shown, 642 KiB)
BIN apps/assets/layout4.gif (normal file, binary not shown, 699 KiB)