# EmboFlow

EmboFlow is a browser/server (B/S) workflow platform for embodied data. It covers raw asset ingestion, delivery normalization, dataset transformation, workflow execution, preview, and export.

## Current V1 Features

- Project-scoped workspace shell with a dedicated Projects page and active project selector in the header
- Asset workspace that supports local asset registration, probe summaries, storage connection management, and dataset creation
- Project-scoped custom node registry with Docker image and Dockerfile based node definitions
- Workflow templates as first-class objects, including default project templates and the ability to create project workflows from a template
- Blank workflow creation and a large React Flow editor with drag-and-drop nodes, free canvas movement, edge validation, Docker-first node runtime presets, and Python code-hook injection
- Workflow-level `Save As Template` so edited graphs can be promoted into reusable project templates
- Mongo-backed run orchestration, worker execution, run history, task detail, logs, stdout/stderr, artifacts, cancel, retry, and task retry
- Chinese/English language switching at the runtime shell level

## Bootstrap

From the repository root:

```bash
make bootstrap
```

This installs workspace dependencies and runs `scripts/install_hooks.sh` so local commit and push guardrails are active.

## Local Commands

Run the full repository test suite:

```bash
make test
```

Run the strict repository guardrails:

```bash
make guardrails
```

Start package-level development entrypoints:

```bash
make dev-api
make dev-web
make dev-worker
```

## Local Deployment

Start MongoDB and MinIO:

```bash
make infra-up
```

Start the API, web app, and worker in separate terminals:

```bash
make serve-api
make serve-web
make serve-worker
```

The default local stack uses:

- API: `http://127.0.0.1:3001`
- Web: `http://127.0.0.1:3000`
- Worker: Mongo polling loop with `WORKER_POLL_INTERVAL_MS=1000`

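As a rough illustration of what a polling worker with `WORKER_POLL_INTERVAL_MS=1000` does, here is a minimal sketch. Python is used only for illustration and does not reflect the worker's actual implementation language. `claim_next_task`, `run_task`, and `max_idle_polls` are hypothetical names introduced for this example; the real worker claims tasks from MongoDB.

```python
import os
import time
from typing import Callable, Mapping, Optional


def poll_interval_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Read WORKER_POLL_INTERVAL_MS (default 1000) and convert it to seconds."""
    env = os.environ if env is None else env
    return int(env.get("WORKER_POLL_INTERVAL_MS", "1000")) / 1000.0


def poll_loop(
    claim_next_task: Callable[[], Optional[dict]],  # hypothetical: returns a task or None
    run_task: Callable[[dict], None],               # hypothetical: executes one task
    interval_s: float,
    max_idle_polls: int,                            # hypothetical stop condition for the demo
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run queued tasks until `max_idle_polls` consecutive polls find nothing."""
    completed = 0
    idle = 0
    while idle < max_idle_polls:
        task = claim_next_task()
        if task is None:
            # Nothing queued: wait one interval before polling again.
            idle += 1
            sleep(interval_s)
            continue
        idle = 0
        run_task(task)
        completed += 1
    return completed
```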
### Local Data Validation

The local validation path currently used for embodied data testing is:

```text
/Users/longtaowu/workspace/emboldata/data
```

You can register that directory from the Assets page or via `POST /api/assets/register`.

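A registration call against the default local API could be prepared roughly as follows. This is a sketch only: the README does not document the request body, so the `sourcePath` field is a hypothetical name, and the function only builds the request without sending it. Pass the result to `urllib.request.urlopen` once the API is running.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:3001"  # default local API address from this README


def build_register_request(source_path: str, api_base: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) a POST request to /api/assets/register."""
    # Assumption: the body field name "sourcePath" is hypothetical.
    body = json.dumps({"sourcePath": source_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/api/assets/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```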
The workflow editor currently requires selecting at least one registered asset before a run can be created.

The editor also persists per-node runtime config in workflow versions, including executor overrides, optional artifact title overrides, and Python code-hook source for inspect-style and transform-style nodes.

The runtime web shell exposes a visible `中文 / English` language toggle. The core workspace shell and workflow authoring surface are translated through a lightweight i18n layer.

The shell also exposes a dedicated Projects page plus an active project selector, so assets, datasets, workflow templates, workflows, and runs all switch together at the project boundary.

The Assets workspace includes first-class storage connections and datasets. A dataset is distinct from a raw asset and binds project source assets to a selected local or object-storage-backed destination.

The shell also exposes a dedicated Nodes page for project-scoped custom container nodes. Custom nodes can be registered from an existing Docker image or a self-contained Dockerfile. Each node declares whether it consumes a single asset set or multiple upstream asset sets, and what kind of output it produces.

The Workflows workspace includes a template gallery. Projects can start from default or saved templates, or create a blank workflow directly.

The workflow editor center panel uses a real draggable node canvas with zoom, pan, mini-map, dotted background, handle-based edge creation, persisted node positions, and localized validation feedback instead of a static list of node cards.

The workflow editor right panel supports saving the current workflow draft as a reusable workflow template, in addition to editing per-node runtime settings and Python hooks.

The node library supports both click-to-append and drag-and-drop placement into the canvas. V1 connection rules block self-edges, duplicate edges, cycles, incoming edges into source nodes, outgoing edges from export nodes, and multiple upstream edges into ordinary nodes, while allowing multi-input set nodes such as `union-assets`, `intersect-assets`, and `difference-assets`.

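The V1 connection rules above can be sketched as a single check that runs before an edge is added. This is an illustrative Python sketch, not the editor's actual code: the role labels (`source`, `export`, `multi-input`, `ordinary`), the returned messages, and the function names are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (upstream node id, downstream node id)


def has_path(edges: List[Edge], start: str, goal: str) -> bool:
    """Depth-first search: is `goal` reachable from `start` along existing edges?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(target for source, target in edges if source == node)
    return False


def validate_edge(roles: Dict[str, str], edges: List[Edge], new_edge: Edge) -> Optional[str]:
    """Return a rejection reason for `new_edge`, or None if it is allowed.

    `roles` maps node id to a hypothetical role label:
    "source", "export", "multi-input", or "ordinary".
    """
    source, target = new_edge
    if source == target:
        return "self-edge"
    if new_edge in edges:
        return "duplicate edge"
    if roles[target] == "source":
        return "incoming edge into source node"
    if roles[source] == "export":
        return "outgoing edge from export node"
    # Only multi-input set nodes may have more than one upstream edge.
    if roles[target] != "multi-input" and any(t == target for _, t in edges):
        return "multiple upstream edges"
    # Adding source -> target closes a cycle if source is already reachable from target.
    if has_path(edges, target, source):
        return "cycle"
    return None
```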
The Runs workspace shows project-scoped run history, run-level aggregated summaries, cancel/retry controls, and run detail views with persisted task summaries, stdout/stderr sections, result previews, and artifact links into Explore.

Selected run tasks expose the frozen node definition id, executor config snapshot, and code-hook metadata that were captured when the run was created.

Most built-in delivery nodes now default to `executorType=docker`. When a node uses `executorType=docker` and provides `executorConfig.image`, the worker runs a real local Docker container with mounted `input.json` / `output.json` exchange files plus read-only mounts for bound asset paths. If no image is configured, the executor falls back to the lightweight simulated behavior used by older demo tasks.

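The dispatch and mount layout described above might look roughly like the following sketch. It only assembles a `docker run` argument list using standard flags (`--rm`, `-v`, `-e`). The container-side paths under `/emboflow/...`, the choice to mount one exchange directory rather than two individual files, and the function names are assumptions for illustration, not the worker's actual layout.

```python
from typing import Dict, List


def resolve_execution_mode(executor_type: str, executor_config: Dict[str, str]) -> str:
    """Pick the execution path: a real container only when an image is configured."""
    if executor_type != "docker":
        return executor_type
    return "docker" if executor_config.get("image") else "simulated"


def build_docker_command(image: str, work_dir: str, asset_paths: List[str]) -> List[str]:
    """Assemble a `docker run` argv with an exchange mount and read-only asset mounts."""
    command = [
        "docker", "run", "--rm",
        # Assumption: input.json and output.json live in one mounted exchange directory.
        "-v", f"{work_dir}:/emboflow/io",
        "-e", "EMBOFLOW_INPUT_PATH=/emboflow/io/input.json",
        "-e", "EMBOFLOW_OUTPUT_PATH=/emboflow/io/output.json",
    ]
    for index, path in enumerate(asset_paths):
        # The ":ro" suffix makes each bound asset path read-only inside the container.
        command += ["-v", f"{path}:/emboflow/assets/{index}:ro"]
    command.append(image)
    return command
```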
Custom Docker nodes follow the same runtime contract. The container reads the task snapshot and execution context from `EMBOFLOW_INPUT_PATH` and writes `{"result": ...}` JSON to `EMBOFLOW_OUTPUT_PATH`. If the node declares an asset-set output contract, it must return `result.assetIds` as a string array. Dockerfile-based custom nodes are built locally on first execution and then reused by tag.

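A custom node entrypoint that honors this contract could look like the sketch below. The two environment variables and the `{"result": {"assetIds": [...]}}` output shape come from the contract above. The input layout (`context.assetIds`) is an assumption, since the snapshot schema is not documented here, and the "deduplicate and sort" step is only a placeholder for real node logic.

```python
import json
import os
from typing import Mapping, Optional


def run_node(env: Optional[Mapping[str, str]] = None) -> None:
    """Read the task input, compute an asset set, and write the result JSON."""
    env = os.environ if env is None else env
    with open(env["EMBOFLOW_INPUT_PATH"], encoding="utf-8") as handle:
        payload = json.load(handle)

    # Assumption: upstream asset ids are found under payload["context"]["assetIds"].
    incoming = payload.get("context", {}).get("assetIds", [])

    # Placeholder logic: deduplicate and sort. Ids must be returned as strings.
    asset_ids = sorted({str(asset_id) for asset_id in incoming})

    with open(env["EMBOFLOW_OUTPUT_PATH"], "w", encoding="utf-8") as handle:
        json.dump({"result": {"assetIds": asset_ids}}, handle)


# Run only when the worker has provided the exchange paths.
if __name__ == "__main__" and "EMBOFLOW_INPUT_PATH" in os.environ:
    run_node()
```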
When a node uses the built-in Python path without a custom hook, `source-asset` emits bound asset metadata from Mongo-backed asset records, and `validate-structure` performs a real directory validation pass against local source paths. On the current sample path `/Users/longtaowu/workspace/emboldata/data`, that validation reports `valid=false`, `videoFileCount=407`, and missing delivery files because the sample root is a mixed dataset collection rather than a delivery package.

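To make the shape of such a validation pass concrete, here is a minimal sketch. It is not the built-in `validate-structure` implementation: the set of video extensions, the `missingFiles` key, and the caller-supplied `required_files` list are assumptions, because the required delivery file names are not listed in this README.

```python
from pathlib import Path
from typing import Dict, List, Union

# Assumption: which suffixes count as video files is a guess for this sketch.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi", ".mkv"}


def validate_structure(root: Union[str, Path], required_files: List[str]) -> Dict[str, object]:
    """Count video files under `root` and report required files that are absent."""
    root_path = Path(root)
    video_count = sum(
        1
        for path in root_path.rglob("*")
        if path.is_file() and path.suffix.lower() in VIDEO_SUFFIXES
    )
    # Required files are resolved relative to the root of the delivery package.
    missing = [name for name in required_files if not (root_path / name).is_file()]
    return {
        "valid": not missing,
        "videoFileCount": video_count,
        "missingFiles": missing,  # hypothetical key name
    }
```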
The worker also carries direct upstream task results into execution context so set-operation utility nodes can compute narrowed asset sets and pass those effective asset ids to downstream tasks.

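A set-operation node working on upstream results could be sketched as below. The node type names come from this README. The function name, the list-of-lists input shape, and the reading of `difference-assets` as "first upstream minus all others" are assumptions for illustration.

```python
from typing import List


def combine_asset_sets(node_type: str, upstream: List[List[str]]) -> List[str]:
    """Combine upstream asset id lists, preserving first-seen order."""
    if not upstream:
        return []
    first, rest = upstream[0], upstream[1:]
    if node_type == "union-assets":
        # dict.fromkeys deduplicates while keeping insertion order.
        return list(dict.fromkeys(asset_id for ids in upstream for asset_id in ids))
    if node_type == "intersect-assets":
        other_sets = [set(ids) for ids in rest]
        return [a for a in dict.fromkeys(first) if all(a in s for s in other_sets)]
    if node_type == "difference-assets":
        # Assumption: keep ids from the first upstream that appear in no other upstream.
        removed = set().union(*rest)
        return [a for a in dict.fromkeys(first) if a not in removed]
    raise ValueError(f"unsupported set node type: {node_type}")
```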
## Repository Structure

- `apps/api` contains the control-plane modules for workspaces, assets, workflows, runs, and artifacts.
- `apps/web` contains the React shell, asset workspace, workflow editor surface, run detail view, and explore renderers.
- `apps/worker` contains the Mongo-backed worker runtime, task runner, and executor contracts.
- `design/` contains the architecture and product design documents that must stay aligned with implementation.
- `docs/` contains workflow guidance and the executable implementation plan.

## Developer Workflow

1. Read the relevant design files under `design/` before editing code.
2. Implement code and update impacted docs in the same change set.
3. Use English-only commit messages with a gitmoji prefix.
4. Run `make test` and `make guardrails` before pushing changes.

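The commit message rule in step 3 can be approximated with a small check like the one below. This is a hypothetical sketch and not the logic in `scripts/install_hooks.sh`: it accepts either a `:shortcode:` or a leading Unicode emoji as the gitmoji prefix, and it treats "English-only" as "ASCII-only", which is a simplification.

```python
import re
from typing import List

# Matches a shortcode prefix such as ":sparkles: " followed by the subject text.
SHORTCODE_PREFIX = re.compile(r"^:[a-z0-9_+-]+:\s+(?P<rest>\S.*)$")


def check_commit_subject(subject: str) -> List[str]:
    """Return a list of problems found in a commit subject line (empty if none)."""
    problems: List[str] = []
    match = SHORTCODE_PREFIX.match(subject)
    if match:
        rest = match.group("rest")
    else:
        head, _, rest = subject.partition(" ")
        # Treat a leading run of non-ASCII characters as a Unicode emoji prefix.
        if not head or not all(ord(ch) > 127 for ch in head):
            problems.append("missing gitmoji prefix")
            rest = subject
    if not rest.strip():
        problems.append("empty subject")
    # Approximation: ASCII-only text stands in for "English-only".
    if not rest.isascii():
        problems.append("non-ASCII text in subject")
    return problems
```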
For direct hook installation or reinstallation:

```bash
bash scripts/install_hooks.sh
```