# EmboFlow

EmboFlow is a browser/server (B/S) workflow platform for embodied data. It covers raw asset ingestion, delivery normalization, dataset transformation, workflow execution, preview, and export.

## Current V1 Features

- Project-scoped workspace shell with a dedicated Projects page and active project selector in the header
- Asset workspace that supports local asset registration, probe summaries, storage connection management, and dataset creation
- Workflow templates as first-class objects, including default project templates and creating project workflows from a template
- Blank workflow creation and a large React Flow editor with drag-and-drop nodes, free canvas movement, edge validation, Docker-first node runtime presets, and Python code-hook injection
- Workflow-level `Save As Template` so edited graphs can be promoted into reusable project templates
- Mongo-backed run orchestration, worker execution, run history, task detail, logs, stdout/stderr, artifacts, run cancel and retry, and task-level retry
- Chinese and English language switching at the runtime shell level

## Bootstrap

From the repository root:

```bash
make bootstrap
```

This installs workspace dependencies and runs `scripts/install_hooks.sh` so local commit and push guardrails are active.

## Local Commands

Run the full repository test suite:

```bash
make test
```

Run the strict repository guardrails:

```bash
make guardrails
```

Start package-level development entrypoints:

```bash
make dev-api
make dev-web
make dev-worker
```

## Local Deployment

Start MongoDB and MinIO:

```bash
make infra-up
```

Start the API, web app, and worker in separate terminals:

```bash
make serve-api
make serve-web
make serve-worker
```

The default local stack uses:

- API: `http://127.0.0.1:3001`
- Web: `http://127.0.0.1:3000`
- Worker: Mongo polling loop with `WORKER_POLL_INTERVAL_MS=1000`
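As a rough illustration of what a polling loop driven by `WORKER_POLL_INTERVAL_MS` might look like, here is a minimal Python sketch. It is not the worker's actual implementation: `claim_next_task` and `handle_task` are hypothetical callables standing in for the Mongo claim query and the task runner, and the loop is bounded by `max_iterations` only so the example terminates.

```python
import os
import time
from typing import Callable, Mapping, Optional


def poll_interval_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Read WORKER_POLL_INTERVAL_MS (default 1000) and convert it to seconds."""
    env = os.environ if env is None else env
    raw = env.get("WORKER_POLL_INTERVAL_MS", "1000")
    return max(int(raw), 0) / 1000.0


def run_poll_loop(
    claim_next_task: Callable[[], Optional[dict]],  # hypothetical: returns a task or None
    handle_task: Callable[[dict], None],            # hypothetical: executes one task
    max_iterations: int,
    sleep: Callable[[float], None] = time.sleep,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Poll for tasks; sleep for the configured interval whenever the queue is empty."""
    interval = poll_interval_seconds(env)
    handled = 0
    for _ in range(max_iterations):
        task = claim_next_task()
        if task is None:
            sleep(interval)
            continue
        handle_task(task)
        handled += 1
    return handled
```

Injecting `sleep` and `env` keeps the loop testable without real delays or process-level environment changes.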

### Local Data Validation

The local validation path currently used for embodied data testing is:

```text
/Users/longtaowu/workspace/emboldata/data
```

Register that directory from the Assets page or via `POST /api/assets/register`. The workflow editor requires at least one registered asset to be selected before a run can be created.
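For scripted registration, a request against `POST /api/assets/register` could be built along these lines. This is a sketch only: the request body schema is not documented here, so the `sourcePath` field name is a hypothetical placeholder, and the base URL assumes the default local API address listed above.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:3001"  # default local API address


def build_register_request(source_path: str, base_url: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for asset registration."""
    # `sourcePath` is a hypothetical field name; check the API's actual schema.
    body = json.dumps({"sourcePath": source_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/assets/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send the request (requires the API to be running):
# with urllib.request.urlopen(build_register_request("/path/to/data")) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```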

The runtime web shell exposes a visible `中文 / English` language toggle. The core workspace shell and workflow authoring surface are translated through a lightweight i18n layer. The shell also provides a dedicated Projects page and an active project selector, so assets, datasets, workflow templates, workflows, and runs all switch together at the project boundary.

The Assets workspace includes first-class storage connections and datasets. A dataset is distinct from a raw asset: it binds project source assets to a selected local or object-storage-backed destination.

The Workflows workspace includes a template gallery. Projects can start from default or saved templates, or create a blank workflow directly. The editor's center panel is a draggable node canvas with zoom, pan, mini-map, dotted background, handle-based edge creation, persisted node positions, and localized validation feedback, replacing the earlier static list of node cards. The right panel edits per-node runtime settings and Python hooks, and can save the current workflow draft as a reusable workflow template. Per-node runtime config is persisted in workflow versions, including executor overrides, optional artifact title overrides, and Python code-hook source for inspect-style and transform-style nodes.

The node library supports both click-to-append and drag-and-drop placement onto the canvas. V1 connection rules block self-edges, duplicate edges, cycles, incoming edges into source nodes, outgoing edges from export nodes, and multiple upstream edges into ordinary nodes, while allowing multi-input set nodes such as `union-assets`, `intersect-assets`, and `difference-assets`.
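The V1 connection rules listed above can be sketched as a single validation function. This is an illustrative Python model, not the editor's actual code: the `Node` shape, the `category` labels, the rejection reason strings, and the order of the checks are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (source node id, target node id)

# Multi-input set nodes named in the text above.
MULTI_INPUT_DEFINITIONS = {"union-assets", "intersect-assets", "difference-assets"}


@dataclass(frozen=True)
class Node:
    definition_id: str  # e.g. "source-asset" or "union-assets"
    category: str       # hypothetical labels: "source", "export", or "ordinary"


def _reaches(edges: List[Edge], start: str, goal: str) -> bool:
    """Return True if `goal` is reachable from `start` along existing edges."""
    stack, seen = [start], set()
    while stack:
        current = stack.pop()
        if current == goal:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(target for source, target in edges if source == current)
    return False


def check_edge(nodes: Dict[str, Node], edges: List[Edge], new_edge: Edge) -> Optional[str]:
    """Return a rejection reason, or None when the edge is allowed."""
    source, target = new_edge
    if source == target:
        return "self-edge"
    if new_edge in edges:
        return "duplicate-edge"
    if nodes[target].category == "source":
        return "source-has-no-inputs"
    if nodes[source].category == "export":
        return "export-has-no-outputs"
    has_upstream = any(t == target for _, t in edges)
    if has_upstream and nodes[target].definition_id not in MULTI_INPUT_DEFINITIONS:
        return "multiple-upstream"
    # Adding source -> target closes a cycle if target already reaches source.
    if _reaches(edges, target, source):
        return "cycle"
    return None
```

Returning a reason string rather than a boolean makes it straightforward to map each rejection to a localized validation message.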

The Runs workspace shows project-scoped run history, run-level aggregated summaries, cancel/retry controls, and run detail views with persisted task summaries, stdout/stderr sections, result previews, and artifact links into Explore. Selected run tasks expose the frozen node definition id, executor config snapshot, and code-hook metadata captured when the run was created.

Most built-in delivery nodes default to `executorType=docker`. When a node uses `executorType=docker` and provides `executorConfig.image`, the worker runs a real local Docker container with mounted `input.json` / `output.json` exchange files plus read-only mounts for bound asset paths. If no image is configured, the executor falls back to the lightweight simulated behavior used by older demo tasks.
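The Docker path described above could translate into a `docker run` invocation roughly as follows. This sketch only builds the argument list and does not start a container. The in-container mount points under `/emboflow/` are hypothetical, and returning `None` is simply how this example signals the fallback to the simulated executor.

```python
from typing import Dict, List, Optional


def build_docker_command(
    executor_config: Dict[str, str],
    input_path: str,
    output_path: str,
    asset_paths: List[str],
) -> Optional[List[str]]:
    """Build a `docker run` argv, or return None when no image is configured."""
    image = executor_config.get("image")
    if not image:
        return None  # caller would fall back to the simulated executor

    command = [
        "docker", "run", "--rm",
        # Exchange files; the container-side paths are assumptions.
        "-v", f"{input_path}:/emboflow/input.json",
        "-v", f"{output_path}:/emboflow/output.json",
    ]
    # Bound asset paths are mounted read-only (the `:ro` suffix).
    for index, path in enumerate(asset_paths):
        command += ["-v", f"{path}:/emboflow/assets/{index}:ro"]
    command.append(image)
    return command
```

The resulting list can be passed to `subprocess.run` without a shell, which avoids quoting problems with asset paths that contain spaces.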
When a node uses the built-in Python path without a custom hook, `source-asset` emits bound asset metadata from Mongo-backed asset records, and `validate-structure` performs a real directory validation pass against local source paths. On the current sample path `/Users/longtaowu/workspace/emboldata/data`, that validation reports `valid=false`, `videoFileCount=407`, and missing delivery files, because the sample root is a mixed dataset collection rather than a delivery package.
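A directory validation pass of this kind might look like the sketch below. The `valid` and `videoFileCount` keys come from the report described above; the `missingFiles` key, the video extension list, and the required file name `meta.json` are hypothetical stand-ins, since the real delivery package layout is not specified here.

```python
import os
from typing import Dict, Sequence

VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".mkv")  # assumption
REQUIRED_FILES = ("meta.json",)  # hypothetical required delivery file


def validate_structure(root: str, required_files: Sequence[str] = REQUIRED_FILES) -> Dict:
    """Count video files under `root` and check for required top-level files."""
    video_count = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        video_count += sum(
            1 for name in filenames if name.lower().endswith(VIDEO_EXTENSIONS)
        )
    missing = [
        name for name in required_files
        if not os.path.isfile(os.path.join(root, name))
    ]
    return {
        "valid": not missing,
        "videoFileCount": video_count,
        "missingFiles": missing,  # hypothetical field name
    }
```

Under these assumptions, a mixed dataset collection with many videos but no delivery files at the root would report a high `videoFileCount` together with `valid` set to false.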
The worker also carries direct upstream task results into the execution context, so set-operation utility nodes can compute narrowed asset sets and pass those effective asset ids to downstream tasks.
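The set-operation step can be illustrated with a small Python sketch. It assumes each direct upstream result has already been reduced to a list of asset ids, and that `difference-assets` means "ids in the first upstream that appear in none of the others". Both are assumptions for the example rather than documented behavior.

```python
from typing import List


def _dedupe(ids: List[str]) -> List[str]:
    """Remove duplicates while preserving first-seen order."""
    return list(dict.fromkeys(ids))


def effective_asset_ids(node_type: str, upstream_asset_ids: List[List[str]]) -> List[str]:
    """Combine upstream asset id lists according to the set-operation node type."""
    if not upstream_asset_ids:
        return []
    first = _dedupe(upstream_asset_ids[0])
    rest = [set(ids) for ids in upstream_asset_ids[1:]]

    if node_type == "union-assets":
        return _dedupe([asset for ids in upstream_asset_ids for asset in ids])
    if node_type == "intersect-assets":
        return [asset for asset in first if all(asset in ids for ids in rest)]
    if node_type == "difference-assets":
        # Assumption: first upstream minus every other upstream.
        return [asset for asset in first if not any(asset in ids for ids in rest)]
    raise ValueError(f"not a set-operation node: {node_type}")
```

Preserving input order keeps the resulting asset list deterministic, which makes run results easier to compare across retries.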

## Repository Structure

- `apps/api` contains the control-plane modules for workspaces, assets, workflows, runs, and artifacts.
- `apps/web` contains the React shell, asset workspace, workflow editor surface, run detail view, and explore renderers.
- `apps/worker` contains the Mongo-backed worker runtime, task runner, and executor contracts.
- `design/` contains the architecture and product design documents that must stay aligned with implementation.
- `docs/` contains workflow guidance and the executable implementation plan.

## Developer Workflow

1. Read the relevant design files under `design/` before editing code.
2. Implement code and update impacted docs in the same change set.
3. Use English-only commit messages with a gitmoji prefix.
4. Run `make test` and `make guardrails` before pushing changes.

For direct hook installation or reinstallation:

```bash
bash scripts/install_hooks.sh
```