EmboFlow/docs/plans/2026-03-30-project-dataset-template-design.md

# EmboFlow Project Dataset Template Design
## Goal
Define the next V1 product slice that turns the current runtime skeleton into a project-centric data workflow console with first-class datasets, storage connections, and workflow templates.
## Approved Boundary
- `Asset` remains the raw input object
- `Dataset` becomes a project-scoped first-class object
- `StorageConnection` defines the persistence target that each dataset selects
- `WorkflowTemplate` becomes the reusable authoring entrypoint for workflows
## Current Implementation Baseline
The current codebase already has:
- Mongo-backed `storage_connections`, `datasets`, `dataset_versions`, and `workflow_templates`
- HTTP endpoints for creating and listing those objects
- an asset page that already exposes storage connection and dataset creation forms
- a workflow editor with a large React Flow canvas, node drag and drop, edge creation, and Python code-hook editing
- workflow creation from blank definitions
The missing layer is product integration:
- project switching and project creation in the main shell
- a visible project workspace instead of a fixed bootstrap project
- workflow template selection on the workflows landing page
- template-based workflow creation as a first-class action
- saving an edited workflow as a reusable template
## Product Model
### Workspace
The workspace owns:
- projects
- storage connections
- workspace-scoped workflow templates
### Project
The project owns:
- assets
- datasets
- workflow definitions
- workflow runs
- project-scoped workflow templates
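
The two template scopes above can be sketched as a single record type with an optional project reference. This is a minimal illustration in Python, not the actual schema: the field names and the rule that a project sees workspace-scoped templates plus its own are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WorkflowTemplate:
    """Hypothetical template record; field names are illustrative only."""
    id: str
    name: str
    workspace_id: str
    # None marks a workspace-scoped template; otherwise the owning project.
    project_id: Optional[str] = None


def templates_visible_in_project(
    templates: List[WorkflowTemplate], workspace_id: str, project_id: str
) -> List[WorkflowTemplate]:
    """Return workspace-scoped templates plus the given project's own templates."""
    return [
        t
        for t in templates
        if t.workspace_id == workspace_id and t.project_id in (None, project_id)
    ]
```

Keeping both scopes in one record type would let the workflows landing page list them with a single query and filter, at the cost of a nullable field.
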
### Asset vs Dataset
- `Asset` is the raw import or registered source
- `Dataset` is the reusable project data product
- A dataset references one or more source assets and one storage connection
- Dataset versions remain immutable snapshots under the dataset
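
The dataset rules above (one or more source assets, exactly one storage connection, immutable versions) can be expressed as a small data model. The sketch below is an assumption-laden illustration in Python: the class and field names are hypothetical and do not describe the existing Mongo documents.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class DatasetVersion:
    """Immutable snapshot under a dataset (frozen, so fields cannot be reassigned)."""
    version: int
    asset_ids: Tuple[str, ...]


@dataclass
class Dataset:
    """Project-scoped data product: one or more source assets, one storage connection."""
    id: str
    project_id: str
    storage_connection_id: str
    source_asset_ids: List[str]
    versions: List[DatasetVersion] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the two reference rules at construction time.
        if not self.source_asset_ids:
            raise ValueError("a dataset needs at least one source asset")
        if not self.storage_connection_id:
            raise ValueError("a dataset needs a storage connection")

    def snapshot(self) -> DatasetVersion:
        """Append a new immutable version capturing the current source assets."""
        version = DatasetVersion(
            version=len(self.versions) + 1,
            asset_ids=tuple(self.source_asset_ids),
        )
        self.versions.append(version)
        return version
```

Storing the asset ids as a tuple inside a frozen dataclass means an earlier version is unaffected when the dataset's source list changes later.
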
## UX Changes
### Header
The header should expose:
- workspace name
- active project selector
- quick-create project action
- language switcher
### Projects Page
Add a dedicated projects page to:
- list existing projects
- create a new project
- switch the active project
- show lightweight counts for assets, datasets, workflows, and runs
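
The lightweight counts could be derived by filtering each record kind on its project reference. The following is a minimal in-memory sketch in Python, assuming every record carries a `project_id` field; the function name and the record shape are hypothetical, and a real implementation would more likely use count queries against the store.

```python
from typing import Dict, Iterable, Mapping

# The four kinds listed for the projects page.
COUNTED_KINDS = ("assets", "datasets", "workflows", "runs")


def project_counts(
    project_id: str, records: Mapping[str, Iterable[Mapping[str, str]]]
) -> Dict[str, int]:
    """Count records of each kind that belong to one project (missing kinds count as 0)."""
    return {
        kind: sum(1 for r in records.get(kind, ()) if r.get("project_id") == project_id)
        for kind in COUNTED_KINDS
    }
```
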
### Assets Page
Keep the existing asset page as the project data hub:
- raw asset registration
- storage connection management
- dataset creation
- project asset list
### Workflows Page
Split the current workflows landing page into two clear entry paths:
- start from template
- start from blank workflow
Each template card should support:
- create workflow from template
- open the template-backed workflow after creation
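
Creating a workflow from a template amounts to copying the template's definition into a new project-owned workflow. A minimal sketch in Python, assuming templates and workflows are plain dictionaries with a `definition` key; the function name, the `source_template_id` field, and the dictionary shape are illustrative assumptions, not the existing API.

```python
import copy
from typing import Any, Dict


def create_workflow_from_template(
    template: Dict[str, Any], project_id: str, workflow_id: str
) -> Dict[str, Any]:
    """Build a new workflow whose definition is an independent copy of the template's."""
    return {
        "id": workflow_id,
        "project_id": project_id,
        "name": template["name"],
        # Hypothetical back-reference so the UI can show where the workflow came from.
        "source_template_id": template["id"],
        # Deep copy so later canvas edits never mutate the reusable template.
        "definition": copy.deepcopy(template["definition"]),
    }
```

The deep copy is the important design choice here: a shallow copy would share node and edge lists with the template, so editing the new workflow would silently change every future workflow created from it.
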
### Workflow Editor
Keep the large canvas and runtime configuration model, and add:
- save current workflow as template
- explicit template name and description inputs for that action
- no reduction in current node-level editing power
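
The save-as-template action is the inverse operation: it snapshots the edited definition under an explicit name and description. The sketch below is a hedged Python illustration; the function name, the required-name rule, and the record fields are assumptions rather than the actual endpoint contract.

```python
import copy
from typing import Any, Dict, Optional


def save_workflow_as_template(
    workflow: Dict[str, Any],
    name: str,
    description: str,
    template_id: str,
    project_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Snapshot a workflow definition as a reusable template record."""
    name = name.strip()
    if not name:
        # Assumed rule: the explicit name input must not be blank.
        raise ValueError("template name is required")
    return {
        "id": template_id,
        "name": name,
        "description": description.strip(),
        # None would mean workspace scope in this sketch; otherwise project scope.
        "project_id": project_id,
        # Copy so further edits to the workflow do not change the saved template.
        "definition": copy.deepcopy(workflow["definition"]),
    }
```
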
## Implementation Rules
- do not replace the current `Asset`-based run binding model in this slice
- do not move storage connection management to a different backend model
- do not introduce a new visual framework for the canvas
- reuse current Mongo collections and runtime store methods where possible
## Success Criteria
The slice is done when:
1. users can create and switch projects without restarting the app or resetting the bootstrap context
2. datasets are visibly project-scoped and backed by a chosen storage connection
3. workflows can be created either from a template or from a blank definition
4. edited workflows can be saved back as reusable templates
5. the canvas remains the primary authoring surface with runtime config and Python hook editing intact