EmboFlow/docs/plans/2026-03-30-project-dataset-template-design.md

EmboFlow Project Dataset Template Design

Goal

Define the next V1 product slice that turns the current runtime skeleton into a project-centric data workflow console with first-class datasets, storage connections, and workflow templates.

Approved Boundary

  • Asset remains the raw input object
  • Dataset becomes a project-scoped first-class object
  • StorageConnection becomes the object through which each dataset selects its persistence target
  • WorkflowTemplate becomes the reusable authoring entrypoint for workflows

Current Implementation Baseline

The current codebase already has:

  • Mongo-backed storage_connections, datasets, dataset_versions, and workflow_templates
  • HTTP endpoints for creating and listing those objects
  • an asset page that already exposes storage connection and dataset creation forms
  • a workflow editor with a large React Flow canvas, node drag and drop, edge creation, and Python code-hook editing
  • workflow creation from blank definitions

The missing layer is product integration:

  • project switching and project creation in the main shell
  • a visible project workspace instead of a fixed bootstrap project
  • workflow template selection on the workflows landing page
  • template-based workflow creation as a first-class action
  • saving an edited workflow as a reusable template

Product Model

Workspace

The workspace owns:

  • projects
  • storage connections
  • workspace-scoped workflow templates

Project

The project owns:

  • assets
  • datasets
  • workflow definitions
  • workflow runs
  • project-scoped workflow templates
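
As an illustration of the two template scopes above, the sketch below resolves which templates a project can see. The `WorkflowTemplate` fields and the convention that `project_id=None` marks a workspace-scoped template are assumptions made for this example, not the actual schema of the workflow_templates collection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WorkflowTemplate:
    """Hypothetical template record; field names are illustrative only."""
    template_id: str
    name: str
    workspace_id: str
    # Assumed convention: None means the template is workspace-scoped.
    project_id: Optional[str] = None


def templates_for_project(
    templates: List[WorkflowTemplate], workspace_id: str, project_id: str
) -> List[WorkflowTemplate]:
    """Return workspace-scoped templates plus those owned by the given project."""
    return [
        t
        for t in templates
        if t.workspace_id == workspace_id and t.project_id in (None, project_id)
    ]


catalog = [
    WorkflowTemplate("tpl-1", "Shared export", "ws-1"),
    WorkflowTemplate("tpl-2", "Project A cleanup", "ws-1", project_id="proj-a"),
    WorkflowTemplate("tpl-3", "Project B cleanup", "ws-1", project_id="proj-b"),
]
visible = templates_for_project(catalog, "ws-1", "proj-a")
```

Under these assumptions, project `proj-a` sees the shared template and its own template, but not the one owned by `proj-b`.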

Asset vs Dataset

  • Asset is the raw import or registered source
  • Dataset is the reusable project data product
  • A dataset references one or more source assets and one storage connection
  • Dataset versions remain immutable snapshots under the dataset
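
The relationships above can be sketched with plain dataclasses. This is a minimal model under stated assumptions: all class fields, the `snapshot` helper, and the integer version counter are hypothetical and do not describe the existing Mongo documents.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Asset:
    """Raw import or registered source (illustrative fields)."""
    asset_id: str
    project_id: str
    source_uri: str


@dataclass(frozen=True)
class DatasetVersion:
    """Immutable snapshot under a dataset; frozen=True rejects attribute assignment."""
    dataset_id: str
    version: int
    asset_ids: Tuple[str, ...]


@dataclass
class Dataset:
    """Project-scoped data product: one or more source assets, one storage connection."""
    dataset_id: str
    project_id: str
    storage_connection_id: str
    source_asset_ids: Tuple[str, ...]
    versions: List[DatasetVersion] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A dataset must reference at least one source asset.
        if not self.source_asset_ids:
            raise ValueError("a dataset needs at least one source asset")

    def snapshot(self) -> DatasetVersion:
        """Record the current asset set as a new immutable version."""
        version = DatasetVersion(
            self.dataset_id, len(self.versions) + 1, tuple(self.source_asset_ids)
        )
        self.versions.append(version)
        return version


dataset = Dataset("ds-1", "proj-1", "conn-1", ("asset-1", "asset-2"))
first = dataset.snapshot()
```

Freezing `DatasetVersion` is one simple way to express the "immutable snapshot" rule in code; a persistence layer would need its own guard against updates.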

UX Changes

Header

The header should expose:

  • workspace name
  • active project selector
  • quick create project action
  • language switcher

Projects Page

Add a dedicated projects page to:

  • list existing projects
  • create a new project
  • switch the active project
  • show lightweight counts for assets, datasets, workflows, and runs
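
The page behaviour above could be backed by state along the following lines. This is a hedged in-memory sketch: `ProjectShell`, its method names, and the generated `proj-N` ids are invented for illustration, and real counts would presumably come from the existing collections rather than a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def _empty_counts() -> Dict[str, int]:
    return {"assets": 0, "datasets": 0, "workflows": 0, "runs": 0}


@dataclass
class Project:
    project_id: str
    name: str
    # Lightweight per-kind counts only; full records are out of scope here.
    counts: Dict[str, int] = field(default_factory=_empty_counts)


class ProjectShell:
    """Hypothetical holder for the project list and the active project."""

    def __init__(self) -> None:
        self.projects: Dict[str, Project] = {}
        self.active_project_id: Optional[str] = None

    def create_project(self, name: str) -> Project:
        project = Project(project_id=f"proj-{len(self.projects) + 1}", name=name)
        self.projects[project.project_id] = project
        # The first project created becomes active by default.
        if self.active_project_id is None:
            self.active_project_id = project.project_id
        return project

    def switch_project(self, project_id: str) -> Project:
        if project_id not in self.projects:
            raise KeyError(project_id)
        self.active_project_id = project_id
        return self.projects[project_id]

    def list_projects(self) -> List[dict]:
        """Rows for the projects page: identity, active flag, and counts."""
        return [
            {
                "project_id": p.project_id,
                "name": p.name,
                "active": p.project_id == self.active_project_id,
                **p.counts,
            }
            for p in self.projects.values()
        ]


shell = ProjectShell()
shell.create_project("Demo")
second = shell.create_project("Robot arm")
second.counts["datasets"] = 3
shell.switch_project(second.project_id)
rows = shell.list_projects()
```

Switching only changes `active_project_id`, which mirrors the intent that changing projects should not require rebuilding the whole application context.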

Assets Page

Keep the existing asset page as the project data hub, covering:

  • raw asset registration
  • storage connection management
  • dataset creation
  • project asset list

Workflows Page

Split the current workflows landing page into two clear entry paths:

  • start from template
  • start from blank workflow

Each template card should support:

  • create workflow from template
  • open the template-backed workflow after creation
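
The two entry paths can be sketched as a pair of small constructors. The dict shapes, the `source_template_id` field, and both function names are assumptions for this example; the existing creation endpoints may structure workflows differently.

```python
import copy
import uuid
from typing import Optional


def create_blank_workflow(project_id: str, name: str) -> dict:
    """Start from an empty definition (assumed shape: nodes plus edges)."""
    return {
        "workflow_id": uuid.uuid4().hex,
        "project_id": project_id,
        "name": name,
        "source_template_id": None,
        "definition": {"nodes": [], "edges": []},
    }


def create_workflow_from_template(
    template: dict, project_id: str, name: Optional[str] = None
) -> dict:
    """Start from a template by copying its definition into a new workflow."""
    return {
        "workflow_id": uuid.uuid4().hex,
        "project_id": project_id,
        "name": name or template["name"],
        "source_template_id": template["template_id"],
        # Deep copy so later edits to the workflow never mutate the template.
        "definition": copy.deepcopy(template["definition"]),
    }


template = {
    "template_id": "tpl-1",
    "name": "Clean and export",
    "definition": {
        "nodes": [{"id": "n1", "hook": "def run(x):\n    return x"}],
        "edges": [],
    },
}
workflow = create_workflow_from_template(template, "proj-1")
workflow["definition"]["nodes"].append({"id": "n2"})
blank = create_blank_workflow("proj-1", "Scratch")
```

The returned `workflow_id` is what a template card would use to open the new workflow in the editor right after creation.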

Workflow Editor

Keep the large canvas and runtime configuration model, and add:

  • save current workflow as template
  • explicit template name and description inputs for that action
  • no reduction in current node-level editing power
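
A possible shape for the save-as-template action is shown below. It is a sketch, not the editor's actual implementation: the function name, the dict fields, and the rule that omitting `project_id` yields a workspace-scoped template are all assumptions.

```python
import copy
import uuid
from typing import Optional


def save_workflow_as_template(
    workflow: dict,
    name: str,
    description: str,
    project_id: Optional[str] = None,
) -> dict:
    """Build a template record from an edited workflow.

    The name and description are explicit inputs, as the editor requires.
    Assumed convention: project_id=None produces a workspace-scoped template.
    """
    if not name.strip():
        raise ValueError("template name must not be empty")
    return {
        "template_id": uuid.uuid4().hex,
        "name": name.strip(),
        "description": description.strip(),
        "project_id": project_id,
        # Copy the full definition so node-level settings and Python hook code
        # are carried over, and later workflow edits do not leak into the template.
        "definition": copy.deepcopy(workflow["definition"]),
    }


edited = {
    "workflow_id": "wf-1",
    "project_id": "proj-1",
    "definition": {
        "nodes": [{"id": "n1", "hook": "def run(x):\n    return x * 2"}],
        "edges": [],
    },
}
saved = save_workflow_as_template(
    edited, "Doubling pipeline", "Doubles every input value", project_id="proj-1"
)
edited["definition"]["nodes"][0]["hook"] = "def run(x):\n    return x"
```

Because the definition is copied wholesale, nothing about node-level editing is lost when a workflow is promoted to a template.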

Implementation Rules

  • do not replace the current Asset-to-run binding model in this slice
  • do not move storage connection management to a different backend model
  • do not introduce a new visual framework for the canvas
  • reuse current Mongo collections and runtime store methods where possible

Success Criteria

The slice is done when:

  1. users can create and switch projects without restarting the bootstrap context
  2. datasets are visibly project-scoped and backed by a chosen storage connection
  3. workflows can be created either from a template or from a blank definition
  4. edited workflows can be saved back as reusable templates
  5. the canvas remains the primary authoring surface with runtime config and Python hook editing intact