
Workflow Sets

A workflow set is a YAML / TOML / JSON manifest that names N runs and declares shared defaults, optional inter-run wiring via output aliases, and how the runs should execute (sequential vs. parallel, error semantics).

Sets are an operator/CLI ergonomic — they execute via fab-workflow --run-set FILE. There is no POST /v1/workflow-sets/runs endpoint; the HTTP API still submits individual workflow runs (see Submitting Jobs). Use sets when you want one declarative document to drive a multi-stage pipeline locally or in CI.

Terminal window
# Run the whole set
fab-workflow --run-set examples/workflow-sets/chained.yaml -o /tmp/out
# Run only a specific stage (its dependencies are auto-pulled)
fab-workflow --run-set examples/workflow-sets/chained.yaml short -o /tmp/out
# Validate + print the resolution plan without executing
fab-workflow --run-set examples/workflow-sets/chained.yaml --dry-run
# Enumerate stages and their dependencies
fab-workflow --run-set examples/workflow-sets/chained.yaml --list
# Run independent stages concurrently (deps still respected)
fab-workflow --run-set examples/workflow-sets/chained.yaml --parallel
# Reuse persisted upstream outputs from a prior run instead of re-running them
fab-workflow --run-set examples/workflow-sets/chained.yaml short \
  -o /tmp/out --use-prior-outputs research --use-prior-outputs hooks

# yaml-language-server: $schema=../../schemas/workflow-set.schema.json
set: my-batch                       # required — set identifier
description: Optional human description.
shared_input:                       # merged INTO each run's input (run-level wins)
  quality: cheap

runs:
  # 1. Inline run.
  - name: research
    workflow: global/deep-research
    input: { query: "AI productivity" }

  # 2. Run with inter-run wiring.
  - name: hooks
    workflow: hooks/generate
    inputs_from:
      # source = "<run>.<variant>.<alias>" (variant defaults to 0).
      # The alias resolves against the upstream's declared
      # `output_aliases` or, if none declared, its top-level output
      # fields. Renaming an alias is the only thing that breaks
      # downstream chains.
      - { source: research.0.themes, map: themes, optional: true }
    input: { niche: "AI productivity" }

  # 3. Reference an existing run-spec file (single run inlined).
  - from: ../run-inputs/talking-head-builder-smoke.yaml

  # 4. Reference another workflow-set file. All its runs are inlined,
  #    optionally namespaced under a prefix and filtered with `select:`.
  - include: ./suites/short-form-suite.yaml
    prefix: shortform               # bare string; joined to run names with `.`
    select: [ai-shorts, hooks]      # optional — filter included runs

parallel: false                     # default; true = layered concurrent
on_error: stop                      # stop | continue

# Optional set-level outputs — public aliases parents reference via
# `<prefix>.outputs.<alias>` after `include:`.
outputs:
  primary_hook: hooks.0.hook
  final_video: short.0.video_path

The # yaml-language-server: $schema= magic comment activates editor autocomplete and validation in VS Code (with the YAML extension) and JetBrains. The schema is generated from the WorkflowSet Pydantic model — runtime and editor validation stay in lockstep.
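
As a rough illustration of how that lockstep works, a generator along these lines could dump the model's JSON Schema to disk. This is a hedged sketch assuming WorkflowSet is a Pydantic v2 model exported from workflow_set.py; it is not the actual contents of scripts/generate-workflow-set-schema.py.

# Hypothetical sketch of a schema generator; the real script is
# scripts/generate-workflow-set-schema.py and may differ.
import json
from pathlib import Path

from fabric_workflow_sdk.workflow_set import WorkflowSet  # assumed import path

schema = WorkflowSet.model_json_schema()  # Pydantic v2: model -> JSON Schema dict
target = Path("schemas/workflow-set.schema.json")
target.write_text(json.dumps(schema, indent=2) + "\n")
print(f"wrote {target}")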

The selection model is positional: name the stages to run after the manifest file. With no positional arguments, the whole set runs (the default).

Terminal window
# Default — runs the entire set
fab-workflow --run-set chained.yaml
# Same, explicit
fab-workflow --run-set chained.yaml --all
# Run only `short`. `research` and `hooks` are auto-included as deps.
fab-workflow --run-set chained.yaml short
# Run multiple stages
fab-workflow --run-set chained.yaml hooks short
# Run a single stage WITHOUT its upstreams (debugging only —
# `inputs_from` will fail unless you also pass --use-prior-outputs).
fab-workflow --run-set chained.yaml short --no-deps

--all and positional stage names are mutually exclusive — pass one or the other.
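
Dependency closure is just a transitive walk over each selected stage's needs:. A minimal sketch of that closure step, assuming the graph is already in hand (function and variable names are illustrative, not the SDK's API):

def dependency_closure(selected: list[str], needs: dict[str, list[str]]) -> set[str]:
    """Return the selected stages plus everything they transitively need."""
    closure: set[str] = set()
    stack = list(selected)
    while stack:
        run = stack.pop()
        if run in closure:
            continue
        closure.add(run)
        stack.extend(needs.get(run, []))
    return closure

# With needs = {"short": ["hooks"], "hooks": ["research"], "research": []},
# dependency_closure(["short"], needs) == {"research", "hooks", "short"}.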

Output aliases (the public chaining surface)


Each WorkflowOutput subclass MAY declare a class-level output_aliases dict mapping a stable, semantic public name to a dot-path into the dumped output dict:

workflows/hooks/generate.py
class HookGenerationOutput(WorkflowOutput):
    hook_ideas: list[dict] = Field(default_factory=list, ...)

    output_aliases = {
        "hook": "hook_ideas.0.hook",  # primary hook for chaining
        "all_hooks": "hook_ideas",
    }

Workflows that don’t declare any get implicit aliases — every top-level field on the output model is auto-aliased to itself (foo → "foo"). Aliases are serialized into workflows/registry.lock.json next to each workflow’s output_schema.
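
Resolving an alias is a dot-path walk through the dumped output dict, with numeric segments treated as list indices. A small sketch of that lookup (illustrative only, not the SDK's actual resolver):

from typing import Any

def resolve_dot_path(output: dict[str, Any], path: str) -> Any:
    """Walk a path like 'hook_ideas.0.hook' through nested dicts and lists."""
    value: Any = output
    for segment in path.split("."):
        value = value[int(segment)] if isinstance(value, list) else value[segment]
    return value

dumped = {"hook_ideas": [{"hook": "Stop doing X"}, {"hook": "Try Y"}]}
assert resolve_dot_path(dumped, "hook_ideas.0.hook") == "Stop doing X"
assert resolve_dot_path(dumped, "hook_ideas") == dumped["hook_ideas"]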

Terminal window
# Print the input + output schemas (already shipped)
fab-workflow --lint hooks/generate
# Print only the output section: declared aliases + their resolved
# paths + the underlying schema for reference
fab-workflow --scaffold hooks/generate --show-output

Example output of --show-output:

workflow: hooks/generate
# Output aliases (use these in workflow-set `inputs_from`):
# hook → hook_ideas.0.hook
# all_hooks → hook_ideas
# Underlying output schema (for reference):
# kind: string — Variant card shape: video / carousel / image / text.
# hook_ideas: array — Generated hook ideas
# workflow: string — Workflow identifier

Two patterns pull other files into a set:

  • from: on a WorkflowSetRun — point at a single run-spec file (see examples/run-inputs/). The single run is inlined at that position.
  • include: on a WorkflowSetRun — point at another workflow-set file. All of its runs are inlined at that position with their wiring intact.

runs:
  # Pull in another set's runs, namespaced under "shortform":
  - include: ./suites/short-form-suite.yaml
    prefix: shortform
    select: [ai-shorts, hooks]      # optional filter

  - name: post-summary
    workflow: global/summarize-runs
    inputs_from:
      # The included set's `ai-shorts` is now `shortform.ai-shorts`:
      - { source: shortform.ai-shorts.0.video_path, map: video }

Rules:

  • prefix: is a bare string (no trailing separator) joined with . to included run names. The source resolver does longest-prefix matching so prefixed names containing dots (shortform.ai-shorts) parse unambiguously (see the sketch after this list). Without a prefix, name collisions across the parent and the include error at load time.
  • select: filters the included set’s runs by name (transitive needs: auto-pulled).
  • shared_input merging: parent’s shared_input ∪ included’s shared_input (parent wins on conflicts).
  • Recursion: includes can include. Cycles detected at load time; bounded depth (4) catches accidental fan-out.
  • Path resolution: include: and from: paths are relative to the file that mentions them, not the CWD.
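
Because prefixed run names themselves contain dots, a source like shortform.ai-shorts.0.video_path cannot be split naively on ".". A hedged sketch of longest-prefix matching against the known run names (the real parser in workflow_set.py may differ):

def split_source(source: str, run_names: set[str]) -> tuple[str, int, str]:
    """Split '<run>.<variant>.<alias>' where <run> may itself contain dots.

    Prefer the longest known run name that prefixes the source; the variant
    segment is optional and defaults to 0.
    """
    for run in sorted(run_names, key=len, reverse=True):
        if source == run or source.startswith(run + "."):
            rest = source[len(run):].lstrip(".")
            parts = rest.split(".", 1) if rest else []
            if parts and parts[0].isdigit():
                variant = int(parts[0])
                alias = parts[1] if len(parts) > 1 else ""
            else:
                variant, alias = 0, rest
            return run, variant, alias
    raise ValueError(f"unknown run in source: {source!r}")

runs = {"research", "hooks", "shortform.ai-shorts"}
assert split_source("shortform.ai-shorts.0.video_path", runs) == ("shortform.ai-shorts", 0, "video_path")
assert split_source("research.themes", runs) == ("research", 0, "themes")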

When you write inputs_from:

- { source: hooks.0.hook, map: topic, optional: false }

The runner:

  1. Splits the source as <run>.<variant>.<alias> (variant defaults to 0).
  2. Looks up the upstream run’s resolved alias map (built from its declared or implicit output_aliases).
  3. If the alias resolves: assigns the value to map on the downstream run’s input.
  4. If the alias is missing: fail-loud unless optional: true.
  5. needs: is inferred from inputs_from sources; write needs: explicitly only for barrier-only deps with no data flow.
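
Put together, steps 1–5 look roughly like the sketch below, reusing the split_source and resolve_dot_path helpers sketched earlier; the names and data shapes are assumptions, not the runner's actual code:

def apply_inputs_from(run_name, wirings, outputs, alias_maps):
    """outputs: (run, variant) -> dumped output dict; alias_maps: run -> {alias: dot-path}."""
    resolved, inferred_needs = {}, set()
    for w in wirings:  # each w is {"source": ..., "map": ..., "optional": bool}
        upstream, variant, alias = split_source(w["source"], set(alias_maps))       # step 1
        inferred_needs.add(upstream)                                                # step 5
        try:
            path = alias_maps[upstream][alias]                                      # step 2
            resolved[w["map"]] = resolve_dot_path(outputs[(upstream, variant)], path)  # step 3
        except (KeyError, IndexError):
            if not w.get("optional", False):                                        # step 4: fail-loud
                raise RuntimeError(f"{run_name}: cannot resolve {w['source']}")
    return resolved, inferred_needs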

Effective input merge order, lowest → highest priority:

  1. shared_input
  2. resolved inputs_from
  3. inline input: block (static values win)
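
In plain dict terms the merge is three overlays with later keys winning. A toy illustration of the precedence (the values here are made up):

shared_input = {"quality": "cheap"}
resolved_inputs_from = {"topic": "Stop doing X"}           # e.g. pulled from hooks.0.hook
inline_input = {"duration_secs": 30, "topic": "Override"}  # the run's own input: block

effective_input = {**shared_input, **resolved_inputs_from, **inline_input}
# {"quality": "cheap", "topic": "Override", "duration_secs": 30}  (the inline value wins)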

Cycles in the inferred needs: graph are rejected at load time with the cycle path printed.

run_workflow_set topologically sorts the resolved runs from the inferred dependency graph. Declared file order is the tiebreaker for independent runs.

  • parallel: false (default) — topo order, sequential.
  • parallel: true — Kahn-style layered execution: each layer is a set of runs whose deps are all satisfied; layers run with asyncio.gather (bounded concurrency, default 4); the next layer starts when the previous fully completes.
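
A hedged sketch of that layered strategy, assuming an async execute(name, spec) callable; the real run_workflow_set internals may differ:

import asyncio

async def run_layered(runs: dict, needs: dict, execute, max_concurrency: int = 4):
    """runs: name -> spec; needs: name -> upstream names; execute: async callable."""
    sem = asyncio.Semaphore(max_concurrency)
    done: set[str] = set()

    async def bounded(name):
        async with sem:  # bounded concurrency inside each layer
            return name, await execute(name, runs[name])

    remaining = set(runs)
    while remaining:
        # Kahn-style layer: every remaining run whose deps are all satisfied.
        layer = [n for n in remaining if set(needs.get(n, [])) <= done]
        if not layer:
            raise RuntimeError("dependency cycle among: " + ", ".join(sorted(remaining)))
        for name, _result in await asyncio.gather(*(bounded(n) for n in layer)):
            done.add(name)
        remaining -= done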

Outputs from each completed run are kept in memory (with the alias map computed on top) so downstream inputs_from resolves without round-tripping to disk. Full results are also persisted to <output_dir>/<run.name>/output.json when -o is given.

on_error: stop (default) halts the whole set on the first failure. on_error: continue records failures into SetResult.failures and keeps going.

--dry-run validates the manifest, runs the dep-closure step, and prints the merged effective input dict + inputs_from arrows for each stage without executing anything. Answers “what would actually run?” without spending API budget:

Resolution plan for chained-batch (3 runs):
research (global/deep-research)
  deps: —
  input: {"quality": "cheap", "query": "AI productivity tools", "depth": 3}
hooks (hooks/generate)
  deps: research
  input: {"quality": "cheap", "niche": "AI productivity"}
  ← inputs_from: research.0.themes → themes
short (video/quick-shorts)
  deps: hooks
  input: {"quality": "cheap", "duration_secs": 30}
  ← inputs_from: hooks.0.hook → topic

When a stage fails inside a set, the runner prints the exact retry command:

✓ research (12.3s)
✓ hooks (4.1s)
✗ short FAILED — VideoGenerationError: timeout
To retry just the failed stage (upstream outputs reused from /tmp/out):
fab-workflow --run-set chained.yaml short --output /tmp/out --use-prior-outputs <upstream>

--use-prior-outputs STAGE (repeatable) reads <output_dir>/<stage>/output.json from a prior run and feeds it into the alias resolver instead of re-executing. Late-stage iteration without re-paying for early stages.
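
Mechanically that is little more than loading the persisted JSON and seeding the in-memory results before resolution starts. A small sketch under the output layout described above (the function name is illustrative):

import json
from pathlib import Path

def load_prior_outputs(output_dir: str, stages: list[str]) -> dict[str, dict]:
    """Read <output_dir>/<stage>/output.json for each --use-prior-outputs stage."""
    prior: dict[str, dict] = {}
    for stage in stages:
        path = Path(output_dir) / stage / "output.json"
        prior[stage] = json.loads(path.read_text())
    return prior

# load_prior_outputs("/tmp/out", ["research", "hooks"]) feeds the alias resolver
# exactly as if those runs had just executed.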

Terminal window
# Emit a starter manifest with the named workflows pre-wired as runs.
# `inputs_from` is left empty for the operator to fill in.
fab-workflow --scaffold-set hooks/generate video/quick-shorts > my-set.yaml
fab-workflow --scaffold-set hooks/generate video/quick-shorts -o my-set.toml --format toml

References:

  • Plan: specs/plans/078-workflow-scaffolding-and-sets.md
  • Examples: examples/workflow-sets/smoke.yaml, chained.yaml, plus the README.md format reference.
  • Schema: schemas/workflow-set.schema.json — generated by scripts/generate-workflow-set-schema.py.
  • Source: sdks/fabric-workflow-sdk/fabric_workflow_sdk/workflow_set.py (models, loader, runner) and sdks/fabric-workflow-sdk/fabric_workflow_sdk/scaffold_run_spec.py (template renderer).
  • CLI plumbing: workflows/cli.py (_scaffold_command, _scaffold_set_command, _run_set_command).
  • Tests: sdks/fabric-workflow-sdk/tests/test_scaffold_run_spec.py and sdks/fabric-workflow-sdk/tests/test_workflow_set.py — 38 cases covering scaffolder round-trips, source parsing with longest-prefix matching, dependency inference + topo sort, cycle detection, selection, alias resolution, cross-file include + prefix rewriting, sequential + parallel execution, on_error semantics, and --use-prior-outputs.

Future work (not yet implemented):

  • POST /v1/workflow-sets/runs (server-side set execution) — if/when the dashboard wants batched runs. Today, sets are CLI-only.
  • --save-set symmetric writer — capture a sequence of completed runs as a fresh set manifest.
  • Workflow-version pinning in scaffolded files (# fabric: video/ai-shorts v0.5.1 magic comment).
  • Rust binary integration (fabric run-set FILE).