# Architecture Overview
Fabric is composed of two planes with distinct responsibilities, sharing a single Postgres database.
## Control Plane (`fabric serve`, ports 3001/3002)

Owns all canonical state and decision-making:
- Identity and principals
- Tenancy (organizations, teams, memberships, invitations)
- RBAC and effective permission computation
- Workflow definitions and orchestration state
- Provider routing decisions
- Cost estimation and policy enforcement
- Audit logs and system metadata
Exposes:
- HTTP REST API (axum, port 3001)
- gRPC API (tonic, port 3002)
- SSE event streaming
```sh
just run
# or: cargo run -p fabric
```

## Execution Plane (`fabric executor`)
Performs asynchronous work execution:
- Claims runnable workflow steps via Postgres (`SELECT ... FOR UPDATE SKIP LOCKED`)
- Executes steps through provider dispatch (Ollama, OpenAI, Anthropic, etc.)
- Updates step status and run context
- Handles retries, backoff, and timeouts
- Emits events for all state transitions
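The retry-with-backoff behavior can be sketched as a capped exponential delay. This is a hypothetical illustration, not fabric-executor's actual policy; `backoff_delay` and its parameters are assumptions:

```rust
// Hypothetical sketch of a retry schedule: capped exponential backoff.
// fabric-executor's real policy may differ; `backoff_delay` is not a
// real fabric API.
use std::time::Duration;

/// Delay before retry `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    base.checked_mul(2u32.saturating_pow(attempt))
        .unwrap_or(max)
        .min(max)
}

fn main() {
    let (base, max) = (Duration::from_secs(1), Duration::from_secs(30));
    for attempt in 0..5 {
        // With these parameters: 1s, 2s, 4s, 8s, 16s
        println!("attempt {attempt}: retry in {:?}", backoff_delay(attempt, base, max));
    }
}
```

The cap matters: without it, a step that keeps failing would back off indefinitely instead of settling at a steady retry cadence.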
The execution plane:
- Does NOT own canonical state
- Reads and updates state through the shared Postgres database
- Can scale independently (run multiple workers)
- Is stateless aside from active leases
```sh
# Run a single worker
DATABASE_URL=postgres://fabric:fabric@localhost:5432/fabric \
OLLAMA_ENABLED=true \
cargo run -p fabric-executor --bin executor
```

```sh
# Run for a specific workflow run
RUN_ID=<uuid> cargo run -p fabric-executor --bin executor
```

## How They Interact
```
┌──────────────┐        ┌───────────────────┐
│ Control Plane│        │ Execution Plane   │
│ (fabric)     │        │ (fabric-executor) │
│              │        │                   │
│ HTTP/gRPC API│        │ Worker Loop       │
│ Routes       │        │   ↓               │
│ Auth/Policy  │        │ Claim Steps       │
│              │        │   ↓               │
│              │        │ Execute (Ollama)  │
│              │        │   ↓               │
│              │        │ Update Status     │
└──────┬───────┘        └────────┬──────────┘
       │                         │
       │       ┌───────────┐     │
       └──────►│ Postgres  │◄────┘
               │           │
               │ workflow_ │
               │ run_nodes │
               │ (claims)  │
               └───────────┘
```

Both planes share the same Postgres database. The control plane writes workflow definitions and creates runs. The execution plane claims steps, executes them, and writes results back.
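A hedged sketch of what the claim query could look like. The `workflow_run_nodes` table and the `FOR UPDATE SKIP LOCKED` clause come from this document; the column names, statuses, and exact statement are assumptions, not fabric's actual SQL:

```rust
// Hypothetical shape of the step-claim query. Column names and status
// values are assumptions; only the table name (workflow_run_nodes) and
// the use of FOR UPDATE SKIP LOCKED come from the document.
const CLAIM_SQL: &str = r#"
UPDATE workflow_run_nodes
SET status = 'running',
    worker_id = $1,
    lease_expires_at = now() + make_interval(secs => $2)
WHERE id IN (
    SELECT id
    FROM workflow_run_nodes
    WHERE status = 'runnable'
       OR (status = 'running' AND lease_expires_at < now())
    ORDER BY created_at
    LIMIT $3
    FOR UPDATE SKIP LOCKED
)
RETURNING id;
"#;

fn main() {
    // The locking clause is what lets many workers poll concurrently:
    // locked rows are skipped rather than waited on, so each step is
    // handed to exactly one worker.
    println!("claim query:\n{CLAIM_SQL}");
}
```

The `lease_expires_at < now()` branch is how expired leases would re-enter the runnable pool, matching the crash-recovery behavior described under Scaling.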
## Scaling

### Control Plane

Single instance, or behind a load balancer for HTTP. Stateless aside from database connections.
### Execution Plane

Scale horizontally with N workers, each with a unique `WORKER_ID`:
- Workers use `FOR UPDATE SKIP LOCKED`, so they never double-execute a step
- Leases expire automatically if a worker crashes
- New workers pick up abandoned work
- Each worker processes up to `WORKER_CONCURRENCY` steps per poll cycle
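The no-double-execution guarantee can be illustrated in-process: when the claim step is serialized, N workers draining a shared queue each execute every step exactly once. In this sketch a `Mutex` stands in for the Postgres row locks; it is an analogy, not the executor's code:

```rust
// In-process illustration of the claim guarantee: with a serialized
// claim step, N workers each execute every step at most once. The real
// executor gets this from Postgres (FOR UPDATE SKIP LOCKED); a Mutex
// stands in for the database here.
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

fn run_workers(n_workers: usize, n_steps: u32) -> Vec<u32> {
    let queue: Arc<Mutex<VecDeque<u32>>> = Arc::new(Mutex::new((0..n_steps).collect()));
    let claimed: Arc<Mutex<Vec<u32>>> = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..n_workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let claimed = Arc::clone(&claimed);
            thread::spawn(move || loop {
                // "Claim" one runnable step under the lock, then "execute" it.
                let step = queue.lock().unwrap().pop_front();
                match step {
                    Some(id) => claimed.lock().unwrap().push(id),
                    None => break, // no runnable work left
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    let result = claimed.lock().unwrap().clone();
    result
}

fn main() {
    let mut done = run_workers(4, 100);
    done.sort();
    // Every step was executed exactly once, never twice.
    assert_eq!(done, (0..100).collect::<Vec<u32>>());
    println!("100 steps, 4 workers, no double execution");
}
```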
## Executor Configuration

| Env Var | Default | Description |
|---|---|---|
| `WORKER_ID` | random UUID | Unique worker identity |
| `WORKER_CONCURRENCY` | 4 | Max steps to claim per poll cycle |
| `WORKER_POLL_INTERVAL` | 500ms | How often to check for work |
| `WORKER_LEASE_SECONDS` | 30 | Lease TTL before auto-expiry |
| `WORKER_MAX_RETRIES` | 3 | Max retry attempts per step |
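Reading this table from the environment could look like the following. The `WorkerConfig` struct and `env_or` helper are hypothetical (fabric-executor's internal types are not shown here); defaults match the table, and `WORKER_POLL_INTERVAL` is assumed to be a bare millisecond count:

```rust
// Hypothetical sketch of env-driven executor configuration. Variable
// names and defaults come from the table above; the struct and helper
// are illustrative, not fabric-executor's real types.
use std::env;
use std::str::FromStr;
use std::time::Duration;

fn env_or<T: FromStr>(key: &str, default: T) -> T {
    env::var(key).ok().and_then(|v| v.parse().ok()).unwrap_or(default)
}

#[derive(Debug)]
struct WorkerConfig {
    worker_id: String,
    concurrency: usize,
    poll_interval: Duration,
    lease_seconds: u64,
    max_retries: u32,
}

impl WorkerConfig {
    fn from_env() -> Self {
        WorkerConfig {
            // The documented default is a random UUID; a fixed fallback
            // keeps this sketch free of external crates.
            worker_id: env::var("WORKER_ID").unwrap_or_else(|_| "worker-unnamed".into()),
            concurrency: env_or("WORKER_CONCURRENCY", 4),
            // Assumes the interval is given as milliseconds, e.g. "500".
            poll_interval: Duration::from_millis(env_or("WORKER_POLL_INTERVAL", 500)),
            lease_seconds: env_or("WORKER_LEASE_SECONDS", 30),
            max_retries: env_or("WORKER_MAX_RETRIES", 3),
        }
    }
}

fn main() {
    let cfg = WorkerConfig::from_env();
    println!("{cfg:?}");
}
```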
## Design Principles

- Control plane owns truth — All canonical state lives in the control plane. The execution plane may only mutate execution-related state (runs, nodes, events).
- Shared database, separate binaries — Both planes use the same Postgres instance, but run as separate processes for independent scaling.
- Stateless workers — Executor workers hold no state beyond active leases. Crash recovery is automatic.
- Event-driven observability — Every significant state transition emits a domain event, enabling real-time monitoring via SSE/WebSocket/webhooks.