# Providers
Fabric provides a provider-agnostic execution layer for local and remote AI capabilities.
## Available Providers

| Provider | Tier | Modalities | Streaming | Config |
|---|---|---|---|---|
| OpenAI | Premium | Text, Image (DALL-E), Embedding | Yes (SSE) | OPENAI_API_KEY |
| Anthropic | Premium | Text | No | ANTHROPIC_API_KEY |
| Ollama | Basic | Text, Embedding | Yes (NDJSON) | OLLAMA_ENABLED |
| Whisper | Basic | Audio | No | WHISPER_URL |
| ComfyUI | Basic | Image (Stable Diffusion) | No (async poll) | COMFYUI_ENABLED |
| ONNX | Basic | Configurable | No | ONNX_MODEL_DIR |
| Candle | Basic | Text, Embedding | No | CANDLE_ENABLED |
| Echo | Basic | All (test stub) | No | Always available |
## Routing

Requests specify a tier preference:
- basic — Local/free providers (Ollama, Whisper, Echo)
- premium — Remote/paid providers (OpenAI, Anthropic)
The router selects by:
- Tier — Match the requested tier
- Cost — Cheapest provider first within the tier
- Health — Skip unhealthy providers
- Capability — Provider must support the requested modality
If no tier is specified, the router selects by cost (cheapest first).
Providers declare capabilities with `estimated_cost_per_request` for cost-aware fallback.
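The selection rules above can be sketched as a short routine. This is illustrative only — the `Provider` record and `select_provider` function are assumptions for the example, not Fabric's actual API:

```python
from dataclasses import dataclass

# Hypothetical provider record; field names are assumptions for illustration.
@dataclass
class Provider:
    name: str
    tier: str                           # "basic" or "premium"
    modalities: tuple                   # e.g. ("text", "embedding")
    healthy: bool
    estimated_cost_per_request: float

def select_provider(providers, modality, tier=None):
    """Filter by capability and health, match the tier if given, then pick cheapest."""
    candidates = [
        p for p in providers
        if modality in p.modalities
        and p.healthy
        and (tier is None or p.tier == tier)
    ]
    # Cheapest provider wins; with no tier, this naturally prefers free local providers.
    return min(candidates, key=lambda p: p.estimated_cost_per_request, default=None)

providers = [
    Provider("openai", "premium", ("text", "image", "embedding"), True, 0.002),
    Provider("ollama", "basic", ("text", "embedding"), True, 0.0),
    Provider("echo", "basic", ("text",), True, 0.0),
]

print(select_provider(providers, "text", tier="premium").name)  # openai
print(select_provider(providers, "text").name)                  # ollama
```

Note how an unspecified tier falls through to pure cost ordering, matching the rule above.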
## Provider Priority

When multiple providers support the same modality, Fabric routes based on registration order:

1. OpenAI (if `OPENAI_API_KEY` set)
2. Anthropic (if `ANTHROPIC_API_KEY` set)
3. Ollama (if `OLLAMA_ENABLED` or `OLLAMA_URL` set)
4. Whisper (if `WHISPER_URL` set)
5. Echo providers (always available, fallback for testing)
To force a specific provider, include `"model": "qwen3:latest"` — the router matches the provider that advertises that model.
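Model-based matching can be pictured like this (a sketch with an assumed registry structure, not Fabric's internals — the model lists besides `qwen3:latest` and `gpt-4` are made up for the example):

```python
# Hypothetical registry mapping providers to the models they advertise.
REGISTRY = [
    ("openai", ["gpt-4", "gpt-4o-mini"]),
    ("ollama", ["qwen3:latest", "llama3:8b"]),
]

def match_by_model(model):
    """Return the first registered provider advertising the requested model."""
    for name, models in REGISTRY:
        if model in models:
            return name
    return None

print(match_by_model("qwen3:latest"))  # ollama
```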
## Endpoints

### List Providers

```sh
curl http://localhost:3001/v1/providers
```

Returns all registered providers with their capabilities, modalities, and health status.
### Execute

```sh
curl -X POST http://localhost:3001/v1/providers/execute \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "model": "qwen3:latest",
    "input": {"prompt": "Hello world"},
    "params": {"temperature": 0.7}
  }'
```

### Stream Execute (SSE)
```sh
curl -N -X POST http://localhost:3001/v1/providers/execute/stream \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "input": {"prompt": "Write a poem"},
    "params": {"stream": true}
  }'
```
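Client-side, SSE responses arrive as `data: ...` lines separated by blank lines. A minimal parser sketch — the `{"delta": ...}` payload shape and the `[DONE]` terminator are assumptions, and Fabric's actual frame format may differ:

```python
import json

def parse_sse(raw: str):
    """Yield the JSON payload of each `data:` frame in an SSE body."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload and payload != "[DONE]":   # skip a common stream terminator
                yield json.loads(payload)

# Example body with a hypothetical {"delta": ...} frame shape.
body = 'data: {"delta": "Roses"}\n\ndata: {"delta": " are red"}\n\ndata: [DONE]\n'
chunks = [frame["delta"] for frame in parse_sse(body)]
print("".join(chunks))  # Roses are red
```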
### Cost Estimation

```sh
curl -X POST http://localhost:3001/v1/providers/estimate \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "model": "gpt-4",
    "input": {"prompt": "Hello"}
  }'
```

Local models return $0.00.
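Because local models estimate to $0.00, a client can use estimates to decide whether a paid tier is needed at all. A minimal sketch, assuming the client has already collected per-provider estimates into a plain mapping (not Fabric's response format):

```python
def choose_tier(estimates):
    """Given {provider: estimated_cost}, prefer the free/basic tier when one exists."""
    if any(cost == 0.0 for cost in estimates.values()):
        return "basic"
    return "premium"

print(choose_tier({"ollama": 0.0, "gpt-4": 0.03}))  # basic
```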
## Configuration

### OpenAI

```sh
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1  # optional, for compatible servers
```

### Anthropic

```sh
ANTHROPIC_API_KEY=sk-ant-...
```

### Ollama

```sh
OLLAMA_ENABLED=true
OLLAMA_URL=http://localhost:11434  # default
```

### Whisper

```sh
WHISPER_URL=http://localhost:8080
```

### ComfyUI

```sh
COMFYUI_ENABLED=true
COMFYUI_URL=http://localhost:8188  # default
```

### ONNX (Optional Feature)

Requires the `onnx-runtime` feature flag.

```sh
ONNX_MODEL_DIR=/path/to/models
```

### Candle (Optional Feature)

Requires the `candle` feature flag.

```sh
CANDLE_ENABLED=true
```

## Optional Compile-Time Features
| Feature | Description |
|---|---|
| `onnx-runtime` | Actual ONNX inference via `ort` |
| `candle` | HuggingFace model loading via Candle |
| `wasm-runtime` | Wasmtime plugin execution |

Enable with:

```sh
cargo build --features onnx-runtime,candle,wasm-runtime
```