# Providers
Fabric provides a provider-agnostic execution layer for local and remote AI capabilities.
## Available Providers

| Provider | Tier | Modalities | Streaming | Config |
|---|---|---|---|---|
| OpenAI | Premium | Text, Image (DALL-E), Embedding | Yes (SSE) | OPENAI_API_KEY |
| Anthropic | Premium | Text | No | ANTHROPIC_API_KEY |
| Ollama | Basic | Text, Embedding | Yes (NDJSON) | OLLAMA_ENABLED |
| Whisper | Basic | Audio | No | WHISPER_URL |
| ComfyUI | Basic | Image (Stable Diffusion) | No (async poll) | COMFYUI_ENABLED |
| ONNX | Basic | Configurable | No | ONNX_MODEL_DIR |
| Candle | Basic | Text, Embedding | No | CANDLE_ENABLED |
| Echo | Basic | All (test stub) | No | Always available |
## Routing

Requests specify a tier preference:
- basic — Local/free providers (Ollama, Whisper, Echo)
- premium — Remote/paid providers (OpenAI, Anthropic)
The router selects by:
- Tier — Match the requested tier
- Cost — Cheapest provider first within the tier
- Health — Skip unhealthy providers
- Capability — Provider must support the requested modality
If no tier is specified, the router selects by cost (cheapest first).
Providers declare capabilities with `estimated_cost_per_request` for cost-aware fallback.
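The selection rules above can be sketched as a short routine. This is illustrative only — the `Provider` record and `select_provider` function are assumptions for the example, not Fabric's actual API:

```python
from dataclasses import dataclass

# Hypothetical provider record; field names are assumptions for illustration.
@dataclass
class Provider:
    name: str
    tier: str                           # "basic" or "premium"
    modalities: tuple                   # e.g. ("text", "embedding")
    healthy: bool
    estimated_cost_per_request: float

def select_provider(providers, modality, tier=None):
    """Filter by capability and health, match the tier if given, then pick cheapest."""
    candidates = [
        p for p in providers
        if modality in p.modalities
        and p.healthy
        and (tier is None or p.tier == tier)
    ]
    # Cheapest provider wins; with no tier, this naturally prefers free local providers.
    return min(candidates, key=lambda p: p.estimated_cost_per_request, default=None)

providers = [
    Provider("openai", "premium", ("text", "image", "embedding"), True, 0.002),
    Provider("ollama", "basic", ("text", "embedding"), True, 0.0),
    Provider("echo", "basic", ("text",), True, 0.0),
]

print(select_provider(providers, "text", tier="premium").name)  # openai
print(select_provider(providers, "text").name)                  # ollama
```

Note how an unspecified tier falls through to pure cost ordering, matching the rule above.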
## Provider Priority

When multiple providers support the same modality, Fabric routes based on registration order:

1. OpenAI (if `OPENAI_API_KEY` set)
2. Anthropic (if `ANTHROPIC_API_KEY` set)
3. Ollama (if `OLLAMA_ENABLED` or `OLLAMA_URL` set)
4. Whisper (if `WHISPER_URL` set)
5. Echo providers (always available, fallback for testing)
To force a specific provider, include `"model": "qwen3:latest"` — the router matches the provider that advertises that model.
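Model-based matching can be pictured like this (a sketch with an assumed registry structure, not Fabric's internals — the model lists besides `qwen3:latest` and `gpt-4` are made up for the example):

```python
# Hypothetical registry mapping providers to the models they advertise.
REGISTRY = [
    ("openai", ["gpt-4", "gpt-4o-mini"]),
    ("ollama", ["qwen3:latest", "llama3:8b"]),
]

def match_by_model(model):
    """Return the first registered provider advertising the requested model."""
    for name, models in REGISTRY:
        if model in models:
            return name
    return None

print(match_by_model("qwen3:latest"))  # ollama
```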
## Endpoints

### List Providers

```sh
curl http://localhost:3001/v1/providers
```

Returns all registered providers with their capabilities, modalities, and health status.
### Execute

```sh
curl -X POST http://localhost:3001/v1/providers/execute \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "model": "qwen3:latest",
    "input": {"prompt": "Hello world"},
    "params": {"temperature": 0.7}
  }'
```

### Stream Execute (SSE)
```sh
curl -N -X POST http://localhost:3001/v1/providers/execute/stream \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "input": {"prompt": "Write a poem"},
    "params": {"stream": true}
  }'
```
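Client-side, SSE responses arrive as `data: ...` lines separated by blank lines. A minimal parser sketch — the `{"delta": ...}` payload shape and the `[DONE]` terminator are assumptions, and Fabric's actual frame format may differ:

```python
import json

def parse_sse(raw: str):
    """Yield the JSON payload of each `data:` frame in an SSE body."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload and payload != "[DONE]":   # skip a common stream terminator
                yield json.loads(payload)

# Example body with a hypothetical {"delta": ...} frame shape.
body = 'data: {"delta": "Roses"}\n\ndata: {"delta": " are red"}\n\ndata: [DONE]\n'
chunks = [frame["delta"] for frame in parse_sse(body)]
print("".join(chunks))  # Roses are red
```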
### Cost Estimation

```sh
curl -X POST http://localhost:3001/v1/providers/estimate \
  -H 'content-type: application/json' \
  -d '{
    "modality": "text",
    "model": "gpt-4",
    "input": {"prompt": "Hello"}
  }'
```

Local models return $0.00.
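Because local models estimate to $0.00, a client can use estimates to decide whether a paid tier is needed at all. A minimal sketch, assuming the client has already collected per-provider estimates into a plain mapping (not Fabric's response format):

```python
def choose_tier(estimates):
    """Given {provider: estimated_cost}, prefer the free/basic tier when one exists."""
    if any(cost == 0.0 for cost in estimates.values()):
        return "basic"
    return "premium"

print(choose_tier({"ollama": 0.0, "gpt-4": 0.03}))  # basic
```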
## Configuration

### OpenAI

```sh
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1  # optional, for compatible servers
```

### Anthropic

```sh
ANTHROPIC_API_KEY=sk-ant-...
```

### Ollama

```sh
OLLAMA_ENABLED=true
OLLAMA_URL=http://localhost:11434  # default
```

### Whisper

```sh
WHISPER_URL=http://localhost:8080
```

### ComfyUI

```sh
COMFYUI_ENABLED=true
COMFYUI_URL=http://localhost:8188  # default
```

### ONNX (Optional Feature)

Requires the `onnx-runtime` feature flag.

```sh
ONNX_MODEL_DIR=/path/to/models
```

### Candle (Optional Feature)

Requires the `candle` feature flag.

```sh
CANDLE_ENABLED=true
```

## Optional Compile-Time Features
| Feature | Description |
|---|---|
| `onnx-runtime` | Actual ONNX inference via `ort` |
| `candle` | HuggingFace model loading via Candle |
| `wasm-runtime` | Wasmtime plugin execution |

Enable with:

```sh
cargo build --features onnx-runtime,candle,wasm-runtime
```