global/vision-analyze
Vision analysis workflow — image understanding via Moondream (local) or Gemini (cloud).
Category: global
Source: workflows/ai/vision.py
Input Schema

| Field | Type | Default | Description |
|---|---|---|---|
| image_path | string | required | Path to the image file to analyze. |
| regenerate | object | — | When set, this run is a regeneration. Workflows may read direction / keep / extra_instructions to modulate prompts; the engine persists parent_run_id and parent_variant_index as run lineage columns. |
| variants | integer | 1 | Number of independent variant executions (1–10). When > 1, the engine runs the workflow N times with different sampling, producing N outputs. |
| vision_operations | string[] | ["caption", "tags"] | Operations to perform. Options: 'caption', 'tags', 'detect:<object>', 'query:<question>'. Example: ['caption', 'tags', 'detect:person'] |
| vision_question | object | — | Shorthand for a single visual QA question. Equivalent to adding 'query:<question>' to operations. |
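A single request can mix operation kinds. As a minimal illustration (the path and the question below are placeholder values), this is how the input: block of a run-spec might look:

```yaml
input:
  image_path: "/data/photos/street.jpg"      # placeholder path
  vision_operations:
    - caption
    - tags
    - "detect:person"                        # bounding boxes for people
    - "query:How many cars are visible?"     # visual QA question
```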
Output Schema

| Field | Type | Default | Description |
|---|---|---|---|
| answers | object[] | — | Visual QA answers: [{question, answer}]. |
| caption | object | — | Scene description / caption of the image. |
| detections | object[] | — | Object detections with bounding boxes (local Moondream only). |
| kind | object | — | Variant card shape: video / carousel / image / text. Surfaced on the per-variant entry of the run-output API and used by gallery UIs to pick the right layout. |
| model_used | string | "" | Vision model that was used for analysis. |
| tags | string[] | — | Searchable tags extracted from the image. |
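For orientation, a run combining the operations from the input example above might produce output shaped roughly like this. The values are illustrative, and the bounding-box format is an assumption, not documented behavior:

```yaml
caption: "A busy street corner at dusk with pedestrians crossing."
tags: ["street", "dusk", "pedestrians", "traffic"]
detections:
  - label: person
    box: [0.12, 0.30, 0.25, 0.78]   # assumed normalized [x1, y1, x2, y2]
answers:
  - question: "How many cars are visible?"
    answer: "Three."
model_used: "moondream"             # or a Gemini model id for cloud runs
kind: image
```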
Task Pipeline

analyze_image → format_vision_output

| Task | Description |
|---|---|
| analyze_image | Analyze an image using the configured vision model. |
| format_vision_output | Extract vision results into top-level output fields. |
Run-spec example

Save the YAML below as my-run.yaml, edit the values, and run it with the CLI or POST it to the API. Required fields are uncommented; optional knobs are documented as comments above the input: block — copy any of those lines under input: and uncomment them to set values.
```yaml
workflow: global/vision-analyze

# Optional fields — copy any line(s) under `input:` and uncomment to set:
# Operations to perform. Options: 'caption', 'tags', 'detect:<object>', 'query:<question>'. Example: ['caption', 'tags', 'detect:person']
# vision_operations: ["caption", "tags"]
#
# Shorthand for a single visual QA question. Equivalent to adding 'query:<question>' to operations.
# vision_question: null
#
input:
  # Path to the image file to analyze.
  image_path: ""
```

Run it locally:
```sh
fab-workflow --from-file my-run.yaml
```

Or submit over the wire — the same file is the request body:
```sh
curl -X POST 'https://gofabric.dev/v1/workflows/runs?name=global/vision-analyze' \
  -H 'Authorization: Bearer fab_xxx' \
  -H 'content-type: application/yaml' \
  --data-binary @my-run.yaml
```

Every workflow also accepts the universal WorkflowInput fields — variants (1–10 fan-out) and regenerate (creative-direction hints with run lineage). See Run-specs (YAML / TOML / JSON) for the full top-level shape (metadata, priority, bundle, parent, etc.).
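As a sketch of those universal fields in a run-spec, the direction and keep values below are hypothetical, and how a given workflow interprets the regenerate hints is workflow-specific:

```yaml
workflow: global/vision-analyze

variants: 3                 # fan out into three independent runs
regenerate:
  # Hypothetical hints; each workflow decides how to read these.
  direction: "focus the caption on the foreground subject"
  keep: ["tags"]
input:
  image_path: "/data/photos/street.jpg"   # placeholder path
```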