global/vision-analyze

Vision analysis workflow — image understanding via Moondream (local) or Gemini (cloud).

Category: global
Source: workflows/ai/vision.py

Input fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `image_path` | string | required | Path to the image file to analyze. |
| `regenerate` | object | | When set, this run is a regeneration. Workflows may read `direction` / `keep` / `extra_instructions` to modulate prompts; the engine persists `parent_run_id` and `parent_variant_index` as run-lineage columns. |
| `variants` | integer | `1` | Number of independent variant executions (1–10). When > 1, the engine runs the workflow N times with different sampling, producing N outputs. |
| `vision_operations` | string[] | `["caption", "tags"]` | Operations to perform. Options: `caption`, `tags`, `detect:<object>`, `query:<question>`. Example: `["caption", "tags", "detect:person"]` |
| `vision_question` | object | | Shorthand for a single visual QA question. Equivalent to adding `query:<question>` to operations. |
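For example, a filled-in `input:` block exercising the operation kinds listed above might look like this (the image path and the `detect:` / `query:` targets are illustrative, not defaults):

```yaml
input:
  image_path: "photos/scene.jpg"            # illustrative path
  vision_operations:
    - caption                               # scene description
    - tags                                  # searchable tags
    - detect:person                         # bounding boxes (local Moondream only)
    - "query:How many people are visible?"  # visual QA
```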
Output fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `answers` | object[] | | Visual QA answers: `[{question, answer}]`. |
| `caption` | object | | Scene description / caption of the image. |
| `detections` | object[] | | Object detections with bounding boxes (local Moondream only). |
| `kind` | object | | Variant card shape: video / carousel / image / text. Surfaced on the per-variant entry of the run-output API and used by gallery UIs to pick the right layout. |
| `model_used` | string | `""` | Vision model that was used for analysis. |
| `tags` | string[] | | Searchable tags extracted from the image. |
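As an illustrative sketch only, a run with `["caption", "tags", "detect:person"]` plus a visual QA question could produce output shaped like the following. The field names come from the table above; the inner structure of `caption` and `detections` entries, and all values, are assumptions:

```yaml
caption:
  text: "Two people at a whiteboard in an office."
tags: ["office", "whiteboard", "people"]
detections:
  - label: person
    box: [0.12, 0.08, 0.45, 0.97]   # assumed bounding-box encoding
answers:
  - question: "How many people are visible?"
    answer: "Two."
model_used: "moondream"
kind: image
```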
Pipeline: `analyze_image` → `format_vision_output`

| Task | Description |
| --- | --- |
| `analyze_image` | Analyze an image using the configured vision model. |
| `format_vision_output` | Extract vision results into top-level output fields. |

Save the YAML below as `my-run.yaml`, edit the values, and run it with the CLI or POST it to the API. Required fields are uncommented; optional knobs are documented as comments above the `input:` block — copy any of those lines under `input:` and uncomment it to set a value.

```yaml
workflow: global/vision-analyze
# Optional fields — copy any line(s) under `input:` and uncomment to set:
# Operations to perform. Options: 'caption', 'tags', 'detect:<object>', 'query:<question>'. Example: ['caption', 'tags', 'detect:person']
# vision_operations: ["caption", "tags"]
#
# Shorthand for a single visual QA question. Equivalent to adding 'query:<question>' to operations.
# vision_question: null
#
input:
  # Path to the image file to analyze.
  image_path: ""
```

Run it locally:

```sh
fab-workflow --from-file my-run.yaml
```

Or submit over the wire — the same file is the request body:

```sh
curl -X POST 'https://gofabric.dev/v1/workflows/runs?name=global/vision-analyze' \
  -H 'Authorization: Bearer fab_xxx' \
  -H 'content-type: application/yaml' \
  --data-binary @my-run.yaml
```

Every workflow also accepts the universal `WorkflowInput` fields — `variants` (1–10 fan-out) and `regenerate` (creative-direction hints with run lineage). See Run-specs (YAML / TOML / JSON) for the full top-level shape (`metadata`, `priority`, `bundle`, `parent`, etc.).
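As a sketch of those universal fields in a run-spec: the `variants` and `regenerate` names come from the input table above, but the value shapes of `direction`, `keep`, and `extra_instructions` are assumptions for illustration:

```yaml
workflow: global/vision-analyze
variants: 3                      # fan out into 3 independent variant runs
regenerate:
  direction: "warmer, more descriptive caption"   # assumed free-text hint
  keep: ["tags"]                                  # assumed list shape
  extra_instructions: "Mention lighting."         # assumed free-text hint
input:
  image_path: "photos/scene.jpg"
```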