perception.analyze

Operation: perception.analyze
Category: ai
Tags: perception, detection, segmentation, ocr, vision, image

Detect objects, segment instances, and extract text (OCR) from images using the Falcon-Perception vision model (0.3B parameters, runs locally).

Runtime

Type: Local Model (Falcon-Perception)
Timeout: 120s
Retries: 1

Inputs

Name	Type	Required	Default	Description
`image_path`	String	Yes	—	Path to image file
`perception_operations`	String[]	No	`["detect:object"]`	Operations to run (see below)

Operations

Operation	Description
`detect:<query>`	Detect objects matching the natural language query, returns bounding boxes
`segment:<query>`	Instance segmentation for the query, saves mask PNGs to staging dir
`ocr`	Extract all text from the image (plain mode)
`ocr:layout`	Layout-aware OCR with region coordinates, categories, and confidence

Outputs

Name	Type	Description
`_perception_result.detections`	Object[]	`[{"object": str, "bbox": {"x_min", "y_min", "x_max", "y_max"}}]`
`_perception_result.segments`	Object[]	`[{"object": str, "mask_path": str, "bbox": {...}}]`
`_perception_result.ocr`	Object \| null	`{"text": str}` or `{"text": str, "regions": [...]}` for layout mode

Installation

pip install 'fabric-workflow-sdk[local-perception]'

Usage

Python

from fabric_workflow_sdk.stages.perception import analyze_perception, extract_text

# Object detection
result = await analyze_perception({
    "image_path": "photo.jpg",
    "perception_operations": ["detect:person", "detect:car"],
})
detections = result["_perception_result"]["detections"]

# Instance segmentation
result = await analyze_perception({
    "image_path": "photo.jpg",
    "perception_operations": ["segment:person"],
})
mask_paths = [s["mask_path"] for s in result["_perception_result"]["segments"]]

# OCR (plain)
result = await extract_text({"image_path": "document.png"})
text = result["_perception_result"]["ocr"]["text"]

# OCR (layout-aware)
result = await analyze_perception({
    "image_path": "document.png",
    "perception_operations": ["ocr:layout"],
})
regions = result["_perception_result"]["ocr"]["regions"]

Example

{
  "image_path": "photo.jpg",
  "perception_operations": ["detect:cat", "ocr"]
}

Model

Model	Parameters	Size	License
Falcon-Perception-0.3B	0.3B	~600 MB	Apache 2.0

The model downloads automatically on first use. Supports CUDA (NVIDIA), MPS (Apple Silicon), and CPU.

Common Errors

Error	Cause	Fix
`ImportError: falcon_perception`	Package not installed	`pip install 'fabric-workflow-sdk[local-perception]'`
`No image_path in input`	Missing required input	Provide `image_path` in the input dict
`perception model set to 'skip'`	Quality profile disables perception	Use a local-capable profile (`free`, `local`, `local-quality`)