
VFX Compositing

Fabric provides three AI-powered compositing capabilities that collapse traditional multi-step VFX pipelines into single inference calls. All three use Seedance 2.0’s reference-to-video system under the hood, with automatic fallbacks when Seedance is unavailable.

These capabilities work as standalone SDK stages, as tasks within the long-form and ai-shorts pipelines, or as building blocks for custom workflows.

                    ┌──────────────────────┐
                    │ Multi-View Character │
                    │   generate 4 angles  │
                    └──────────┬───────────┘
          ┌────────────────────┼────────────────────┐
          │                    │                    │
   Green Screen          Layer Editing        B-Roll Generation
   footage + bg          base video           (enhanced refs)
   → composite           + elements           → ref2v with 4 views
          │              → sequential apply         │
          └────────────────────┼────────────────────┘
                          Final Video

Multi-View Character Generation

Traditional character consistency in AI video relies on a single reference image. Multi-view generation produces 4 perspective views (front, 3/4, profile, back) from a single character photo, giving Seedance dramatically stronger consistency signals across different camera angles and shots.

  1. A single character photo is passed to Seedance ref2v with angle-specific prompts
  2. The first frame of each generated video is extracted as a still image
  3. All 4 views are uploaded to fal CDN and stored as _multiview_character_refs
  4. Downstream b-roll and avatar calls pass all 4 views as role: "subject" references
  5. Seedance uses @Image1-@Image4 role assignments for each angle
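The reference wiring in steps 4-5 can be sketched as follows. The field names (`slot`, `role`, `angle`, `url`) are illustrative assumptions for this doc, not the actual Seedance request schema:

```python
# Sketch: map the 4 stored view URLs to @ImageN subject references.
# Field names are assumptions, not the real API payload shape.

ANGLES = ["front", "three-quarter", "profile", "back"]

def build_reference_payload(view_urls):
    """One @ImageN subject reference per generated angle."""
    if len(view_urls) != len(ANGLES):
        raise ValueError("expected one URL per angle")
    return [
        {"slot": f"@Image{i + 1}", "role": "subject", "angle": angle, "url": url}
        for i, (angle, url) in enumerate(zip(ANGLES, view_urls))
    ]

refs = build_reference_payload([
    "https://cdn.example.com/ref_front.png",
    "https://cdn.example.com/ref_three_quarter.png",
    "https://cdn.example.com/ref_profile.png",
    "https://cdn.example.com/ref_back.png",
])
```

Downstream b-roll and avatar calls would then attach all four entries to each request.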

Multi-view is automatic in the ai-shorts pipeline when using premium/ultra quality and an actor image is provided. No additional configuration needed.

```sh
# Automatic: premium/ultra with actor image triggers multi-view
fab ai-shorts "AI replacing developers" \
  --quality ultra \
  --actor-image ./my_actor.jpg

# Long-form also benefits
fab video/long-form "The Metaverse" \
  --quality premium \
  --presenter-look "young tech executive in a navy blazer"
```
| Approach | References | Consistency | Cost |
| --- | --- | --- | --- |
| Single ref (default) | 1 character image | Good for front-facing shots | 1 image gen |
| Multi-view (premium/ultra) | 4 character images | Strong across all angles | ~$0.20 (4 × $0.05) |

Multi-view is most impactful for videos with diverse camera angles, multiple scenes, or character movement. For static talking-head videos, single reference is sufficient.

Green Screen Compositing

Upload green screen footage and a background image — Seedance keys out the green screen, preserves the original camera motion, and composites the subject onto the new background in one step.

  1. Foreground video is passed as a motion reference (camera trajectory preservation)
  2. Background image is passed as an environment reference
  3. Seedance handles keying, edge preservation, and lighting integration
  4. If Seedance fails, falls back to ffmpeg chromakey filter + overlay composite
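The ffmpeg fallback in step 4 might look like the command builder below. The chromakey color/similarity/blend values and the stream mapping are illustrative defaults, not Fabric's actual settings:

```python
# Sketch of the ffmpeg fallback: key out green with the chromakey
# filter, then overlay the keyed foreground onto the background image.
# Threshold values here are illustrative, not Fabric's real defaults.

def ffmpeg_chromakey_cmd(fg_video, bg_image, out_path,
                         color="0x00FF00", similarity=0.30, blend=0.10):
    filter_graph = (
        f"[1:v]chromakey={color}:{similarity}:{blend}[keyed];"
        f"[0:v][keyed]overlay=shortest=1[out]"
    )
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", bg_image,   # input 0: still background, looped
        "-i", fg_video,                 # input 1: green screen footage
        "-filter_complex", filter_graph,
        "-map", "[out]", "-map", "1:a?",  # keep foreground audio if present
        out_path,
    ]

cmd = ffmpeg_chromakey_cmd("presenter.mp4", "bg.jpg", "composite.mp4")
```

Running the returned argument list through `subprocess.run` would produce the composite without any model call.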
```sh
# Long-form video with green screen footage
fab video/long-form "The History of Visual Effects" \
  --quality premium \
  --greenscreen-footage-url "https://cdn.example.com/presenter_greenscreen.mp4"
```

When greenscreen_footage_url is provided to the long-form pipeline:

  1. generate_scene_backgrounds — generates one background image per act from the script’s visual cues
  2. produce_chapters — for each chapter, composites the green screen footage with the act’s background instead of generating b-roll
  3. Falls back to standard b-roll generation for any chapter where compositing fails
Fallback chain (tried in order):

  1. Seedance ref2v (motion + environment refs)
  2. ffmpeg chromakey + overlay composite
  3. Seedance i2v with the background as the start frame
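The fallback order can be sketched as a try-in-order helper. The strategy names and stand-in callables below are hypothetical placeholders for the real backends:

```python
# Sketch: attempt each compositing backend in order; any exception
# falls through to the next strategy. The callables are placeholders.

def composite_with_fallbacks(strategies):
    """Try each (name, fn) pair in order; return the first success."""
    errors = []
    for name, fn in strategies:
        try:
            return name, fn()
        except Exception as exc:  # fall through to the next backend
            errors.append((name, exc))
    raise RuntimeError(f"all compositing backends failed: {errors}")

def seedance_ref2v():
    raise TimeoutError("Seedance unavailable")  # simulated failure

name, result = composite_with_fallbacks([
    ("seedance_ref2v", seedance_ref2v),
    ("ffmpeg_chromakey", lambda: "chromakey_composite.mp4"),
    ("seedance_i2v", lambda: "i2v_composite.mp4"),
])
```

Here the simulated Seedance timeout lands on the ffmpeg fallback, mirroring step 4 above.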

Layer Editing

Add visual elements (vehicles, weather effects, objects) from reference images into existing footage. Each element is placed at a specific spatial position while the original camera work is preserved.

  1. Base video is passed as a motion reference (camera anchoring)
  2. Each element image is passed as a typed reference (subject, weather, vehicle, environment)
  3. Spatial placement directives tell Seedance where to integrate each element
  4. Layers are applied sequentially by default for highest quality — each layer builds on the previous result
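Sequential application (step 4) reduces to a simple fold over the layer list, where each call's output becomes the next call's base video. `apply_layer` below is a placeholder for the real Seedance layer-edit call:

```python
# Sketch: sequential layer application. Each layer-edit call receives
# the previous result as its new base, so layer N sees layers 1..N-1.

def apply_layers_sequentially(base_video, layers, apply_layer):
    current = base_video
    for layer in layers:
        current = apply_layer(current, layer)
    return current

trace = []
def fake_apply(video, layer):  # stand-in for the Seedance call
    trace.append((video, layer["role"]))
    return f"{video}+{layer['role']}"

result = apply_layers_sequentially(
    "base.mp4",
    [{"role": "vehicle"}, {"role": "weather"}],
    fake_apply,
)
```

With two layers this makes two calls, which is why sequential mode costs N calls but keeps each composite coherent.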
```sh
# Long-form with scene layers
fab video/long-form "Future of Transportation" \
  --quality premium \
  --scene-layers '[
    {"url": "https://example.com/cybertruck.jpg", "role": "vehicle", "placement": "background right", "description": "silver cybertruck"},
    {"url": "https://example.com/snow.jpg", "role": "weather", "placement": "atmospheric", "description": "light falling snow"}
  ]'
```

Each layer is a dict with these fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | yes | Image URL of the element to add |
| role | string | yes | subject, vehicle, weather, or environment |
| placement | string | no | Spatial directive: foreground left, background center, atmospheric, etc. |
| description | string | yes | What the element is: red cybertruck, falling snow, neon sign |
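A minimal client-side validator for these fields might look like the following. The function is illustrative, not part of the SDK:

```python
# Sketch: validate a scene-layer dict against the field table above.
# Required fields and allowed roles come from the docs; the helper
# itself is a hypothetical convenience, not a Fabric API.

ALLOWED_ROLES = {"subject", "vehicle", "weather", "environment"}

def validate_layer(layer):
    for field in ("url", "role", "description"):
        if not isinstance(layer.get(field), str) or not layer[field]:
            raise ValueError(f"layer missing required string field: {field}")
    if layer["role"] not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {layer['role']}")
    return layer

snow = validate_layer({
    "url": "https://example.com/snow.jpg",
    "role": "weather",
    "placement": "atmospheric",   # optional
    "description": "light falling snow",
})
```

Validating before submission catches typos in `role` locally instead of burning a layer-edit call.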

By default, layers are applied sequentially — each layer sees the result of all prior layers, producing the most coherent composite. For workflows with many layers, consider:

| Mode | Quality | Speed | When to use |
| --- | --- | --- | --- |
| Sequential (default) | Highest | Slower (N calls) | 1-3 layers, quality-critical |
| Parallel (planned) | Good | Faster (1 call + composite) | 4+ layers, speed-critical |

Parallel layer application will be available when sub-graph fan-out ships (Phase 33).

Pricing

| Operation | Model | Cost per call |
| --- | --- | --- |
| Multi-view (per view) | bytedance/seedance-2.0/multiview | $0.05 |
| Multi-view (4x bundle) | seedance/multiview-4x | $0.20 |
| Green screen composite | bytedance/seedance-2.0/greenscreen-composite | $0.15 |
| Green screen (fast) | bytedance/seedance-2.0/fast/greenscreen-composite | $0.08 |
| Layer editing | bytedance/seedance-2.0/layer-edit | $0.15 |
| Layer editing (fast) | bytedance/seedance-2.0/fast/layer-edit | $0.08 |
| ffmpeg fallback | local | $0.00 |
| Workflow | Multi-View | Green Screen | Layers | Total Added |
| --- | --- | --- | --- | --- |
| ai-shorts (ultra) | $0.20 | — | — | +$0.20 |
| long-form 5ch (premium + greenscreen) | $0.20 | 5 × $0.15 | — | +$0.95 |
| long-form 5ch (premium + 2 layers) | $0.20 | — | 5 × 2 × $0.15 | +$1.70 |
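The workflow totals follow directly from the per-call prices. A quick check, assuming the $0.20 multi-view bundle and $0.15 per composite/layer call:

```python
# Sketch: recompute the added cost per workflow from per-call prices.
# Prices are taken from the pricing table in this doc.

PRICES = {"multiview_4x": 0.20, "greenscreen": 0.15, "layer_edit": 0.15}

def added_cost(multiview=False, greenscreen_chapters=0, layer_calls=0):
    total = PRICES["multiview_4x"] if multiview else 0.0
    total += greenscreen_chapters * PRICES["greenscreen"]
    total += layer_calls * PRICES["layer_edit"]
    return round(total, 2)

a = added_cost(multiview=True)                          # ai-shorts ultra
b = added_cost(multiview=True, greenscreen_chapters=5)  # 5-chapter greenscreen
c = added_cost(multiview=True, layer_calls=10)          # 5 chapters x 2 layers
```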

These features are available at the premium and ultra quality tiers:

| Tier | Multi-View | Green Screen | Layer Editing |
| --- | --- | --- | --- |
| free/budget/cheap | — | — | — |
| standard | — | — | — |
| premium | automatic | via input | via input |
| ultra | automatic | via input | via input |

Typed stage I/O contracts are defined in proto/fabric/stages/v1/stages.proto:

  • MultiViewInput / MultiViewOutput / MultiViewImage
  • GreenScreenInput / GreenScreenOutput
  • LayerEditInput / LayerEditOutput / SceneLayer

These contracts are optional — the API always accepts arbitrary JSON — but provide autocompletion and type checking in SDKs.