Shot Design & Visual Consistency

Fabric uses a cinematography vocabulary inspired by Seedance 2.0 Shot Design to produce cinema-grade video from AI models. Instead of vague adjectives like “cinematic” or “high quality”, prompts use specific physical parameters (focal length in millimeters, 3-layer lighting descriptions, and named film stocks) that give video models concrete targets to render.

This system lives in workflows/video/_shot_design.py and is reusable by any workflow that generates video prompts.

Most AI video prompts look like this:

A person walking through a city, cinematic, high quality, 4K, masterpiece

This produces generic, “AI-looking” video. The model has no physical parameters to work with — “cinematic” means nothing specific.

Fabric enriches every b-roll prompt with five layers of physical description:

Close-up of hands typing on a mechanical keyboard.
85mm portrait lens, slow deliberate push-in.
Soft diffused key light from window-right, gentle wrap-around fill,
warm rim separation on edges, golden-hour warmth with soft shadow falloff.
Shot on Kodak Portra 400, fine organic film grain,
individual hair strands catching backlight.
9:16 vertical framing.

Physical parameters give the model concrete visual targets rather than abstract aesthetic goals, which is why the enriched prompt tends to produce less generic output than the adjective-only version.

Every b-roll segment has a narrative type that maps to a default cinematography preset:

| Type | Focal Length | Camera Movement | Lighting | Film Stock |
|---|---|---|---|---|
| hook | 24mm wide | dolly push-in | dramatic sidelight, volumetric shafts, teal/amber | Kodak Vision3 500T |
| problem | 35mm standard | handheld drift, micro-shake | harsh overhead, deep shadows, cool blue-grey | ARRI ALEXA Mini LF |
| insight | 85mm portrait | slow push-in | soft window light, warm rim separation, golden-hour | Kodak Portra 400 |
| solution | 50mm standard | slow orbital (45° arc) | balanced 3-point, god rays, warm amber | 35mm film grain |
| demo | 35mm standard | smooth tracking | clean softbox, rim light, neutral tone | Octane rendering |
| proof | 85mm portrait | static + rack focus | natural window light, atmospheric haze, golden | Fuji Pro 400H |
| cta | 24mm wide | slow crane up | dramatic backlight, lens flare, high-contrast warm | Cinestill 800T |

Each preset defines lighting as three components:

  1. Source — where the light comes from (e.g., “dramatic sidelight from screen-left”)
  2. Behavior — how the light interacts with the scene (e.g., “volumetric light shafts cutting through atmospheric haze”)
  3. Color Tone — the palette it creates (e.g., “high-contrast teal shadows with warm amber highlights”)

This maps directly to how cinematographers describe their lighting setups.
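As a minimal sketch of the three-component structure, the lighting layers could be modeled as a small dataclass. The `Lighting` class and its `describe()` method are hypothetical names for illustration, not part of `_shot_design.py`; the example strings are the ones quoted in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lighting:
    """Three-layer lighting description (hypothetical structure)."""

    source: str      # where the light comes from
    behavior: str    # how the light interacts with the scene
    color_tone: str  # the palette it creates

    def describe(self) -> str:
        # Join the three layers into one comma-separated clause.
        return ", ".join((self.source, self.behavior, self.color_tone))


hook_lighting = Lighting(
    source="dramatic sidelight from screen-left",
    behavior="volumetric light shafts cutting through atmospheric haze",
    color_tone="high-contrast teal shadows with warm amber highlights",
)
print(hook_lighting.describe())
```

Keeping the three layers as separate fields means a workflow could override only one of them (for example the color tone) while leaving the rest of a preset untouched.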

Every preset includes intentional imperfections that prevent the “too-clean AI look”:

  • Film grain (fine, subtle, or heavy depending on stock)
  • Dust motes drifting through light shafts
  • Lens flare on highlights
  • Bokeh circles in out-of-focus areas
  • Halation on bright edges (especially Cinestill 800T)
  • Natural optical vignetting

The assemble_cinematic_prompt() function enriches a raw b-roll description in 5 steps:

Raw input: "Person looking stressed at their desk"
1. Strip banned filler words (masterpiece, 4K, 8K, ultra HD, etc.)
2. Determine segment type (hook/problem/insight/solution/demo/proof/cta)
3. Check if LLM already included camera/lighting/material terms
4. If generic → layer in shot preset (focal length + camera + lighting + film stock)
5. Prepend continuity brief
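
The five steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual body of assemble_cinematic_prompt(): the names `assemble_sketch`, `FILLER`, `PRESETS`, and `SHOT_TERM` are hypothetical, the data is abbreviated to one preset, and the segment type is passed in by the caller rather than inferred.

```python
import re

# Abbreviated, hypothetical data for illustration only.
FILLER = {"masterpiece", "4k", "8k", "ultra hd"}
PRESETS = {
    "problem": {
        "focal_length": "35mm standard lens",
        "camera": "handheld drift with micro-shake",
        "lighting": "Harsh overhead light, deep shadows, cool blue-grey tone",
        "film_stock": "ARRI ALEXA Mini LF",
    },
}
# Terms suggesting the description already carries shot direction.
SHOT_TERM = re.compile(
    r"\d+mm|\blens\b|\blight(?:ing)?\b|\bgrain\b|\bbokeh\b", re.IGNORECASE
)


def assemble_sketch(desc: str, segment_type: str, brief: str) -> str:
    # Step 1: drop comma-separated fragments that are pure filler.
    parts = [p.strip() for p in desc.split(",")]
    desc = ", ".join(p for p in parts if p and p.lower() not in FILLER)
    # Step 2: the segment type is supplied by the caller in this sketch.
    preset = PRESETS[segment_type]
    # Steps 3 and 4: layer in the preset only if the text looks generic.
    if not SHOT_TERM.search(desc):
        desc = (
            f"{desc.rstrip('.')}. "
            f"{preset['focal_length']}, {preset['camera']}. "
            f"{preset['lighting']}. Shot on {preset['film_stock']}."
        )
    # Step 5: prepend the continuity brief.
    return f"{brief} {desc}".strip()


result = assemble_sketch(
    "Person looking stressed at their desk, masterpiece, 4K",
    segment_type="problem",
    brief="Unified palette: desaturated cool + neutral midtones.",
)
print(result)
```

A description that already mentions a lens or a light source passes through step 3 unchanged, so prompts written with shot direction are not double-enriched.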

These words actively degrade AI video quality and are automatically stripped:

masterpiece, 4K, 8K, ultra HD, best quality, highly detailed, photorealistic, ultra clear, ultra realistic, hyper realistic, super detailed
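
One way to strip these terms is a single case-insensitive regular expression followed by punctuation cleanup. The sketch below is an assumption about how such a filter could work; `strip_filler_sketch` is a hypothetical name and may differ from the module's own `strip_filler`.

```python
import re

# Banned filler terms, as listed above.
BANNED = [
    "masterpiece", "4K", "8K", "ultra HD", "best quality", "highly detailed",
    "photorealistic", "ultra clear", "ultra realistic", "hyper realistic",
    "super detailed",
]

# Try longer phrases first so multi-word fillers are matched whole.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(w) for w in sorted(BANNED, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def strip_filler_sketch(prompt: str) -> str:
    text = _PATTERN.sub("", prompt)
    text = re.sub(r"\s*,(?:\s*,)+", ",", text)  # collapse runs of commas
    text = re.sub(r"\s{2,}", " ", text)         # collapse repeated spaces
    return text.strip(" ,")


cleaned = strip_filler_sketch(
    "A person walking through a city, cinematic, high quality, 4K, masterpiece"
)
print(cleaned)
```

Note that "cinematic" and "high quality" survive in this example because neither appears on the banned list.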

validate_broll_prompt() checks that a prompt contains all three categories:

  • Camera direction — focal length or camera movement term
  • Lighting — light source, behavior, or color tone term
  • Texture/material — film stock, bokeh, or organic imperfection term

If any category is missing, the assembler fills it in from the shot preset.
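
A rough sketch of the check-then-fill behavior is shown below. The keyword patterns, the `validate_sketch` and `fill_missing` names, and the preset keys are all assumptions made for illustration; the module's real term lists are not shown in this document. The return shape mirrors the `(passed, missing)` tuple that validate_broll_prompt() returns.

```python
import re

# Hypothetical keyword patterns, one per required category.
CATEGORY_PATTERNS = {
    "camera": re.compile(
        r"\d+mm|push-in|dolly|tracking|orbital|crane|handheld|rack focus", re.I
    ),
    "lighting": re.compile(
        r"\blight\b|backlight|sidelight|softbox|shadows?|golden-hour|haze", re.I
    ),
    "texture": re.compile(
        r"grain|bokeh|halation|vignett|kodak|fuji|cinestill|dust motes", re.I
    ),
}


def validate_sketch(prompt: str) -> tuple[bool, list[str]]:
    # A category is missing when none of its terms appear in the prompt.
    missing = [n for n, pat in CATEGORY_PATTERNS.items() if not pat.search(prompt)]
    return (not missing, missing)


def fill_missing(prompt: str, preset: dict[str, str]) -> str:
    # Append the preset text for each category the prompt lacks.
    _, missing = validate_sketch(prompt)
    sentences = [prompt.rstrip(".") + "."] + [preset[n] + "." for n in missing]
    return " ".join(sentences)


preset = {  # hypothetical preset keyed by category
    "camera": "35mm standard lens, handheld drift",
    "lighting": "Harsh overhead light with deep shadows",
    "texture": "Fine film grain",
}
raw = "Person looking stressed at their desk"
print(validate_sketch(raw))
print(fill_missing(raw, preset))
```

Keyword matching like this is deliberately coarse: it only confirms that each category is mentioned, not that the description is coherent.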

The continuity brief is a 1-2 sentence prefix prepended to every visual prompt in a video to maintain consistent aesthetics across all segments.

  1. Mood mapping: the video’s mood keyword (e.g., “high-energy”, “moody”, “documentary”) maps to a film stock and base palette:

     | Mood | Film Stock | Base Palette |
     |---|---|---|
     | high-energy, energetic | Kodak Vision3 500T | warm amber + teal shadow accents |
     | conversational, casual | Kodak Portra 400 | warm neutral + soft golden highlights |
     | serious, dark | ARRI ALEXA Mini LF | desaturated cool + neutral midtones |
     | moody, night | Cinestill 800T | cool blue-green + neon-warm highlights |
     | cinematic, dramatic | Kodak Vision3 500T | rich contrast + warm highlights + cool shadows |
     | inspirational | Kodak Portra 400 | golden warmth + soft diffused highlights |
     | educational, documentary | ARRI ALEXA Mini LF | naturalistic palette + muted tones |

  2. Dominant segment type: the most frequent b-roll segment type determines the color tone from its shot preset.

  3. Output: a string like:

Unified palette: warm amber base with teal shadow accents.
Consistent Kodak Vision3 500T grain across all shots.
Atmospheric haze present in every frame.

This brief is stored as script["_continuity_brief"] and prepended to:

  • Every b-roll generation prompt
  • Every talking-head generation prompt

The continuity brief provides textual consistency (color grading, grain, atmosphere). The keyframe grid adds visual consistency (pixel-level lighting geometry, character details). Both systems work together — the grid prompt itself includes the continuity brief.

The module is importable from any workflow:

from workflows.video._shot_design import (
    assemble_cinematic_prompt,
    validate_broll_prompt,
    strip_filler,
    get_shot_preset,
    generate_continuity_brief,
    SEGMENT_SHOT_LIBRARY,
)

# Enrich a plain description
prompt = assemble_cinematic_prompt(
    narration="The future of renewable energy",
    broll_desc="Solar panels stretching across a desert landscape",
    segment_type="solution",
    segment_index=2,
    total_segments=5,
    continuity_brief="Unified palette: golden warmth...",
)

# Check prompt quality
passed, missing = validate_broll_prompt(prompt)
if not passed:
    print(f"Missing: {missing}")

# Get raw presets
preset = get_shot_preset("hook")
print(preset["focal_length"])  # "24mm wide-angle lens"