Shot Design & Visual Consistency

Fabric uses a cinematography vocabulary inspired by Seedance 2.0 Shot Design to produce cinema-grade video from AI models. Instead of vague adjectives like “cinematic” or “high quality”, prompts use specific physical parameters — focal length in millimeters, 3-layer lighting descriptions, and named film stocks — that video models can execute precisely.

This system lives in workflows/video/_shot_design.py and is reusable by any workflow that generates video prompts.

The Problem with Generic Prompts

Most AI video prompts look like this:

A person walking through a city, cinematic, high quality, 4K, masterpiece

This produces generic, “AI-looking” video. The model has no physical parameters to work with — “cinematic” means nothing specific.

The Solution: Physical Parameters

Fabric enriches every b-roll prompt with physical description, following a 7-element formula:

Close-up of hands typing on a mechanical keyboard,
fingers drumming between keystrokes, subtle shift in posture.
85mm portrait lens, slow deliberate push-in.
Monitor glow casting blue-white light on face in dark room,
volumetric dust motes drifting through the beam —
intimate yet focused mood.
Shot on Kodak Portra 400, fine organic film grain,
individual hair strands catching backlight.
Faint hum of cooling fans in background atmosphere.

Physical parameters give the model concrete visual targets (a lens, a light source, a film stock) rather than abstract aesthetic goals, and the output changes noticeably as a result.

The 7-Element Formula

Every b-roll prompt follows this structure:

Element	Example
1. Subject + appearance	“Person in white linen sitting cross-legged”
2. Action + physics	“fingers resting loosely on knees with subtle breathing movement”
3. Scene/environment	“on wooden deck at sunrise, mist rising from lake”
4. Lighting (3-layer)	Source + behavior + color tone (see below)
5. Camera	Focal length in mm + movement phrase
6. Texture	Film stock or render anchor + organic imperfections
7. Micro-action + ambient detail	Subtle human movements + environmental sounds/details
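
As a rough illustration of how the seven elements combine into one prompt, here is a minimal sketch. The `ShotElements` class and its `to_prompt()` method are hypothetical and not part of Fabric; the lighting, camera, texture, and micro-detail strings are made up for the example, while the first three values come from the table above.

```python
from dataclasses import dataclass


@dataclass
class ShotElements:
    """Hypothetical container for the seven elements; not part of Fabric."""
    subject: str       # 1. subject + appearance
    action: str        # 2. action + physics
    scene: str         # 3. scene/environment
    lighting: str      # 4. lighting (source, behavior, color tone)
    camera: str        # 5. focal length + movement
    texture: str       # 6. film stock + imperfections
    micro_detail: str  # 7. micro-action + ambient detail

    def to_prompt(self) -> str:
        # Elements 1-3 read as one clause; the rest become their own sentences.
        opening = f"{self.subject}, {self.action}, {self.scene}"
        parts = [opening, self.lighting, self.camera, self.texture, self.micro_detail]
        return " ".join(p.strip().rstrip(".") + "." for p in parts)


shot = ShotElements(
    subject="Person in white linen sitting cross-legged",
    action="fingers resting loosely on knees with subtle breathing movement",
    scene="on wooden deck at sunrise, mist rising from lake",
    lighting="Low sunrise light across the water, mist catching the rays, golden-hour tone",
    camera="85mm portrait lens, slow push-in",
    texture="Shot on Kodak Portra 400, fine organic film grain",
    micro_detail="Hair caught by breeze, leaves trembling in wind",
)
print(shot.to_prompt())
```

Keeping each element in its own field makes it easy to see which of the seven slots a prompt leaves empty before it is sent to a video model.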

Segment Shot Library

Every b-roll segment has a narrative type that maps to a default cinematography preset:

Type	Focal Length	Camera Movement	Lighting	Film Stock
hook	24mm wide	dolly push-in	dramatic sidelight, volumetric shafts, teal/amber	Kodak Vision3 500T
problem	35mm standard	handheld drift, micro-shake	harsh overhead, deep shadows, cool blue-grey	ARRI ALEXA Mini LF
insight	85mm portrait	slow push-in	soft window light, warm rim separation, golden-hour	Kodak Portra 400
solution	50mm standard	slow orbital (45° arc)	balanced 3-point, god rays, warm amber	35mm film grain
demo	35mm standard	smooth tracking	clean softbox, rim light, neutral tone	Octane rendering
proof	85mm portrait	static + rack focus	natural window light, atmospheric haze, golden	Fuji Pro 400H
cta	24mm wide	slow crane up	dramatic backlight, lens flare, high-contrast warm	Cinestill 800T

3-Layer Lighting Model

Each preset defines lighting as three components:

Source — where the light comes from (e.g., “dramatic sidelight from screen-left”)
Behavior — how the light interacts with the scene (e.g., “volumetric light shafts cutting through atmospheric haze”)
Color Tone — the palette it creates (e.g., “high-contrast teal shadows with warm amber highlights”)

This maps directly to how cinematographers describe their lighting setups.
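
One way to picture a preset with its three lighting layers is as a typed dictionary. The sketch below is an assumption about layout, not the module's actual data structure: only the `focal_length` key appears in this page's usage example, and every other key name, along with `EXAMPLE_PRESETS` and `lighting_phrase()`, is hypothetical. The values come from the "hook" row and the lighting examples above.

```python
from typing import TypedDict


class ShotPreset(TypedDict):
    focal_length: str      # key name taken from this page's usage example
    camera_movement: str   # the remaining key names are assumptions
    light_source: str
    light_behavior: str
    color_tone: str
    film_stock: str


# Illustrative entry built from the "hook" row and the lighting examples above.
EXAMPLE_PRESETS: dict[str, ShotPreset] = {
    "hook": {
        "focal_length": "24mm wide-angle lens",
        "camera_movement": "dolly push-in",
        "light_source": "dramatic sidelight from screen-left",
        "light_behavior": "volumetric light shafts cutting through atmospheric haze",
        "color_tone": "high-contrast teal shadows with warm amber highlights",
        "film_stock": "Kodak Vision3 500T",
    },
}


def lighting_phrase(preset: ShotPreset) -> str:
    """Join the three lighting layers in source, behavior, tone order."""
    return ", ".join(
        [preset["light_source"], preset["light_behavior"], preset["color_tone"]]
    )


hook_lighting = lighting_phrase(EXAMPLE_PRESETS["hook"])
print(hook_lighting)
```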

Diegetic Lighting

The system prefers light sources that exist within the scene (diegetic) over external “studio” descriptions. Diegetic sources feel more authentic and give the model concrete physical objects to render.

Instead of	Use
“cinematic lighting”	“grill flames casting warm flicker on skin”
“dramatic side light”	“laptop screen glow illuminating face in dark room”
“neon lighting”	“neon sign reflections on wet pavement”
“warm lighting”	“candle flame dancing shadows on wall”

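Because the right diegetic replacement depends on what is in the scene, a simple lookup cannot rewrite these phrases automatically. A small lint step can at least flag them for rewriting. The sketch below is hypothetical (`GENERIC_LIGHTING` and `find_generic_lighting()` are not names from the module) and only checks the four generic phrases listed in the table above.

```python
# Generic phrases from the "Instead of" column; hypothetical helper, not Fabric's API.
GENERIC_LIGHTING = (
    "cinematic lighting",
    "dramatic side light",
    "neon lighting",
    "warm lighting",
)


def find_generic_lighting(prompt: str) -> list[str]:
    """Return the generic lighting phrases present in the prompt, in table order."""
    lowered = prompt.lower()
    return [term for term in GENERIC_LIGHTING if term in lowered]


flagged = find_generic_lighting("Chef at the grill, cinematic lighting, warm lighting")
print(flagged)
```
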
Mood Duality

Prompts specify two contrasting moods for richer atmosphere, rather than a single adjective:

“moody yet vibrant”
“intimate but energetic”
“gritty yet beautiful”
“chaotic yet focused”
“raw yet polished”

This tension creates more visually interesting output than flat single-mood descriptions.

Micro-Actions & Ambient Detail

The 7th element adds subtle movements and environmental details that counter the static, lifeless look common in AI-generated video:

Human micro-actions:

“fingers drumming on desk”
“shifting weight from one foot to the other”
“steam rising from coffee cup as hand reaches for it”
“hair caught by breeze”

Environmental ambient details:

“dust motes drifting through light beam”
“condensation sliding down glass”
“leaves trembling in wind”
“distant city sounds bleeding in”

Organic Imperfections

Every preset includes intentional imperfections that prevent the “too-clean AI look”:

Film grain (fine, subtle, or heavy depending on stock)
Dust motes drifting through light shafts
Lens flare on highlights
Bokeh circles in out-of-focus areas
Halation on bright edges (especially Cinestill 800T)
Natural optical vignetting

Prompt Assembly Pipeline

The assemble_cinematic_prompt() function enriches a raw b-roll description in 5 steps:

Raw input:  "Person looking stressed at their desk"

1. Strip banned filler words (masterpiece, 4K, 8K, ultra HD, etc.)
2. Determine segment type (hook/problem/insight/solution/demo/proof/cta)
3. Check if LLM already included camera/lighting/material terms
4. If generic → layer in shot preset (focal length + camera + lighting + film stock)
5. Prepend continuity brief

Banned Filler Words

These terms carry no physical information and tend to degrade AI video output, so they are stripped automatically:

masterpiece, 4K, 8K, ultra HD, best quality, highly detailed, photorealistic, ultra clear, ultra realistic, hyper realistic, super detailed
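
The stripping step could be implemented roughly as follows. This is a sketch under assumptions, not the module's actual `strip_filler`: the name `strip_filler_sketch`, the case-insensitive matching, and the comma cleanup are all illustrative choices. The term list is copied from above.

```python
import re

BANNED_FILLER = [
    "masterpiece", "4K", "8K", "ultra HD", "best quality", "highly detailed",
    "photorealistic", "ultra clear", "ultra realistic", "hyper realistic",
    "super detailed",
]

# Longest terms first so multi-word phrases are tried before shorter ones.
_FILLER_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(t) for t in sorted(BANNED_FILLER, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def strip_filler_sketch(prompt: str) -> str:
    """Remove banned terms, then tidy the punctuation they leave behind."""
    cleaned = _FILLER_RE.sub("", prompt)
    # Collapse empty comma slots, e.g. "city, , b" -> "city, b".
    cleaned = re.sub(r"\s*,(?:\s*,)+", ",", cleaned)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned.strip(" ,")


generic = "A person walking through a city, cinematic, high quality, 4K, masterpiece"
print(strip_filler_sketch(generic))
```

Note that with the list as given, "cinematic" and "high quality" are not banned terms, so they survive this step in the generic prompt from earlier on this page; only "4K" and "masterpiece" are removed.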

Prompt Validation

validate_broll_prompt() checks that a prompt contains all three categories:

Camera direction — focal length or camera movement term
Lighting — light source, behavior, or color tone term
Texture/material — film stock, bokeh, or organic imperfection term

If any category is missing, the assembler fills it in from the shot preset.
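
The check-then-fill behavior can be sketched as below. The keyword patterns are hypothetical stand-ins (the module's real vocabulary is not shown on this page), and `validate_sketch`, `fill_missing`, and `PROBLEM_FILL` are illustrative names. The `(passed, missing)` return shape follows the usage example at the end of this page; the fill text is taken from the "problem" row of the shot library.

```python
import re

# Hypothetical vocabularies; substring matching like this is deliberately naive.
CATEGORY_PATTERNS = {
    "camera": re.compile(
        r"\b\d{2,3}mm\b|push-in|tracking|orbital|crane|handheld|rack focus|dolly", re.I
    ),
    "lighting": re.compile(r"light|glow|shadow|haze|flare", re.I),
    "texture": re.compile(
        r"kodak|fuji|cinestill|\barri\b|grain|bokeh|halation|vignett", re.I
    ),
}

# Fill text taken from the "problem" row of the shot library table.
PROBLEM_FILL = {
    "camera": "35mm standard lens, handheld drift with micro-shake",
    "lighting": "Harsh overhead light, deep shadows, cool blue-grey tone",
    "texture": "Shot on ARRI ALEXA Mini LF",
}


def validate_sketch(prompt: str) -> tuple[bool, list[str]]:
    """Return (passed, missing categories)."""
    missing = [
        name for name, pattern in CATEGORY_PATTERNS.items()
        if not pattern.search(prompt)
    ]
    return (not missing, missing)


def fill_missing(prompt: str, fill: dict[str, str]) -> str:
    """Append preset text only for the categories the prompt lacks."""
    _, missing = validate_sketch(prompt)
    sentences = [prompt.rstrip(". ")] + [fill[name] for name in missing]
    return ". ".join(sentences) + "."


raw = "Person looking stressed at their desk"
filled = fill_missing(raw, PROBLEM_FILL)
print(validate_sketch(raw))
print(filled)
```

Filling only the missing categories keeps whatever camera, lighting, or texture language the LLM already wrote, so a prompt that is already specific passes through unchanged apart from a trailing period.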

Visual Continuity Brief

The continuity brief is a short prefix (a few sentences) prepended to every visual prompt in a video to maintain consistent aesthetics across all segments.

How It Works

Mood mapping — The video’s mood keyword (e.g., “high-energy”, “moody”, “documentary”) maps to a film stock and base palette:

Mood	Film Stock	Base Palette
high-energy, energetic	Kodak Vision3 500T	warm amber + teal shadow accents
conversational, casual	Kodak Portra 400	warm neutral + soft golden highlights
serious, dark	ARRI ALEXA Mini LF	desaturated cool + neutral midtones
moody, night	Cinestill 800T	cool blue-green + neon-warm highlights
cinematic, dramatic	Kodak Vision3 500T	rich contrast + warm highlights + cool shadows
inspirational	Kodak Portra 400	golden warmth + soft diffused highlights
educational, documentary	ARRI ALEXA Mini LF	naturalistic palette + muted tones

Dominant segment type — The most frequent b-roll segment type determines the color tone from its shot preset.
Output — A string like:

Unified palette: warm amber base with teal shadow accents.
Consistent Kodak Vision3 500T grain across all shots.
Atmospheric haze present in every frame.

This brief is stored as script["_continuity_brief"] and prepended to:

Every b-roll generation prompt
Every talking-head generation prompt
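
The mood lookup and dominant-type selection described above can be sketched as follows. This is not the module's `generate_continuity_brief`: the function name, the exact sentence wording, the `mood` and `segment_types` keys on `script`, and the handling of unknown moods are all assumptions. The mood rows and color tones are copied from the tables on this page, and `script["_continuity_brief"]` is the key named above.

```python
from collections import Counter

# (mood keywords, film stock, base palette), copied from the mood table above.
_MOOD_ROWS = [
    (("high-energy", "energetic"), "Kodak Vision3 500T", "warm amber + teal shadow accents"),
    (("conversational", "casual"), "Kodak Portra 400", "warm neutral + soft golden highlights"),
    (("serious", "dark"), "ARRI ALEXA Mini LF", "desaturated cool + neutral midtones"),
    (("moody", "night"), "Cinestill 800T", "cool blue-green + neon-warm highlights"),
    (("cinematic", "dramatic"), "Kodak Vision3 500T", "rich contrast + warm highlights + cool shadows"),
    (("inspirational",), "Kodak Portra 400", "golden warmth + soft diffused highlights"),
    (("educational", "documentary"), "ARRI ALEXA Mini LF", "naturalistic palette + muted tones"),
]
MOOD_MAP = {kw: (stock, palette) for kws, stock, palette in _MOOD_ROWS for kw in kws}

# Color tone per segment type, from the shot library table.
COLOR_TONES = {
    "hook": "teal/amber", "problem": "cool blue-grey", "insight": "golden-hour",
    "solution": "warm amber", "demo": "neutral", "proof": "golden",
    "cta": "high-contrast warm",
}


def continuity_brief_sketch(mood: str, segment_types: list[str]) -> str:
    stock, palette = MOOD_MAP[mood]  # unknown moods raise KeyError in this sketch
    # most_common(1) returns the most frequent type; ties go to the first seen.
    dominant = Counter(segment_types).most_common(1)[0][0]
    return (
        f"Unified palette: {palette}. "
        f"Consistent {stock} grain across all shots. "
        f"Dominant color tone: {COLOR_TONES[dominant]}."
    )


script = {"mood": "moody", "segment_types": ["hook", "problem", "problem", "cta"]}
script["_continuity_brief"] = continuity_brief_sketch(
    script["mood"], script["segment_types"]
)

# Prepend the brief to a visual prompt.
broll_prompt = f'{script["_continuity_brief"]} Neon sign reflections on wet pavement.'
print(broll_prompt)
```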

Relationship to Keyframe Grid

The continuity brief provides textual consistency (color grading, grain, atmosphere). The keyframe grid adds visual consistency (pixel-level lighting geometry, character details). Both systems work together — the grid prompt itself includes the continuity brief.

Using Shot Design in Custom Workflows

The module is importable from any workflow:

from workflows.video._shot_design import (
    assemble_cinematic_prompt,
    validate_broll_prompt,
    strip_filler,
    get_shot_preset,
    generate_continuity_brief,
    SEGMENT_SHOT_LIBRARY,
)

# Enrich a plain description
prompt = assemble_cinematic_prompt(
    narration="The future of renewable energy",
    broll_desc="Solar panels stretching across a desert landscape",
    segment_type="solution",
    segment_index=2,
    total_segments=5,
    continuity_brief="Unified palette: golden warmth...",
)

# Check prompt quality
passed, missing = validate_broll_prompt(prompt)
if not passed:
    print(f"Missing: {missing}")

# Get raw presets
preset = get_shot_preset("hook")
print(preset["focal_length"])  # "24mm wide-angle lens"