Shot Design & Visual Consistency
Fabric uses a cinematography vocabulary inspired by Seedance 2.0 Shot Design to produce cinema-grade video from AI models. Instead of vague adjectives like “cinematic” or “high quality”, prompts use specific physical parameters — focal length in millimeters, 3-layer lighting descriptions, and named film stocks — that video models can execute precisely.
This system lives in workflows/video/_shot_design.py and is reusable by any workflow that generates video prompts.
The Problem with Generic Prompts
Most AI video prompts look like this:

    A person walking through a city, cinematic, high quality, 4K, masterpiece

This produces generic, “AI-looking” video. The model has no physical parameters to work with, so “cinematic” means nothing specific.
The Solution: Physical Parameters
Fabric enriches every b-roll prompt with a 7-element formula of physical description:

    Close-up of hands typing on a mechanical keyboard,
    fingers drumming between keystrokes, subtle shift in posture.
    85mm portrait lens, slow deliberate push-in.
    Monitor glow casting blue-white light on face in dark room,
    volumetric dust motes drifting through the beam —
    intimate yet focused mood.
    Shot on Kodak Portra 400, fine organic film grain,
    individual hair strands catching backlight.
    Faint hum of cooling fans in background atmosphere.

The difference in output quality is dramatic. Physical parameters give the model concrete visual targets rather than abstract aesthetic goals.
The 7-Element Formula
Every b-roll prompt follows this structure:
| Element | Example |
|---|---|
| 1. Subject + appearance | “Person in white linen sitting cross-legged” |
| 2. Action + physics | “fingers resting loosely on knees with subtle breathing movement” |
| 3. Scene/environment | “on wooden deck at sunrise, mist rising from lake” |
| 4. Lighting (3-layer) | Source + behavior + color tone (see below) |
| 5. Camera | Focal length in mm + movement phrase |
| 6. Texture | Film stock or render anchor + organic imperfections |
| 7. Micro-action + ambient detail | Subtle human movements + environmental sounds/details |
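As a rough illustration of how the seven elements could be combined into one prompt string, here is a minimal sketch. The `ShotElements` class and its `to_prompt()` method are hypothetical names invented for this example and are not part of Fabric; the lighting, camera, texture, and micro-detail strings are illustrative values, while the first three fields reuse the examples from the table above.

```python
from dataclasses import dataclass


@dataclass
class ShotElements:
    """Hypothetical container for the seven elements (not Fabric's API)."""

    subject: str
    action: str
    scene: str
    lighting: str
    camera: str
    texture: str
    micro_detail: str

    def to_prompt(self) -> str:
        # Subject, action, and scene read as one opening sentence;
        # the remaining elements follow as separate sentences.
        opening = f"{self.subject}, {self.action}, {self.scene}"
        parts = [opening, self.lighting, self.camera, self.texture, self.micro_detail]
        return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


shot = ShotElements(
    subject="Person in white linen sitting cross-legged",
    action="fingers resting loosely on knees with subtle breathing movement",
    scene="on wooden deck at sunrise, mist rising from lake",
    lighting="Low sun casting long warm light across the deck, golden-hour tone",
    camera="85mm portrait lens, slow push-in",
    texture="Shot on Kodak Portra 400, fine organic film grain",
    micro_detail="Hair caught by breeze, leaves trembling in wind",
)
prompt = shot.to_prompt()
print(prompt)
```

Keeping each element as a separate field makes it easy to see which of the seven slots a given prompt leaves empty.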
Segment Shot Library
Every b-roll segment has a narrative type that maps to a default cinematography preset:
| Type | Focal Length | Camera Movement | Lighting | Film Stock |
|---|---|---|---|---|
| hook | 24mm wide | dolly push-in | dramatic sidelight, volumetric shafts, teal/amber | Kodak Vision3 500T |
| problem | 35mm standard | handheld drift, micro-shake | harsh overhead, deep shadows, cool blue-grey | ARRI ALEXA Mini LF |
| insight | 85mm portrait | slow push-in | soft window light, warm rim separation, golden-hour | Kodak Portra 400 |
| solution | 50mm standard | slow orbital (45° arc) | balanced 3-point, god rays, warm amber | 35mm film grain |
| demo | 35mm standard | smooth tracking | clean softbox, rim light, neutral tone | Octane rendering |
| proof | 85mm portrait | static + rack focus | natural window light, atmospheric haze, golden | Fuji Pro 400H |
| cta | 24mm wide | slow crane up | dramatic backlight, lens flare, high-contrast warm | Cinestill 800T |
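The table above can be pictured as a dictionary keyed by segment type. The sketch below is an assumption about shape only: `PRESETS`, `ShotPreset`, and `lookup_preset` are hypothetical names, the key names other than `focal_length` are guesses, and the fallback to `insight` for unknown types is an invented policy. The actual `SEGMENT_SHOT_LIBRARY` may store longer strings than the abbreviated table cells used here.

```python
from typing import TypedDict


class ShotPreset(TypedDict):
    focal_length: str
    camera_movement: str
    lighting: str
    film_stock: str


# Rows copied from the table above: (type, focal length, movement, lighting, stock).
_ROWS = [
    ("hook", "24mm wide", "dolly push-in",
     "dramatic sidelight, volumetric shafts, teal/amber", "Kodak Vision3 500T"),
    ("problem", "35mm standard", "handheld drift, micro-shake",
     "harsh overhead, deep shadows, cool blue-grey", "ARRI ALEXA Mini LF"),
    ("insight", "85mm portrait", "slow push-in",
     "soft window light, warm rim separation, golden-hour", "Kodak Portra 400"),
    ("solution", "50mm standard", "slow orbital",
     "balanced 3-point, god rays, warm amber", "35mm film grain"),
    ("demo", "35mm standard", "smooth tracking",
     "clean softbox, rim light, neutral tone", "Octane rendering"),
    ("proof", "85mm portrait", "static + rack focus",
     "natural window light, atmospheric haze, golden", "Fuji Pro 400H"),
    ("cta", "24mm wide", "slow crane up",
     "dramatic backlight, lens flare, high-contrast warm", "Cinestill 800T"),
]

PRESETS: dict[str, ShotPreset] = {
    name: ShotPreset(
        focal_length=focal, camera_movement=move, lighting=light, film_stock=stock
    )
    for name, focal, move, light, stock in _ROWS
}


def lookup_preset(segment_type: str, default: str = "insight") -> ShotPreset:
    """Return the preset for a segment type; unknown types use `default` (assumed policy)."""
    return PRESETS.get(segment_type.strip().lower(), PRESETS[default])


print(lookup_preset("hook")["focal_length"])
```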
3-Layer Lighting Model
Each preset defines lighting as three components:
- Source — where the light comes from (e.g., “dramatic sidelight from screen-left”)
- Behavior — how the light interacts with the scene (e.g., “volumetric light shafts cutting through atmospheric haze”)
- Color Tone — the palette it creates (e.g., “high-contrast teal shadows with warm amber highlights”)
This maps directly to how cinematographers describe their lighting setups.
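One way to keep the three layers separate until the prompt is assembled is a small value object. This is a sketch only: the `Lighting` class and `describe()` method are hypothetical names, and the field values are the examples quoted in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lighting:
    """Hypothetical three-layer lighting description (not Fabric's API)."""

    source: str      # where the light comes from
    behavior: str    # how it interacts with the scene
    color_tone: str  # the palette it creates

    def describe(self) -> str:
        # Join the layers in source, behavior, tone order.
        return f"{self.source}, {self.behavior}, {self.color_tone}"


hook_lighting = Lighting(
    source="dramatic sidelight from screen-left",
    behavior="volumetric light shafts cutting through atmospheric haze",
    color_tone="high-contrast teal shadows with warm amber highlights",
)
print(hook_lighting.describe())
```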
Diegetic Lighting
The system prefers light sources that exist within the scene (diegetic) over external “studio” descriptions. Diegetic sources feel more authentic and give the model concrete physical objects to render.
| Instead of | Use |
|---|---|
| “cinematic lighting” | “grill flames casting warm flicker on skin” |
| “dramatic side light” | “laptop screen glow illuminating face in dark room” |
| “neon lighting” | “neon sign reflections on wet pavement” |
| “warm lighting” | “candle flame dancing shadows on wall” |
Mood Duality
Prompts specify two contrasting moods for richer atmosphere, rather than a single adjective:
- “moody yet vibrant”
- “intimate but energetic”
- “gritty yet beautiful”
- “chaotic yet focused”
- “raw yet polished”
This tension creates more visually interesting output than flat single-mood descriptions.
Micro-Actions & Ambient Detail
The 7th element adds subtle movements and environmental details that prevent the static, lifeless look of AI-generated video:
Human micro-actions:
- “fingers drumming on desk”
- “shifting weight from one foot to the other”
- “steam rising from coffee cup as hand reaches for it”
- “hair caught by breeze”
Environmental ambient details:
- “dust motes drifting through light beam”
- “condensation sliding down glass”
- “leaves trembling in wind”
- “distant city sounds bleeding in”
Organic Imperfections
Every preset includes intentional imperfections that prevent the “too-clean AI look”:
- Film grain (fine, subtle, or heavy depending on stock)
- Dust motes drifting through light shafts
- Lens flare on highlights
- Bokeh circles in out-of-focus areas
- Halation on bright edges (especially Cinestill 800T)
- Natural optical vignetting
Prompt Assembly Pipeline
The assemble_cinematic_prompt() function enriches a raw b-roll description in 5 steps:
Raw input: "Person looking stressed at their desk"

1. Strip banned filler words (masterpiece, 4K, 8K, ultra HD, etc.)
2. Determine segment type (hook/problem/insight/solution/demo/proof/cta)
3. Check if LLM already included camera/lighting/material terms
4. If generic → layer in shot preset (focal length + camera + lighting + film stock)
5. Prepend continuity brief

Banned Filler Words
These words actively degrade AI video quality and are automatically stripped:
masterpiece, 4K, 8K, ultra HD, best quality, highly detailed, photorealistic, ultra clear, ultra realistic, hyper realistic, super detailed
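The stripping step could be approximated with a case-insensitive regular expression built from the list above. This is a sketch under assumptions, not the module's `strip_filler()` implementation: `remove_filler` is a hypothetical name, and the comma and whitespace cleanup rules are guesses at reasonable behavior.

```python
import re

BANNED = [
    "masterpiece", "4K", "8K", "ultra HD", "best quality", "highly detailed",
    "photorealistic", "ultra clear", "ultra realistic", "hyper realistic",
    "super detailed",
]

# Longest phrases first so multi-word entries are tried before shorter ones.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(w) for w in sorted(BANNED, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def remove_filler(prompt: str) -> str:
    """Drop banned filler terms, then tidy the leftover commas and spaces."""
    cleaned = _PATTERN.sub("", prompt)
    cleaned = re.sub(r"\s+", " ", cleaned)         # collapse runs of whitespace
    cleaned = re.sub(r"\s+,", ",", cleaned)        # " ," becomes ","
    cleaned = re.sub(r",(?:\s*,)+", ",", cleaned)  # repeated commas become one
    return cleaned.strip(" ,")


print(remove_filler(
    "A person walking through a city, cinematic, high quality, 4K, masterpiece"
))
```

Note that “cinematic” and “high quality” survive in this sketch because neither appears in the banned list above.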
Prompt Validation
validate_broll_prompt() checks that a prompt contains all three categories:
- Camera direction — focal length or camera movement term
- Lighting — light source, behavior, or color tone term
- Texture/material — film stock, bokeh, or organic imperfection term
If any category is missing, the assembler fills it in from the shot preset.
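A keyword-based version of this check and the fill-in step might look like the sketch below. Everything here is assumed for illustration: `check_prompt`, `fill_missing`, the keyword patterns, and the preset keyed by category name are hypothetical, and the patterns are a crude heuristic rather than the vocabulary the real validator uses. Only the `(passed, missing)` return shape is taken from the usage example in this document.

```python
import re

# Crude keyword heuristics per category (assumed, not Fabric's actual term lists).
CATEGORY_PATTERNS = {
    "camera": r"\b\d{2,3}mm\b|push-in|dolly|tracking|orbital|crane|handheld|rack focus",
    "lighting": r"light|glow|shadow|haze",
    "texture": r"grain|bokeh|halation|vignett|lens flare|kodak|fuji|cinestill|\barri\b|octane",
}


def check_prompt(prompt: str) -> tuple[bool, list[str]]:
    """Return (passed, missing) where missing lists categories with no keyword hit."""
    missing = [
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if not re.search(pattern, prompt, re.IGNORECASE)
    ]
    return (not missing, missing)


def fill_missing(prompt: str, missing: list[str], preset: dict[str, str]) -> str:
    """Append the preset fragment for each missing category as its own sentence."""
    extras = [preset[name] for name in missing if name in preset]
    return " ".join(s.strip().rstrip(".") + "." for s in [prompt, *extras])


# Fragments based on the "problem" row of the shot library, keyed by category.
problem_preset = {
    "camera": "35mm standard lens, handheld drift with micro-shake",
    "lighting": "harsh overhead light, deep shadows, cool blue-grey tone",
    "texture": "ARRI ALEXA Mini LF",
}

raw = "Person looking stressed at their desk"
passed, missing = check_prompt(raw)
enriched = fill_missing(raw, missing, problem_preset)
print(enriched)
```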
Visual Continuity Brief
The continuity brief is a 1-2 sentence prefix prepended to every visual prompt in a video to maintain consistent aesthetics across all segments.
How It Works
- Mood mapping: the video’s mood keyword (e.g., “high-energy”, “moody”, “documentary”) maps to a film stock and base palette:
| Mood | Film Stock | Base Palette |
|---|---|---|
| high-energy, energetic | Kodak Vision3 500T | warm amber + teal shadow accents |
| conversational, casual | Kodak Portra 400 | warm neutral + soft golden highlights |
| serious, dark | ARRI ALEXA Mini LF | desaturated cool + neutral midtones |
| moody, night | Cinestill 800T | cool blue-green + neon-warm highlights |
| cinematic, dramatic | Kodak Vision3 500T | rich contrast + warm highlights + cool shadows |
| inspirational | Kodak Portra 400 | golden warmth + soft diffused highlights |
| educational, documentary | ARRI ALEXA Mini LF | naturalistic palette + muted tones |
- Dominant segment type: the most frequent b-roll segment type determines the color tone from its shot preset.
- Output: a string like:

    Unified palette: warm amber base with teal shadow accents.
    Consistent Kodak Vision3 500T grain across all shots.
    Atmospheric haze present in every frame.

This brief is stored as script["_continuity_brief"] and prepended to:
- Every b-roll generation prompt
- Every talking-head generation prompt
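The mood lookup and the dominant-type count can be sketched as follows. This is an assumption about the mechanics, not the module's `generate_continuity_brief()`: `MOOD_MAP`, `build_brief`, and `dominant_segment_type` are hypothetical names, the fallback mood is invented, and the sketch leaves out how the dominant type's color tone is merged into the final brief because the source does not spell that out.

```python
from collections import Counter

# (mood keywords, film stock, base palette), copied from the mood table above.
_MOOD_ROWS = [
    (("high-energy", "energetic"), "Kodak Vision3 500T", "warm amber + teal shadow accents"),
    (("conversational", "casual"), "Kodak Portra 400", "warm neutral + soft golden highlights"),
    (("serious", "dark"), "ARRI ALEXA Mini LF", "desaturated cool + neutral midtones"),
    (("moody", "night"), "Cinestill 800T", "cool blue-green + neon-warm highlights"),
    (("cinematic", "dramatic"), "Kodak Vision3 500T",
     "rich contrast + warm highlights + cool shadows"),
    (("inspirational",), "Kodak Portra 400", "golden warmth + soft diffused highlights"),
    (("educational", "documentary"), "ARRI ALEXA Mini LF", "naturalistic palette + muted tones"),
]

MOOD_MAP = {
    mood: (stock, palette) for moods, stock, palette in _MOOD_ROWS for mood in moods
}


def dominant_segment_type(segment_types: list[str]) -> str:
    """Most frequent segment type; ties go to the type seen first."""
    return Counter(segment_types).most_common(1)[0][0]


def build_brief(mood: str, default_mood: str = "conversational") -> str:
    """Build a two-sentence brief; unknown moods use `default_mood` (assumed policy)."""
    stock, palette = MOOD_MAP.get(mood.strip().lower(), MOOD_MAP[default_mood])
    return f"Unified palette: {palette}. Consistent {stock} grain across all shots."


brief = build_brief("moody")
full_prompt = f"{brief} Close-up of hands typing on a mechanical keyboard."
print(dominant_segment_type(["hook", "insight", "insight", "cta"]))
print(full_prompt)
```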
Relationship to Keyframe Grid
The continuity brief provides textual consistency (color grading, grain, atmosphere). The keyframe grid adds visual consistency (pixel-level lighting geometry, character details). Both systems work together: the grid prompt itself includes the continuity brief.
Using Shot Design in Custom Workflows
The module is importable from any workflow:

    from workflows.video._shot_design import (
        assemble_cinematic_prompt,
        validate_broll_prompt,
        strip_filler,
        get_shot_preset,
        generate_continuity_brief,
        SEGMENT_SHOT_LIBRARY,
    )

    # Enrich a plain description
    prompt = assemble_cinematic_prompt(
        narration="The future of renewable energy",
        broll_desc="Solar panels stretching across a desert landscape",
        segment_type="solution",
        segment_index=2,
        total_segments=5,
        continuity_brief="Unified palette: golden warmth...",
    )

    # Check prompt quality
    passed, missing = validate_broll_prompt(prompt)
    if not passed:
        print(f"Missing: {missing}")

    # Get raw presets
    preset = get_shot_preset("hook")
    print(preset["focal_length"])  # "24mm wide-angle lens"