Long-Form Video Pipeline
The Long-Form Video pipeline generates YouTube-ready videos from 5 to 120+ minutes. It produces chapter-based output with visual consistency, background music, subtitles, thumbnails, and YouTube metadata with chapter timestamps.
At the premium and ultra quality tiers, the pipeline uses Seedance 2.0 for cinema-grade video generation with native audio and lip-sync.
Pipeline Architecture
```text
                route_script_input
                         │
           ┌─────────────┼─────────────┐
           │             │             │
        [topic]     [synopsis]   [full_script]
        research    research     parse into
        → structure → expand     acts
        → generate  → score
        → score
           │             │             │
           └─────────────┼─────────────┘
                         │
        estimate_cost (--dry-run stops here)
                         │
        generate_anchor_keyframes (Seedance refs)
                         │
        generate_scene_backgrounds (green screen)
                         │
        produce_chapters (parallel × 3)
          ├─ voiceover (locked voice)
          ├─ b-roll OR green screen composite
          └─ assembly
                         │
        apply_scene_layers (VFX layer editing)
                         │
        generate_chapter_transitions (Seedance i2v)
                         │
        concatenate_all
                         │
        add_background_music (speech ducking)
                         │
        transcribe_and_subtitle
                         │
        video_output_gate (ffprobe validation)
                         │
        generate_thumbnail + youtube_metadata
                         │
        collect_long_form_output
```
Quick Start
```bash
# Topic mode: research + script generation
fab video/long-form "The Rise and Fall of Theranos" --structure documentary --duration 20min

# Synopsis mode: expand a story idea
fab video/long-form --story "A programmer discovers a security flaw in the world's largest bank" \
  --structure hero_journey --duration 15min

# Full script mode: just produce the video
fab video/long-form --story @my_script.txt --quality premium

# Topic + story angle
fab video/long-form "Chernobyl disaster" \
  --story "From the perspective of a firefighter first on scene" \
  --structure documentary --duration 25min

# Dry run: estimate cost without producing
fab video/long-form "SpaceX" --dry-run --quality premium --duration 10min
```

```python
from fabric_platform import FabricClient

fabric = FabricClient()
run = fabric.run_workflow("video/long-form", input={
    "topic": "The Rise and Fall of Theranos",
    "structure": "documentary",
    "duration_target": "20min",
    "quality": "premium",
    "mood": "dramatic",
})
result = fabric.wait_for_run(run["id"])
print(result["output"]["video_path"])
print(result["output"]["youtube_metadata"]["youtube_title"])
```

```typescript
import { FabricClient } from "@fabric-platform/sdk";

const fabric = new FabricClient();

const run = await fabric.workflows.runs.submit("video/long-form", {
  topic: "The Rise and Fall of Theranos",
  structure: "documentary",
  duration_target: "20min",
  quality: "premium",
});
const result = await fabric.workflows.runs.waitForRun(run.id);
console.log(result.output.video_path);
```
Three Input Modes
| Mode | Trigger | What happens |
|---|---|---|
| Topic | Only topic provided | Full pipeline: deep research, structure selection, script generation |
| Synopsis | story < 500 words | Research supplements the story idea, LLM expands into full script |
| Full Script | story > 500 words | Script parsed into acts, video produced directly (no research) |
When both topic and story are provided, topic provides the research subject and story acts as the narrative angle.
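The routing rule in the table can be sketched as a small function. This is only an illustration, not the pipeline's own `route_script_input` code: the function name `detect_input_mode` is hypothetical, words are counted by splitting on whitespace, and a story of exactly 500 words is assumed to count as a full script because the table does not say which side of the boundary it falls on.

```python
from typing import Optional

SYNOPSIS_WORD_LIMIT = 500  # threshold taken from the Three Input Modes table


def detect_input_mode(topic: Optional[str] = None, story: Optional[str] = None) -> str:
    """Return "topic", "synopsis", or "full_script" for the given inputs."""
    if not story:
        if not topic:
            raise ValueError("provide a topic, a story, or both")
        return "topic"
    word_count = len(story.split())
    # Assumption: exactly 500 words is treated as a full script.
    return "synopsis" if word_count < SYNOPSIS_WORD_LIMIT else "full_script"


print(detect_input_mode(topic="The Rise and Fall of Theranos"))  # topic
print(detect_input_mode(story="A programmer discovers a flaw"))  # synopsis
print(detect_input_mode(story="word " * 800))                    # full_script
```

When a topic accompanies the story, this sketch still classifies by story length; the topic only supplies the research subject.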
Script Structures
| Structure | Description | Best for |
|---|---|---|
| `documentary` | Rise-fall-aftermath with cold open | Biographical, institutional stories |
| `hero_journey` | Monomyth transformation arc | Personal stories, startup journeys |
| `true_crime` | Mystery pacing with layered revelations | Crime, investigation, conspiracy |
| `listicle` | Ranked items with escalating interest | Top-N lists, comparisons |
| `problem_solution` | Pain, failed approaches, solution | Tutorials, how-to content |
| `explainer` | Layered complexity with aha moments | Science, economics, complex topics |
| `auto` (default) | LLM selects best fit for your topic | When unsure |
Visual Controls
| Parameter | Values | Default | Description |
|---|---|---|---|
| `aspect_ratio` | 16:9, 9:16 | 16:9 | Landscape (YouTube) or portrait (Shorts) |
| `broll_density` | low, medium, high, all_broll | medium | Presenter vs b-roll ratio |
| `visual_style` | any string | auto | Visual aesthetic direction |
| `presenter_look` | any string | none | Actor description (requires --talking-heads) |
| `include_talking_heads` | flag | false | Add AI presenter segments |
B-roll density controls how much screen time goes to the presenter vs b-roll footage:
- `low`: 80% presenter, 20% b-roll (talking head style)
- `medium`: 50/50 balanced
- `high`: 80% b-roll, 20% presenter
- `all_broll`: pure b-roll narration (documentary style, no presenter)
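To make the ratios concrete, the sketch below converts a density setting into approximate minutes of presenter and b-roll footage. The percentages come from the list above; the helper name `screen_time_split` and the idea of applying the ratio to the whole target duration are assumptions for illustration, since the pipeline may balance screen time per chapter instead.

```python
# Share of screen time given to b-roll for each density setting.
BROLL_SHARE = {"low": 0.20, "medium": 0.50, "high": 0.80, "all_broll": 1.00}


def screen_time_split(duration_min: float, broll_density: str = "medium") -> dict:
    """Split a target duration into presenter and b-roll minutes."""
    share = BROLL_SHARE[broll_density]  # raises KeyError on an unknown setting
    broll = duration_min * share
    return {
        "presenter_min": round(duration_min - broll, 2),
        "broll_min": round(broll, 2),
    }


print(screen_time_split(20, "low"))  # {'presenter_min': 16.0, 'broll_min': 4.0}
```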
Seedance 2.0 Provider
At premium and ultra quality tiers, the pipeline uses ByteDance’s Seedance 2.0 for video generation. Key advantages:
- Multi-reference input — up to 12 reference images for visual consistency across chapters
- Native audio generation — synchronized audio + video in a single generation
- Phoneme-perfect lip-sync in 8+ languages
- Video extension — smooth transitions between chapters
- Second-by-second timeline prompts — frame-level control within each clip
Anchor keyframes are generated before chapter production and passed as references to every Seedance call, maintaining visual DNA across a 20+ minute video.
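One way to picture the reference handling is a shared, capped reference list attached to every chapter request. This is a sketch under assumptions, not the pipeline's implementation: the function names, the `prompt` and `reference_images` keys, and the choice to keep the first 12 unique keyframes are all hypothetical. Only the limit of 12 comes from the text above.

```python
from typing import Dict, List

MAX_REFERENCES = 12  # multi-reference limit stated above


def build_reference_set(keyframe_paths: List[str], limit: int = MAX_REFERENCES) -> List[str]:
    """Drop duplicate keyframes (keeping order) and cap the list at the limit."""
    unique = list(dict.fromkeys(keyframe_paths))
    # Assumption: when there are too many keyframes, the earliest ones win.
    return unique[:limit]


def chapter_requests(chapter_prompts: List[str], keyframe_paths: List[str]) -> List[Dict]:
    """Attach the same reference set to every chapter's generation request."""
    refs = build_reference_set(keyframe_paths)
    return [{"prompt": prompt, "reference_images": refs} for prompt in chapter_prompts]
```

Reusing one list for all chapters is what keeps the look consistent: every generation call sees the same anchors.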
Quality Presets
| Tier | B-Roll Model | Avatar | Lipsync | Est. Cost (10min) |
|---|---|---|---|---|
| `free` | skip (stock) | skip | skip | $0 |
| `budget` | Veo 3.1 Fast | Kling Avatar | FAL Lipsync | ~$2 |
| `standard` | Veo 3.1 Fast | Kling Avatar | FAL Lipsync | ~$5 |
| `premium` | Seedance 2.0 | Seedance 2.0 | native | ~$8 |
| `ultra` | Seedance 2.0 | Seedance 2.0 | native | ~$12 |
Cost Estimation
Use --dry-run to see a cost breakdown before committing:

```bash
fab video/long-form "SpaceX" --quality premium --duration 10min --dry-run
```

Output shows per-component costs: TTS, b-roll clips, transitions, BGM, thumbnail.
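The same estimate can probably be requested through the Python SDK by setting the `dry_run` input field. The sketch below assumes that a dry run fills in `output["cost_estimate"]`, which the Output Reference lists but does not tie explicitly to dry runs. The helper `estimate_cost` is hypothetical and accepts any client exposing `run_workflow` and `wait_for_run`, as `FabricClient` does in the Quick Start example.

```python
from typing import Any, Dict


def estimate_cost(client: Any, topic: str, quality: str = "premium",
                  duration_target: str = "10min") -> Dict[str, Any]:
    """Submit a dry run and return the cost_estimate object from its output."""
    run = client.run_workflow("video/long-form", input={
        "topic": topic,
        "quality": quality,
        "duration_target": duration_target,
        "dry_run": True,  # estimate cost only; no video is produced
    })
    result = client.wait_for_run(run["id"])
    # Assumption: a dry run populates output["cost_estimate"].
    return result["output"]["cost_estimate"]


# Usage, assuming the SDK from the Quick Start section is installed:
#   from fabric_platform import FabricClient
#   print(estimate_cost(FabricClient(), "SpaceX"))
```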
YouTube Integration
The pipeline generates YouTube-ready metadata:
- Chapter timestamps — derived from act boundaries, formatted for YouTube descriptions
- SEO title — under 60 characters, optimized for search
- Description — summary + chapter timestamps + related video suggestions
- Tags — 15-20 SEO-optimized tags
- Thumbnail — 1280x720 with hook text overlay
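Deriving chapter timestamps from act boundaries amounts to a running sum of act durations. The sketch below shows the idea; the `(title, duration_seconds)` shape for acts and both function names are assumptions, not the pipeline's actual script schema. YouTube expects the first chapter to start at 0:00, which the running sum gives for free.

```python
from typing import List, Tuple


def format_timestamp(total_seconds: int) -> str:
    """Format seconds as M:SS, or H:MM:SS once the video passes one hour."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


def chapter_lines(acts: List[Tuple[str, int]]) -> List[str]:
    """Turn (title, duration_seconds) pairs into description lines."""
    lines, start = [], 0
    for title, duration in acts:
        lines.append(f"{format_timestamp(start)} {title}")
        start += duration  # the next act begins where this one ends
    return lines


acts = [("Cold Open", 45), ("The Rise", 420), ("The Fall", 3300), ("Aftermath", 300)]
print("\n".join(chapter_lines(acts)))
# 0:00 Cold Open
# 0:45 The Rise
# 7:45 The Fall
# 1:02:45 Aftermath
```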
Voice Continuity
The voice ID is locked after the first chapter generates successfully. All subsequent chapters use the exact same voice, preventing drift across a multi-chapter video.
You can also provide your own voice via:
- `--voice <voice_id>`: use a specific TTS voice ID
- `--voice-sample <path>`: clone from an audio sample (ElevenLabs or FAL Chatterbox)
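The locking behavior can be sketched as follows. Everything here is illustrative: `narrate_chapters`, the `synthesize(text, voice_id)` callback, and the idea of trying several candidate voices until one succeeds are assumptions, since the documentation only states that the voice is fixed once the first chapter succeeds.

```python
from typing import Any, Callable, List, Optional, Tuple


def narrate_chapters(chapters: List[str],
                     synthesize: Callable[[str, str], Any],
                     candidate_voices: List[str]) -> Tuple[Optional[str], List[Any]]:
    """Synthesize every chapter, locking the first voice that succeeds."""
    locked: Optional[str] = None
    audio: List[Any] = []
    for text in chapters:
        if locked is not None:
            # Voice already locked: never switch, even if another voice exists.
            audio.append(synthesize(text, locked))
            continue
        for voice in candidate_voices:
            try:
                audio.append(synthesize(text, voice))
            except RuntimeError:
                continue  # hypothetical failure signal; try the next candidate
            locked = voice
            break
        else:
            raise RuntimeError("no candidate voice succeeded")
    return locked, audio
```

Passing a single entry in `candidate_voices` corresponds to supplying an explicit voice ID up front.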
Prompt Customization
All LLM prompts are stored as external template files in prompts/long_form/. Override any template by placing a file at ~/.fabric/prompts/long_form/{name}.txt.
Available templates: script_generation, synopsis_expansion, script_parsing, retention_scoring, structure_detection, youtube_metadata, seedance_cinematic.
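The override lookup described above behaves like a two-step search: user directory first, bundled directory second. The sketch below illustrates that order. The function name `resolve_template` is hypothetical, and it assumes bundled templates also use the `.txt` extension, which the text only confirms for overrides.

```python
from pathlib import Path
from typing import Optional

USER_DIR = Path.home() / ".fabric" / "prompts" / "long_form"
BUILTIN_DIR = Path("prompts") / "long_form"


def resolve_template(name: str, user_dir: Path = USER_DIR,
                     builtin_dir: Path = BUILTIN_DIR) -> Optional[Path]:
    """Return the user override if it exists, else the bundled template."""
    # Assumption: bundled templates are named {name}.txt like the overrides.
    for directory in (user_dir, builtin_dir):
        candidate = directory / f"{name}.txt"
        if candidate.is_file():
            return candidate
    return None  # unknown template name


path = resolve_template("youtube_metadata")
print(path if path else "template not found")
```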
Input Reference
| Field | Type | Default | Description |
|---|---|---|---|
| `topic` | string | - | Video topic (triggers research mode) |
| `story` | string | - | Full script or synopsis |
| `structure` | enum | auto | Narrative archetype |
| `duration_target` | string | 15min | Target duration |
| `quality` | enum | standard | Cost tier (free/budget/standard/premium/ultra) |
| `platform` | string | YouTube | Target platform |
| `mood` | string | auto | Emotional tone |
| `audience` | string | general | Target audience |
| `research_depth` | int | 5 | Research depth 1-10 |
| `aspect_ratio` | enum | 16:9 | Video aspect ratio |
| `include_talking_heads` | bool | false | Include AI presenter |
| `presenter_look` | string | - | Actor description |
| `visual_style` | string | - | Visual direction |
| `broll_density` | enum | medium | B-roll vs presenter ratio |
| `voice_style` | enum | - | Voice preset |
| `voice_id` | string | - | Explicit TTS voice ID |
| `voice_sample_path` | string | - | Audio sample for cloning |
| `subtitle_style` | string | - | Subtitle styling |
| `intro_style` | enum | cold_open | Intro sequence type |
| `outro_strategy` | enum | subscribe_tease | CTA strategy |
| `greenscreen_footage_url` | string | - | Green screen footage URL for AI background replacement |
| `scene_layers` | array | - | Visual elements to add: [{url, role, placement, description}] |
| `dry_run` | bool | false | Estimate cost only |
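Before submitting a run, a client-side check against this table can catch typos early. The sketch below is a hypothetical helper, not part of the SDK: it covers only the enum values this page actually lists plus the `research_depth` range, and it assumes that at least one of `topic` or `story` is required.

```python
from typing import Any, Dict, List

# Allowed values for the enum fields whose options are listed on this page.
ENUMS = {
    "structure": {"documentary", "hero_journey", "true_crime", "listicle",
                  "problem_solution", "explainer", "auto"},
    "quality": {"free", "budget", "standard", "premium", "ultra"},
    "aspect_ratio": {"16:9", "9:16"},
    "broll_density": {"low", "medium", "high", "all_broll"},
}


def validate_input(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    # Assumption: the workflow needs a topic, a story, or both.
    if not payload.get("topic") and not payload.get("story"):
        errors.append("provide topic, story, or both")
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            errors.append(f"{field}: {payload[field]!r} is not one of {sorted(allowed)}")
    depth = payload.get("research_depth", 5)
    if not isinstance(depth, int) or not 1 <= depth <= 10:
        errors.append("research_depth must be an integer from 1 to 10")
    return errors


print(validate_input({"topic": "SpaceX", "quality": "premium"}))  # []
```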
Output Reference
| Field | Type | Description |
|---|---|---|
| `video_path` | string | Path to final assembled video |
| `script` | object | Full structured script with acts |
| `chapter_paths` | string[] | Per-chapter video paths |
| `thumbnail_path` | string | Generated thumbnail |
| `youtube_metadata` | object | Title, description, chapters, tags |
| `word_count` | int | Total script words |
| `estimated_duration_min` | float | Estimated duration |
| `cost_estimate` | object | Per-component cost breakdown |
VFX Compositing
The long-form pipeline supports two VFX modes that activate via input parameters:
Green screen compositing: provide greenscreen_footage_url and the pipeline automatically generates per-act backgrounds from the script’s visual cues, then composites the footage onto each background while preserving camera motion. It falls back to ffmpeg chroma-key when Seedance is unavailable.

```bash
fab video/long-form "History of Cinema" \
  --quality premium \
  --greenscreen-footage-url "https://cdn.example.com/presenter.mp4"
```

Layer-by-layer scene editing: provide scene_layers to add elements (vehicles, weather, objects) into every chapter. Layers are applied sequentially after chapter production for highest quality.

```bash
fab video/long-form "Future Cities" \
  --quality premium \
  --scene-layers '[{"url": "https://example.com/drone.jpg", "role": "vehicle", "placement": "sky", "description": "delivery drone"}]'
```

See the VFX Compositing guide for full documentation, SDK examples, and cost breakdown.
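Hand-writing the JSON for `--scene-layers` is error-prone once quoting is involved, so a script can build the argument instead. The sketch below uses only the Python standard library. The helper name `scene_layers_command` is hypothetical, while the flag and the layer keys (`url`, `role`, `placement`, `description`) are taken from the example above.

```python
import json
import shlex
from typing import Dict, List


def scene_layers_command(topic: str, layers: List[Dict[str, str]],
                         quality: str = "premium") -> str:
    """Build a shell-safe fab command line with JSON-encoded scene layers."""
    args = [
        "fab", "video/long-form", topic,
        "--quality", quality,
        "--scene-layers", json.dumps(layers),
    ]
    return shlex.join(args)  # quotes each argument for a POSIX shell


layers = [{
    "url": "https://example.com/drone.jpg",
    "role": "vehicle",
    "placement": "sky",
    "description": "delivery drone",
}]
print(scene_layers_command("Future Cities", layers))
```

When launching the command from Python rather than a shell, passing the argument list directly to `subprocess.run` avoids the quoting step altogether.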