Video Production Workflows

Video production workflows handle standalone video generation tasks — face swapping, talking-head avatars, motion transfer, slideshows, Reddit story videos, and formula-driven style cloning. For composed multi-stage pipelines (research → video, hot topics → video), see Composed Pipelines.

All video workflows support fully local execution with no external API calls. Use quality=local or quality=local-power to run entirely on-device.
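
For example, an entirely on-device Reddit stories run using the local preset (Kokoro TTS and a local Qwen script model, per the tables below):

fab-workflow video/reddit-stories \
  --input reddit_url="https://reddit.com/r/..." \
  --input quality=local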

Video generation:

| Model | VRAM | Quality | Speed | Notes |
| --- | --- | --- | --- | --- |
| wan:1.3b | 8 GB | Good | Fast | Default for local preset |
| wan:14b | 24 GB | High | Moderate | WAN 2.1 |
| wan2.2:14b | 24 GB | Best | Moderate | WAN 2.2 (MoE), default for local-power |
| ltx-video:2.3 | 8 GB | Good | Fast | LTX-2.3 distilled |
| ltx-video:2 | 8 GB | Good | Fast | LTX-2 |
| hunyuanvideo | 24 GB | High | Moderate | CUDA only (Tencent) |
| open-sora | 16 GB | Good | Moderate | CUDA only (HPC-AI Tech) |
| cogvideox:2b | 6 GB | Decent | Slow | CUDA only |
| cogvideox:5b | 12 GB | Good | Slow | CUDA only |

Text-to-speech:

| Model | Quality | Voice Cloning | Notes |
| --- | --- | --- | --- |
| kokoro | High | No | Default local TTS, 6 voices (3M/3F) |
| chatterbox | High | Yes | Zero-shot voice cloning from single sample |
| voxtral | High | No | Apple Silicon only (MLX) |
| piper | Decent | No | Lightweight, CPU-friendly |

Background music:

| Model | VRAM | Quality | Notes |
| --- | --- | --- | --- |
| musicgen-small | 4 GB | Decent | Default for local preset |
| musicgen-medium | 8 GB | Good | Default for local-power |
| musicgen-large | 12 GB | High | Best MusicGen quality |
| stable-audio-open | 6 GB | High | Stable Audio Open 1.0 via diffusers |

Script generation (LLM):

| Model | VRAM | Quality | Notes |
| --- | --- | --- | --- |
| qwen3:8b | 5 GB | Good | Default for local preset |
| qwen3:14b | 9 GB | High | Default for local-power |
| qwen3:latest | 19 GB | Best | 32B Q4, for high-end hardware |
| gemma3:4b | 3 GB | Decent | Lightweight, for local-light |

Override any model per-run: --input broll_model="wan2.2:14b" or --input bgm_model="stable-audio-open".
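
For example, a local slideshow run that swaps the default MusicGen model for Stable Audio Open:

fab-workflow video/slideshow \
  --input topic="5 AI tools that replaced my entire workflow" \
  --input quality=local \
  --input bgm_model="stable-audio-open"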


Workflow: video/face-swap

Swaps a persona face onto a source image or video using FAL, then uploads the result as a Fabric asset.

# Direct URLs
fab-workflow video/face-swap \
-i source_url=https://example.com/dance-still.jpg \
-i target_url=https://example.com/persona-face.png \
-v
# From org gallery
fab-workflow video/face-swap \
-i source_url=https://example.com/dance-still.jpg \
-i persona_gallery_id=YOUR_GALLERY_ID \
-i persona_index=0

Pipeline: resolve_inputs → swap_face (FAL) → upload_result

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_url | string | required | Base image/video to swap the face into |
| target_url | string | | Direct URL to the persona face image |
| persona_gallery_id | string | | Org gallery ID to pull the face from |
| persona_index | int | 0 | Which gallery item to use |

One of target_url or persona_gallery_id is required.

Output:

{
  "output_path": "/tmp/face-swap-result.png",
  "output_asset_id": "018f...",
  "output_url": "https://...",
  "workflow": "face-swap"
}

Workflow: video/avatar

Generates a single lip-synced talking-head video from a portrait image and audio clip. Uses the same provider dispatch as the AI Shorts pipeline, but is exposed as a standalone workflow.

# Local files
fab-workflow video/avatar \
--input actor.path=./portrait.png \
--input audio.path=./voiceover.mp3 \
--input prompt="Person speaking to camera, warm lighting"
# With a specific model
fab-workflow video/avatar \
--input actor.path=./portrait.png \
--input audio.path=./voiceover.mp3 \
--input avatar_model="seedance/2.0"

Pipeline: render_avatar (resolve refs → generate talking head → persist)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| actor | AssetRef | required | Portrait image (PNG/JPG); face must be clearly visible |
| audio | AssetRef | required | Audio clip to lip-sync (MP3/WAV) |
| prompt | string | "Person speaking directly to camera..." | Stylistic prompt for avatar generation |
| avatar_model | string | "fal-ai/kling-video/ai-avatar/v2/standard" | Provider/model ID |

AssetRef accepts Fabric asset IDs, HTTP(S) URLs, or local file paths.
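
For instance, mixing reference forms, with the caveat that the bare asset-ID form is an assumption (only the .path form appears in the examples on this page):

# A sketch: actor passed as a Fabric asset ID (the bare actor=<id> form
# is an assumption; adjust to your AssetRef syntax if needed)
fab-workflow video/avatar \
  --input actor=018f... \
  --input audio.path=./voiceover.mp3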

Avatar models:

| Model | Style | Notes |
| --- | --- | --- |
| fal-ai/kling-video/ai-avatar/v2/standard | Lip-sync + head movement | Default, good quality |
| seedance/2.0 | Full-body animation | Higher quality, slower |
| bytedance/omnihuman* | Research-grade | Experimental |

Output:

{
  "video_path": "/tmp/avatar.mp4",
  "asset_id": "018f...",
  "avatar_model": "fal-ai/kling-video/ai-avatar/v2/standard"
}

Workflow: video/talking-head

Generates a complete talking-head video from just a topic or script — no pre-existing audio or portrait required. Handles script generation, actor portrait creation, optional voice cloning, TTS, and avatar rendering in a single pipeline. Unlike video/avatar (which requires pre-made audio + portrait), this workflow is fully end-to-end.

# Minimal — just a topic
fab-workflow video/talking-head \
--input topic="3 tips for better sleep"
# With explicit script and voice style
fab-workflow video/talking-head \
--input script_text="Hey everyone, here are three tips for sleeping better..." \
--input voice_style="warm female" \
--input presenter_look="professional woman in her 30s, dark hair"
# With voice cloning from a sample
fab-workflow video/talking-head \
--input topic="3 tips for better sleep" \
--input voice_sample.path=./my-voice-sample.mp3
# With a pre-existing actor portrait
fab-workflow video/talking-head \
--input topic="3 tips for better sleep" \
--input actor.path=./portrait.png \
--input voice_id="TxGEqnHWrfWFTfGW9XjX"

Pipeline:

prepare_script
      ┌──────┴──────┐
resolve_actor   resolve_voice
      └──────┬──────┘
generate_tts → render_talking_head → finalize_output

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| topic | string | | Video topic (auto-generates script); required when script_text is empty |
| script_text | string | | Explicit narration text (skips script generation) |
| actor | AssetRef | | Actor portrait image; auto-generated when omitted |
| presenter_look | string | "" | Actor appearance for AI generation (ignored when actor is set) |
| voice_sample | AssetRef | | Audio sample for voice cloning (MP3/WAV) |
| voice_id | string | | Explicit TTS voice ID (overrides voice_sample and auto-selection) |
| voice_style | string | "" | Descriptive voice style (e.g. "warm female", "deep narrator") |
| avatar_model | string | "fal-ai/kling-video/ai-avatar/v2/standard" | Avatar model override |
| duration_secs | int | 30 | Target duration when generating a script |
| language | string | "en" | Script and TTS language |
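
For example, a Spanish-language run with a longer target duration, using only parameters from the table above:

fab-workflow video/talking-head \
  --input topic="3 consejos para dormir mejor" \
  --input language="es" \
  --input duration_secs=45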

Output:

{
  "video_path": "/tmp/talking-head.mp4",
  "asset_id": "018f...",
  "script_text": "Hey everyone, here are three tips...",
  "voice_id": "TxGEqnHWrfWFTfGW9XjX",
  "avatar_model": "fal-ai/kling-video/ai-avatar/v2/standard"
}

Workflow: video/motion-transfer

Animates a character image using a reference video’s motion (dance, gestures, expressions). The character performs the same movements as the driving video.

# Direct URLs
fab-workflow video/motion-transfer \
-i source_image_url=https://example.com/persona.png \
-i driving_video_url=https://example.com/dance-reference.mp4
# With Seedance (full-body, higher quality)
fab-workflow video/motion-transfer \
-i source_image_url=https://example.com/persona.png \
-i driving_video_url=https://example.com/dance-reference.mp4 \
-i motion_model="seedance/2.0"
# From org gallery
fab-workflow video/motion-transfer \
-i persona_gallery_id=YOUR_GALLERY_ID \
-i driving_video_url=https://example.com/dance.mp4

Pipeline: resolve_inputs → transfer_motion (FAL/Seedance) → upload_result

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_image_url | string | | Character image URL |
| driving_video_url | string | required | Motion reference video URL |
| persona_gallery_id | string | | Org gallery ID to pull the character from |
| persona_index | int | 0 | Which gallery item to use |
| motion_model | string | "fal-ai/live-portrait" | Provider/model for motion transfer |

One of source_image_url or persona_gallery_id is required.

Output:

{
  "output_path": "/tmp/motion-transfer-result.mp4",
  "output_asset_id": "018f...",
  "output_url": "https://...",
  "workflow": "motion-transfer"
}

Workflow: video/slideshow

Generates listicle carousels with three outputs from a single run: individual PNG slides (for native carousel upload), a stitched MP4 video, and a multi-page PDF (for LinkedIn document carousels). Includes cross-run topic deduplication.

# Basic
fab-workflow video/slideshow \
--input topic="5 AI tools that replaced my entire workflow" \
--input quality=standard
# Full control
fab-workflow video/slideshow \
--input topic="5 AI tools that replaced my entire workflow" \
--input niche="productivity" \
--input quality=premium \
--input slide_layout=numbered \
--input mixed_media=true \
--input num_items=7

Pipeline:

generate_listicle
      ┌──────────────┼──────────────┬──────────────┐
source_images   generate_voiceover   generate_bgm   generate_video_hook
      └──────────────┼──────────────┴──────────────┘
merge_slideshow_assets → compose_slides → export_pngs →
assemble_video → generate_pdf → platform_metadata

Quality presets:

| Preset | Images | TTS | Music | Script Model |
| --- | --- | --- | --- | --- |
| free | Stock | Piper (local) | Skip | qwen3:8b (local) |
| budget | AI-generated | FAL Kokoro | Stable Audio | gemini-2.5-flash |
| standard | AI-generated | ElevenLabs | Stable Audio | gemini-2.5-flash |
| premium | AI-generated | ElevenLabs | Stable Audio | gemini-2.5-flash |
| local | Stock | Kokoro (local) | MusicGen (local) | qwen3:8b (local) |

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| topic | string | required | Listicle topic |
| niche | string | "" | Content niche |
| num_items | int | 5 | Number of list items |
| quality | string | "standard" | Quality preset |
| slide_layout | string | "overlay_gradient" | Layout: overlay_gradient, numbered, card |
| slide_duration | float | 3.5 | Seconds per slide in the video |
| transition_duration | float | 0.5 | Transition time between slides |
| mixed_media | bool | false | Generate a video clip for the hook slide (Instagram mixed-media carousel) |
| style_hint | string | "" | Visual style hint for image sourcing |
| brand_kit | dict | {} | Brand colors, fonts, logo |
| language | string | "en" | Language for script and voice |
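
The brand_kit dict can be supplied inline. A sketch, assuming the CLI accepts JSON for dict inputs; the key names here are illustrative, so match them to your brand kit schema:

fab-workflow video/slideshow \
  --input topic="5 AI tools that replaced my entire workflow" \
  --input slide_layout=card \
  --input brand_kit='{"primary_color": "#1a1a2e", "font": "Inter"}'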

Output:

{
  "output": {
    "slide_png_paths": ["/tmp/slide_01_hook.png", "/tmp/slide_02_item.png", "..."],
    "video_path": "/tmp/slideshow.mp4",
    "pdf_path": "/tmp/slideshow.pdf",
    "hook_video_path": "/tmp/hook_video.mp4",
    "platform_metadata": {
      "tiktok": {"format": "carousel", "description": "...", "hashtags": ["..."]},
      "instagram": {"format": "mixed_media_carousel", "caption": "..."},
      "youtube_shorts": {"format": "video", "title": "..."},
      "linkedin": {"format": "document", "post_copy": "..."}
    },
    "topic": "5 AI tools that replaced my entire workflow",
    "hook_text": "I deleted 12 apps last month. These 5 replaced all of them.",
    "slide_count": 7
  }
}

Workflow: video/reddit-stories

Converts Reddit posts into narrated videos with screenshot overlays on background video — the classic Reddit story video format. Two modes: comments (top comments as individual segments) and story (post body paginated into segments).

# From a specific Reddit post
fab-workflow video/reddit-stories \
--input reddit_url="https://reddit.com/r/AskReddit/comments/..." \
--input mode=comments \
--input quality=standard
# Search a subreddit for a topic
fab-workflow video/reddit-stories \
--input subreddit=tifu \
--input topic="worst cooking disaster" \
--input mode=story
# Local-only (free TTS, no background video)
fab-workflow video/reddit-stories \
--input reddit_url="https://reddit.com/r/..." \
--input quality=free
Quality presets:

| Preset | TTS | Background Video | Screenshots | Subtitles | Voice Mode |
| --- | --- | --- | --- | --- | --- |
| free | Piper (local) | None | 1x scale | No | Single |
| budget | Kokoro (local) | Random | 2x scale | Yes | Alternating |
| standard | ElevenLabs | Random | 2x scale | Yes | Alternating |
| local | Kokoro (local) | Random | 1x scale | Yes | Alternating |

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| reddit_url | string | | Direct URL to a Reddit post |
| subreddit | string | | Subreddit to search in |
| topic | string | | Search query within subreddit |
| mode | string | "comments" | "comments" or "story" |
| max_comments | int | 6 | Max comments to include |
| min_score | int | 10 | Min upvote score for comments |
| min_length | int | 100 | Min comment length (chars) |
| max_length | int | 600 | Max comment length (chars) |
| theme | string | "dark" | Screenshot theme: dark, light, transparent |
| overlay_opacity | float | 1.0 | Screenshot overlay opacity (0.0–1.0) |
| tts_model | string | preset | TTS model override |
| voice_mode | string | preset | "single", "alternating", or "random" |
| background_video | string | preset | Video ID, category, "random", or "none" |
| quality | string | "standard" | Quality preset |
| platform | string | "youtube_shorts" | Target platform |
| language | string | "en" | Target language |

At least one of reddit_url, subreddit, or topic is required.
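
For example, a run that loosens the comment filters and softens the screenshot overlay, using only parameters from the table above:

fab-workflow video/reddit-stories \
  --input reddit_url="https://reddit.com/r/..." \
  --input max_comments=10 \
  --input min_score=50 \
  --input theme=light \
  --input overlay_opacity=0.85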


Workflow: global/formula-shorts

Loads a VideoFormula (extracted style DNA from reference videos) and maps its constraints — hook type, segment structure, camera work, color grade, subtitle style, voice, music — onto the AI Shorts pipeline. The formula acts as a style template; the topic is new.

# Named formula (from ~/.fabric/formulas/)
fab-workflow global/formula-shorts \
--input formula="my-style" \
--input topic="Why cold showers boost productivity"
# Formula file path
fab-workflow global/formula-shorts \
--input formula="/path/to/formula.json" \
--input topic="The psychology of procrastination"

Pipeline: load_formula → apply_formula_inputs → [ai_shorts_pipeline]

The apply_formula_inputs task maps formula fields to AI Shorts overrides:

| Formula Section | Mapped To |
| --- | --- |
| script.hook_type | Hook style constraint in script generation |
| script.tone | mood override |
| script.structure.segment_types | Segment arc (e.g., hook → context → insight → CTA) |
| visuals.overall_aesthetic | visual_style with color palette |
| visuals.camera_work | Camera style constraints in b-roll prompts |
| audio.voice | Voice gender and style overrides |
| audio.music | Music mood and volume |
| effects.subtitles | Subtitle highlight/inactive/outline colors, position, size |
| effects.color_grade | FFmpeg filter hint and effects style |
| effects.hook_overlay | Hook text overlay style |
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| formula | string or dict | required | Formula name, file path, or inline dict |
| topic | string | required | Video topic (passed through to AI Shorts) |

Any AI Shorts input can also be passed; explicit values override formula defaults.
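
For example, pinning the local b-roll model from the override note earlier on this page while keeping the formula's style constraints:

fab-workflow global/formula-shorts \
  --input formula="my-style" \
  --input topic="Why cold showers boost productivity" \
  --input broll_model="wan2.2:14b"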

Workflow: video/recreate-from-format

Takes format DNA and a recreation blueprint from research/reference-format-analysis and generates an original reel plan — script, shotlist, overlay captions, and edit recipe — that follows the same structural format without copying source content. Supports generating N variants per run.

# Generate a single variant
fab-workflow video/recreate-from-format \
-i format_dna_json_path="output/analysis/format_dna.json" \
-i recreation_blueprint_json_path="output/analysis/recreation_blueprint.json" \
-i topic="5 morning habits for productivity" \
-i creator_persona="confident lifestyle coach" \
-i tone="energetic" \
-i platform="instagram_reels" \
-o output/recreation/
# Generate 3 variants with TTS
fab-workflow video/recreate-from-format \
-i format_dna_json_path="output/analysis/format_dna.json" \
-i recreation_blueprint_json_path="output/analysis/recreation_blueprint.json" \
-i topic="Why cold showers boost focus" \
-i creator_persona="health-focused creator, calm authority" \
-i tone="informative" \
-i platform="tiktok" \
-i variants=3 \
-i include_tts_audio=true \
-o output/recreation/

Pipeline: load_and_extract_constraints → generate_scripts → generate_shotlists_and_overlays → generate_edit_recipes → emit_recreation_artifacts

Each variant gets its own directory (variant_0/, variant_1/, …) containing:

  • script.json — structured script with per-segment narration, visual intent, emotional intensity
  • script.md — human-readable formatted script
  • shotlist.json — beat-by-beat shot plan with framing, motion, composition type
  • overlay_captions.json — timed on-screen captions matching the reference format’s overlay style
  • edit_recipe.json — machine-readable edit plan with transitions, pauses, music guidance, emphasis points
  • voiceover.txt — narration text (if voiceover enabled)
  • voiceover.wav — TTS audio (if TTS enabled)
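
To sanity-check a run, list a variant's artifacts and pretty-print the edit plan (paths assume the -o output/recreation/ flag used in the examples above; jq is optional):

ls output/recreation/variant_0/
jq . output/recreation/variant_0/edit_recipe.json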

The end-to-end flow, from reference reel to recreation artifacts:

reference_reel.mp4
  ↓
research/reference-format-analysis
  ├── format_dna.json
  ├── recreation_blueprint.json
  └── summary.md
  ↓
video/recreate-from-format
  ├── variant_0/script.json
  ├── variant_0/shotlist.json
  ├── variant_0/overlay_captions.json
  ├── variant_0/edit_recipe.json
  └── recreation_manifest.json