
video/ai-shorts

AI Shorts pipeline — generate production-quality short-form video from a topic.

Category: video
Source: workflows/video/ai_shorts.py

**Input fields**

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `actor_image_url` | object | — | Public URL to a portrait image for the talking-head presenter. When provided, generates a talking-head avatar video with lip sync. When omitted, generates a faceless motion-graphics style short. |
| `audience` | string | `"general"` | Target audience persona: 'general', 'teens', 'professionals', 'technical', 'kids' |
| `duration_seconds` | integer | `45` | Target duration in seconds (15–120) |
| `duration_secs` | object | — | Alias for `duration_seconds` (set by `normalize_input`) |
| `hook` | string | `""` | Specific hook line (auto-generated if empty) |
| `mood` | string | `""` | Direct mood override (bypasses style mapping) |
| `platform` | string | `"TikTok"` | Target platform (affects pacing and framing) |
| `platforms` | string[] | `[]` | Platform slugs for export variants: 'tiktok', 'instagram', 'youtube_shorts', 'youtube', 'linkedin'. When set, produces resized + metadata-enriched exports per platform in the output. |
| `presenter_look` | string | `""` | AI actor appearance description |
| `quality` | string | `"fast-local"` | Quality preset. Default 'fast-local' keeps the script step on remote Gemini (cheap + fast) but pushes voiceover, b-roll, and music to local backends — saves ~$0.05–$0.15 per run when local backends are installed, falls through to remote per-stage when they're not. Other values: 'local' (fully local), 'local-light' (minimal local, skips video), 'local-power' (heavy local), 'fast-local-avatar', 'premium-local', 'fast', 'free', 'budget', 'cheap', 'standard', 'premium', 'ultra'. Pass an empty string to bypass the preset and let per-stage model resolution decide each model independently. |
| `regenerate` | object | — | When set, this run is a regeneration. Workflows may read `direction` / `keep` / `extra_instructions` to modulate prompts; the engine persists `parent_run_id` and `parent_variant_index` as run lineage columns. |
| `style` | string | `"educational"` | Video style: 'educational', 'promotional', 'storytelling', 'listicle' |
| `topic` | string | *required* | The subject of the short video |
| `variants` | integer | `1` | Number of independent variant executions (1–10). When > 1, the engine runs the workflow N times with different sampling, producing N outputs. |
| `visual_style` | string | `""` | Override visual aesthetic (e.g. 'neon cyberpunk') |
| `voice` | string | `"alloy"` | TTS voice style for narration (e.g. 'narrator', 'warm', 'energetic-female') |
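Putting the fields together, a run-spec for a talking-head short exported to two platforms might look like this (an illustrative sketch — the portrait URL is a placeholder; field names and defaults are taken from the table above):

```yaml
workflow: video/ai-shorts
input:
  topic: "How lithium-ion batteries degrade"
  # Placeholder URL — replace with a real public portrait image
  actor_image_url: "https://example.com/presenter.jpg"
  duration_seconds: 30
  platforms: ["tiktok", "youtube_shorts"]
  quality: fast-local
```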
**Output fields**

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `audio_asset_id` | object | — | Fabric asset ID of the TTS audio |
| `evaluation` | object | — | Engagement evaluation: `{overall_score: number, ready_to_publish: boolean}` |
| `kind` | string | `"video"` | Card kind for consumers — always 'video'. |
| `platform_exports` | object | — | Platform-specific video exports with metadata (when `platforms` is set) |
| `script` | string | *required* | The narration script text |
| `tags` | string[] | `[]` | Hashtags/keywords |
| `title` | string | *required* | Generated title for the short |
| `video_asset_id` | object | — | Fabric asset ID of the generated video |
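Based on the output field table, a result card might look roughly like the sketch below (all values are illustrative placeholders inferred from the field types, not real output):

```yaml
kind: video
title: "Why Your Phone Battery Dies Faster Every Year"
script: "Ever noticed your phone dying by lunchtime? Here's what's really happening..."
tags: ["battery", "tech", "science"]
video_asset_id: asset_abc123   # placeholder asset ID
audio_asset_id: asset_def456   # placeholder asset ID
evaluation:
  overall_score: 8.2
  ready_to_publish: true
```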
**Pipeline**

```
normalize_input → generate_script → generate_keyframes → resolve_character_refs → merge_pre_production → reconcile_continuity → multiview_character_prep → generate_ai_actor → generate_voiceover → generate_all_broll → generate_bgm → merge_generation → generate_talking_heads → lipsync_talking_heads → mix_audio → compose_timeline → prepare_for_post_processing → burn_subtitles → burn_hook_overlay_safe → generate_effect_filters → apply_effects → add_outro_fade → ai-shorts-output → collect_final_output → export_for_platforms → evaluate_output → collect_ai_shorts_output
```
| Task | Description |
| --- | --- |
| `normalize_input` | Normalize simplified input params to the pipeline's internal keys. |
| `generate_script` | Generate viral script — delegates to SDK stage. |
| `generate_keyframes` | Generate 2x2 grid keyframes for visual consistency across b-roll segments. |
| `resolve_character_refs` | Normalize `character_refs` (URL/path/asset_id) → resolved URLs. |
| `merge_pre_production` | Merge parallel pre-production branches (keyframes + character refs). |
| `reconcile_continuity` | Vision-based continuity reconciliation (opt-in). |
| `multiview_character_prep` | Generate multi-view character references for stronger visual consistency. |
| `generate_ai_actor` | Generate AI actor portrait — delegates to SDK stage. |
| `generate_voiceover` | Generate TTS voiceover — delegates to SDK stage. |
| `generate_all_broll` | Generate all b-roll clips — delegates to SDK stage. |
| `generate_bgm` | Generate background music — delegates to SDK stage. |
| `merge_generation` | Merge parallel generation branches into a single context. |
| `generate_talking_heads` | Generate talking heads — delegates to SDK stage. |
| `lipsync_talking_heads` | Lip-sync talking heads — delegates to SDK stage. |
| `mix_audio` | Mix voiceover + BGM — delegates to SDK stage. |
| `compose_timeline` | Assemble video timeline — delegates to SDK stage. |
| `prepare_for_post_processing` | Map keys for post-processing — delegates to SDK stage. |
| `burn_subtitles` | Burn word-level subtitles — delegates to SDK stage. |
| `burn_hook_overlay_safe` | Burn hook text overlay — delegates to SDK stage. |
| `generate_effect_filters` | Use Gemini to suggest FFmpeg filters based on content type. |
| `apply_effects` | Apply AI-generated FFmpeg filters to the video. |
| `add_outro_fade` | Add a fade-to-black outro at the end of the video. |
| `ai-shorts-output` | Video output validation: `ai-shorts-output` |
| `collect_final_output` | Collect final output — delegates to SDK stage. |
| `export_for_platforms` | Conditionally resize + generate metadata for target platforms. |
| `evaluate_output` | Score the generated short for engagement potential. |
| `collect_ai_shorts_output` | Extract the `AiShortsOutput` fields from the pipeline result. |

Save the YAML below as `my-run.yaml`, edit the values, and run it with the CLI or POST it to the API. Required fields are uncommented; optional knobs are documented above the `input:` block — copy any line under `input:` and uncomment it to set a value.

```yaml
workflow: video/ai-shorts
# Optional fields — copy any line(s) under `input:` and uncomment to set:
# Public URL to a portrait image for the talking-head presenter. When provided, generates a talking-head avatar video with lip sync. When omitted, generates a faceless motion-graphics style short.
# actor_image_url: null
#
# Target audience persona: 'general', 'teens', 'professionals', 'technical', 'kids'
# audience: general
#
# Target duration in seconds (15-120)
# [min=15, max=120]
# duration_seconds: 45
#
# Alias for duration_seconds (set by normalize_input)
# duration_secs: null
#
# Specific hook line (auto-generated if empty)
# hook: ""
#
# Direct mood override (bypasses style mapping)
# mood: ""
#
# Target platform (affects pacing and framing)
# platform: TikTok
#
# Platform slugs for export variants: 'tiktok', 'instagram', 'youtube_shorts', 'youtube', 'linkedin'. When set, produces resized + metadata-enriched exports per platform in the output.
# platforms: []
#
# AI actor appearance description
# presenter_look: ""
#
# Quality preset. Default 'fast-local' keeps the script step on remote Gemini (cheap + fast) but pushes voiceover, b-roll, and music to local backends — saves ~$0.05–$0.15 per run when local backends are installed, falls through to remote per-stage when they're not. Other values: 'local' (fully local), 'local-light' (minimal local, skips video), 'local-power' (heavy local), 'fast-local-avatar', 'premium-local', 'fast', 'free', 'budget', 'cheap', 'standard', 'premium', 'ultra'. Pass an empty string to bypass the preset and let per-stage model resolution decide each model independently.
# quality: fast-local
#
# Video style: 'educational', 'promotional', 'storytelling', 'listicle'
# style: educational
#
# Override visual aesthetic (e.g. 'neon cyberpunk')
# visual_style: ""
#
# TTS voice style for narration (e.g. 'narrator', 'warm', 'energetic-female')
# voice: alloy
#
input:
  # The subject of the short video
  topic: ""
```

Run it locally:

```sh
fab-workflow --from-file my-run.yaml
```

Or submit over the wire — the same file is the request body:

```sh
curl -X POST 'https://gofabric.dev/v1/workflows/runs?name=video/ai-shorts' \
  -H 'Authorization: Bearer fab_xxx' \
  -H 'content-type: application/yaml' \
  --data-binary @my-run.yaml
```

Every workflow also accepts the universal `WorkflowInput` fields — `variants` (1–10 fan-out) and `regenerate` (creative-direction hints with run lineage). See Run-specs (YAML / TOML / JSON) for the full top-level shape (`metadata`, `priority`, `bundle`, `parent`, etc.).
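For example, the universal fields can sit alongside the workflow input in one run-spec (a sketch — the creative-direction strings below are placeholders, and the exact semantics of each `regenerate` key are defined by the workflow):

```yaml
workflow: video/ai-shorts
variants: 3                    # run the pipeline 3 times with different sampling
regenerate:
  direction: "make the hook punchier"      # placeholder hint
  keep: "voiceover"                        # placeholder — element to preserve
  extra_instructions: "shorter sentences"  # placeholder
input:
  topic: "How lithium-ion batteries degrade"
```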

Schema coverage notes:

  • Task generate_script has no Pydantic types — contract is opaque to consumers.
  • Task merge_pre_production has no Pydantic types — contract is opaque to consumers.
  • Task reconcile_continuity has no Pydantic types — contract is opaque to consumers.
  • Task generate_ai_actor has no Pydantic types — contract is opaque to consumers.
  • Task generate_voiceover has no Pydantic types — contract is opaque to consumers.
  • Task generate_all_broll has no Pydantic types — contract is opaque to consumers.
  • Task generate_bgm has no Pydantic types — contract is opaque to consumers.
  • Task merge_generation has no Pydantic types — contract is opaque to consumers.
  • Task generate_talking_heads has no Pydantic types — contract is opaque to consumers.
  • Task lipsync_talking_heads has no Pydantic types — contract is opaque to consumers.
  • Task mix_audio has no Pydantic types — contract is opaque to consumers.
  • Task prepare_for_post_processing has no Pydantic types — contract is opaque to consumers.
  • Task burn_subtitles has no Pydantic types — contract is opaque to consumers.
  • Task burn_hook_overlay_safe has no Pydantic types — contract is opaque to consumers.
  • Task ai-shorts-output has no Pydantic types — contract is opaque to consumers.
  • Task collect_final_output has no Pydantic types — contract is opaque to consumers.