Media Workflows
Media workflows handle the core video and audio processing primitives — transcription, reframing, subtitles, and format conversion. They are used as building blocks by higher-level pipelines (AI Shorts, Clip Generator, YouTube Studio).
Transcription
Section titled “Transcription”Workflow: media/transcription
Downloads video/audio from a URL or local path, then produces word-level transcripts via Faster Whisper.
# From YouTubefabric run media/transcription --input url="https://youtube.com/watch?v=..."
# From local filefabric run media/transcription --input media_path="/path/to/video.mp4"Pipeline
Section titled “Pipeline”download_media → transcribe_audio| Parameter | Type | Default | Description |
|---|---|---|---|
url / video_url / media_url | string | — | URL to download (HTTP or YouTube) |
media_path / video_path | string | — | Local file path |
whisper_model | string | "large-v3" | Faster Whisper model size |
language | string | auto-detect | Language code (e.g., "en") |
Output
Section titled “Output”{ "media_path": "/tmp/downloaded.mp4", "full_text": "The complete transcribed text...", "transcript": [ { "start": 0.0, "end": 2.5, "text": "Hello and welcome", "words": [ {"word": "Hello", "start": 0.0, "end": 0.4, "probability": 0.98}, {"word": "and", "start": 0.5, "end": 0.6, "probability": 0.95}, {"word": "welcome", "start": 0.7, "end": 1.2, "probability": 0.97} ] } ]}Vertical Reframe
Section titled “Vertical Reframe”Workflow: media/reframe
Converts landscape (16:9) video to vertical (9:16) with intelligent subject tracking using MediaPipe face detection and YOLOv8 person detection.
fabric run media/reframe --input media_path="/path/to/landscape.mp4"Pipeline
Section titled “Pipeline”detect_and_track_subjects → crop_to_vertical| Parameter | Type | Default | Description |
|---|---|---|---|
media_path | string | required | Path to landscape video |
Output
Section titled “Output”{ "vertical_path": "/tmp/vertical_9x16.mp4", "video_width": 1080, "video_height": 1920, "fps": 30, "total_frames": 900, "tracking_data": { "faces": [...], "persons": [...] }}Subtitles
Section titled “Subtitles”Workflow: media/subtitles
Aligns word-level timestamps into subtitle chunks and burns them into the video using FFmpeg ASS rendering.
fabric run media/subtitles \ --input media_path="/path/to/video.mp4" \ --input subtitle_max_chars=40Pipeline
Section titled “Pipeline”align_words → render_subtitles| Parameter | Type | Default | Description |
|---|---|---|---|
transcript | list[dict] | required | Word-level transcript from transcription |
media_path | string | required | Video to burn subtitles into |
subtitle_max_chars | int | 40 | Max characters per subtitle line |
Output
Section titled “Output”{ "subtitles": [ {"start": 0.0, "end": 2.5, "text": "Hello and welcome"} ], "subtitled_path": "/tmp/subtitled.mp4"}Transcode
Section titled “Transcode”Workflow: media/transcode
FFmpeg-based format and codec conversion with configurable quality.
fabric run media/transcode \ --input media_path="/path/to/input.mov" \ --input codec="libx265" \ --input crf=20| Parameter | Type | Default | Description |
|---|---|---|---|
media_path | string | required | Input video path |
codec | string | "libx264" | Video codec (libx264, libx265) |
format | string | "mp4" | Output format |
crf | int | 23 | Quality (lower = better, 0-51) |
preset | string | "medium" | Speed/quality tradeoff (ultrafast to veryslow) |
resolution | string | original | Target resolution (e.g., "1920x1080") |
audio_codec | string | "aac" | Audio codec |
audio_bitrate | string | "128k" | Audio bitrate |
Multi-Format Resize
Section titled “Multi-Format Resize”Workflow: media/resize
Produces multiple aspect ratio variants for cross-platform distribution.
fabric run media/resize \ --input media_path="/path/to/video.mp4" \ --input formats='["1:1", "16:9", "9:16", "4:5"]' \ --input strategy="pad"| Parameter | Type | Default | Description |
|---|---|---|---|
media_path | string | required | Input video |
formats | list[str] | ["9:16"] | Target aspect ratios |
strategy | string | "pad" | Resize strategy: pad (blurred fill), crop, or fit |
Output
Section titled “Output”{ "resized_variants": [ {"format": "1:1", "path": "/tmp/square.mp4", "dimensions": [1080, 1080]}, {"format": "9:16", "path": "/tmp/vertical.mp4", "dimensions": [1080, 1920]} ]}