Keyframe Grid Generation
The keyframe grid generates 4 scenes in a single 2x2 image, then crops them into individual keyframes. Because all panels share one generation pass, the model maintains consistent lighting, color, and style across them. These keyframes then serve as visual anchors for downstream image-to-video generation.
This technique is inspired by the TechHalla/Atlabs 2x2 Grid Method, adapted here for automated video pipelines.
Why a Grid?
The AI Shorts pipeline generates b-roll segments independently, each as a separate API call. Even with a shared continuity brief (a text prefix that pins palette and grain), each generation starts from scratch, which can produce subtle visual drift between segments.
The grid solves this by generating multiple scenes simultaneously:

    ┌───────────────────┬───────────────────┐
    │                   │                   │
    │   Hook scene      │   Problem scene   │
    │   (segment 0)     │   (segment 1)     │
    │                   │                   │
    ├───────────────────┼───────────────────┤
    │                   │                   │
    │   Solution scene  │   CTA scene       │
    │   (segment 2)     │   (segment 3)     │
    │                   │                   │
    └───────────────────┴───────────────────┘

All 4 panels are forced to share the same color temperature, lighting geometry, and film texture because they exist in the same diffusion context.
How It Works
1. Segment selection: up to 4 b-roll segments are selected by narrative priority (hook > solution > cta > problem > insight > demo > proof).

2. Grid prompt construction: a single prompt describes all 4 panels, with the continuity brief prepended:

       Unified palette: warm amber with teal shadows. Consistent Kodak Vision3 500T grain.

       Generate a single 2x2 grid image containing exactly 4 panels arranged in a
       square layout, separated by thin white borders. Each panel shows a different
       scene from the same visual universe — identical color palette, lighting
       temperature, and film grain texture across all panels.

       Top-left panel: Dramatic wide shot of a city skyline at golden hour...
       Top-right panel: Close-up of hands typing on a keyboard...
       Bottom-left panel: Person standing confidently in front of a whiteboard...
       Bottom-right panel: Aerial view pulling back to reveal the full landscape...

3. Image generation: the grid is generated as a single 1:1 aspect ratio image using the configured image model (Imagen, SDXL Turbo, Flux Schnell, etc.).

4. Cropping: the grid image is split into 4 individual keyframe PNG files via PIL.

5. Downstream use: when image-to-video models are used for b-roll (e.g., Kling v3 i2v), the pre-generated keyframe replaces the per-segment still image, so all b-roll videos start from visually consistent reference frames.
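The selection and prompt-construction steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the SDK's implementation: the segment fields (`index`, `type`, `role`, `visual`), the helper names `pick_segments` and `grid_prompt`, and the choice to restore script order after ranking are all assumptions made for the example.

```python
from typing import Dict, List

# Narrative priority as listed in the steps above; earlier entries win.
PRIORITY = ["hook", "solution", "cta", "problem", "insight", "demo", "proof"]
POSITIONS = ["Top-left", "Top-right", "Bottom-left", "Bottom-right"]

GRID_INSTRUCTIONS = (
    "Generate a single 2x2 grid image containing exactly 4 panels arranged in a "
    "square layout, separated by thin white borders."
)


def pick_segments(segments: List[Dict], max_panels: int = 4) -> List[Dict]:
    """Keep the highest-priority b-roll segments, then restore script order."""

    def rank(seg: Dict) -> int:
        role = seg.get("role")
        # Unknown roles sort after every known role.
        return PRIORITY.index(role) if role in PRIORITY else len(PRIORITY)

    # Hypothetical schema: only segments typed "broll" are candidates.
    broll = [s for s in segments if s.get("type") == "broll"]
    chosen = sorted(broll, key=rank)[:max_panels]
    # Assumption: panels are laid out in script order, not priority order.
    return sorted(chosen, key=lambda s: s["index"])


def grid_prompt(selected: List[Dict], continuity_brief: str) -> str:
    """Prepend the continuity brief, then describe one panel per grid position."""
    panel_lines = [
        f"{pos} panel: {seg['visual']}" for pos, seg in zip(POSITIONS, selected)
    ]
    return "\n".join([continuity_brief, GRID_INSTRUCTIONS, *panel_lines])
```

Because the brief is the first line of the prompt, the palette and grain wording applies to every panel description that follows it.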
Enabling Keyframe Grid
Option 1: Explicit opt-in

    fabric run global/ai-shorts \
      --input topic="AI productivity" \
      --input use_keyframe_grid=true

Option 2: Quality profile
Profiles with a non-skip keyframe_grid model enable it automatically:

| Profile | Grid Model | Enabled? |
|---|---|---|
| (default) | imagen-4.0-fast-generate-001 | Yes |
| premium | imagen-4.0-fast-generate-001 | Yes |
| local | sdxl-turbo | Yes |
| local-power | flux-schnell | Yes |
| cheap | skip | No |
| local-light | skip | No |
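
The rule in the table reduces to a single lookup: the grid is on whenever the profile's grid model is anything other than `skip`. The snippet below is a hedged illustration of that rule using the table's values; the dictionary, the `"default"` key standing in for the "(default)" row, and the function name are hypothetical and not part of the documented configuration format.

```python
# Grid model per quality profile, copied from the table above.
# "default" is a stand-in key for the "(default)" row.
GRID_MODEL_BY_PROFILE = {
    "default": "imagen-4.0-fast-generate-001",
    "premium": "imagen-4.0-fast-generate-001",
    "local": "sdxl-turbo",
    "local-power": "flux-schnell",
    "cheap": "skip",
    "local-light": "skip",
}


def grid_enabled(profile: str = "default") -> bool:
    """A profile enables the grid when its grid model is not the 'skip' sentinel."""
    return GRID_MODEL_BY_PROFILE[profile] != "skip"
```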
Option 3: Override model per-run

    fabric run global/ai-shorts \
      --input topic="AI productivity" \
      --input keyframe_grid_model="flux-schnell"

Edge Cases

| Scenario | Behavior |
|---|---|
| 1 b-roll segment | Grid skipped — not enough panels to benefit from shared context |
| 2 b-roll segments | Side-by-side (1x2) layout at 16:9, cropped into 2 keyframes |
| 3 b-roll segments | 2x2 grid with 4th panel as an establishing shot derived from the topic |
| 4 b-roll segments | Standard 2x2 grid |
| 5+ b-roll segments | Multiple grids in batches of 4, all sharing the same continuity brief |
| Grid generation fails | Returns empty map — downstream falls back to per-segment still generation |
| No i2v model | Keyframes are generated but unused by video gen; available as storyboard artifacts |
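
The layout rules in the table, and the crop geometry they imply, can be expressed as pure arithmetic. The sketch below is illustrative only. The helper names are hypothetical, it assumes equal-sized cells, it does not trim the thin white borders the prompt asks for, and it leaves open how a final batch of one segment is handled, since the table does not say.

```python
from typing import List, Optional, Sequence, Tuple

# (left, upper, right, lower): the box format accepted by PIL's Image.crop.
Box = Tuple[int, int, int, int]


def layout_for(num_segments: int) -> Optional[Tuple[int, int]]:
    """Return (rows, cols) for one grid of up to 4 segments, or None when skipped."""
    if num_segments <= 1:
        return None      # a single segment gains nothing from shared context
    if num_segments == 2:
        return (1, 2)    # side-by-side layout
    return (2, 2)        # 3 segments get an extra establishing-shot panel


def batches(items: Sequence, size: int = 4) -> List[list]:
    """Split segments into consecutive groups of at most `size`."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def crop_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Crop boxes in row-major order (top-left first), assuming equal cells."""
    cell_w, cell_h = width // cols, height // rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]
```

With Pillow installed, each box could be passed to `Image.open(path).crop(box)` and the result saved as a PNG keyframe.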
Continuity Brief + Grid: How They Complement

| | Continuity Brief (text) | Keyframe Grid (visual) |
|---|---|---|
| What it does | Prepends palette/film-stock/atmosphere text to every prompt | Generates reference images from a single shared generation pass |
| Consistency type | Color grading, grain texture, atmospheric feel | Pixel-level lighting geometry, character details, environment |
| When it helps | Always — works with any model | Most valuable with image-to-video models |
| Cost | Zero — pure logic, no API call | One image generation per 4 segments |
| Latency | None | ~10-30s depending on model |
Both are applied together. The grid prompt includes the continuity brief, and downstream video prompts still get the brief prepended regardless.
SDK API
The grid module is importable for use in custom workflows:

    from fabric_workflow_sdk._keyframe_grid import (
        generate_keyframe_grid,
        select_grid_segments,
        build_grid_prompt,
        crop_grid_to_keyframes,
    )

    # Full orchestration (await must run inside an async function)
    keyframe_map = await generate_keyframe_grid(
        input_dict,
        segments=script["segments"],
        continuity_brief=script["_continuity_brief"],
        topic="AI productivity",
    )
    # keyframe_map = {0: "/tmp/kf_0.png", 3: "/tmp/kf_1.png", ...}

    # Or use individual functions
    selected = select_grid_segments(segments, max_panels=4)
    prompt = build_grid_prompt(selected, continuity_brief, topic="AI")
    # Generate grid image with your preferred method...
    keyframes = crop_grid_to_keyframes("/tmp/grid.png", num_panels=4)
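
Since `keyframe_map` is keyed by segment index and is empty when grid generation fails, a custom workflow can treat it as an optional override. The helper below is a hypothetical sketch of that fallback, not an SDK function; the `still_paths` mapping of per-segment stills is assumed to come from your own pipeline.

```python
from typing import Dict


def start_frames(keyframe_map: Dict[int, str], still_paths: Dict[int, str]) -> Dict[int, str]:
    """Prefer the grid keyframe for a segment; otherwise keep its per-segment still."""
    return {idx: keyframe_map.get(idx, still) for idx, still in still_paths.items()}
```

An empty `keyframe_map` leaves every segment on its own still, which matches the documented fallback behavior.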