Research Workflows

Fabric’s research workflows discover problems, validate opportunities, and generate execution-ready business playbooks. Three core workflows compose into a pipeline that transforms raw platform content into actionable startup plans.

Pipeline Overview

research/problem-intelligence     Discover problems across platforms
         │
         │ ideas.0, ideas.1, ...
         │
    ┌────┴────┐
    │         │
research/  research/
playbook   idea-deep-dive

Workflow	Purpose	Output
`research/problem-intelligence`	Cross-platform problem discovery	Ranked clusters, ideas, specs
`research/playbook`	Execution-ready business playbook	Single opinionated playbook
`research/idea-deep-dive`	Broad market analysis	Market validation, pricing, GTM, revenue

Problem Intelligence

Workflow: research/problem-intelligence

Ingests content from 8 platforms: Reddit, Hacker News, YouTube, YouTube Shorts, TikTok, Instagram, Twitter, and forums. The workflow then extracts problems, normalizes and deduplicates them, clusters them by semantic similarity, detects existing solutions, scores opportunities deterministically, generates startup ideas, and produces SaaS specs.

Quick start

# Discover problems in a niche
fab-workflow research/problem-intelligence \
  --input niche="developer tools" \
  -o output/devtools.json

# Reddit discovery mode (auto-discover and crawl subreddits)
fab-workflow research/problem-intelligence \
  --input niche="project management" \
  --input reddit_discovery=true \
  --input depth=25 \
  -o output/pm-deep.json

# Monitor specific subreddits (no search query needed)
fab-workflow research/problem-intelligence \
  --input 'subreddits=["SaaS","startups","Entrepreneur"]' \
  -o output/saas-reddit.json

# Re-run with Reddit discovery on previous output
fab-workflow research/problem-intelligence \
  --input-file output/devtools.json \
  --input reddit_discovery=true \
  --input depth=25 \
  --input 'platforms=["reddit"]' \
  -o output/devtools-deep.json

Pipeline

plan_ingestion
       │
  ┌────┼──────────────────────────────────────┐
reddit  hackernews  youtube  shorts  tiktok  instagram  twitter  forums
  └────┼──────────────────────────────────────┘
       │
  merge_sources
       │
  extract_problems          (parallelized LLM batches)
       │
  normalize_problems        (text cleanup + embedding dedup)
       │
  cluster_problems          (cosine similarity > 0.88)
       │
  aggregate_solutions       (detect existing solutions)
       │
  compute_scores            (deterministic, no LLM)
       │
  generate_ideas            (1-3 per top cluster)
       │
  score_ideas → generate_specs → format_output
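
The cluster_problems step groups problems whose embeddings have cosine similarity above 0.88. The source does not say which clustering algorithm is used, so the following is only a minimal sketch that assumes a greedy single-pass strategy; the function names `cosine` and `cluster` are hypothetical and the toy 2-D vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(embeddings, threshold=0.88):
    """Greedy pass: each vector joins the first cluster whose seed it resembles."""
    clusters = []  # each entry: {"seed": vector, "members": [indices]}
    for index, vector in enumerate(embeddings):
        for candidate in clusters:
            if cosine(vector, candidate["seed"]) > threshold:
                candidate["members"].append(index)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append({"seed": vector, "members": [index]})
    return [c["members"] for c in clusters]

# Toy vectors: the first two point in nearly the same direction.
vectors = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(cluster(vectors))
```

A greedy pass is order-dependent; a production pipeline might instead use agglomerative clustering or compare against cluster centroids.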

Input

Parameter	Type	Default	Description
`query`	`string`	—	Search query (optional if `niche` or `subreddits` provided)
`niche`	`string`	—	Domain to focus on (auto-derives query)
`subreddits`	`list[str]`	`[]`	Specific subreddits to monitor
`forum_urls`	`list[str]`	`[]`	Forum URLs to ingest
`platforms`	`list[str]`	all 8	Platforms to query: `reddit`, `hackernews`, `youtube`, `youtube_shorts`, `tiktok`, `instagram`, `twitter`, `forums`
`depth`	`int`	`10`	Results per platform/subreddit
`reddit_discovery`	`bool`	`false`	Auto-discover relevant subreddits
`max_subreddits`	`int`	`15`	Max subreddits to discover
`reddit_concurrency`	`int`	`3`	Concurrent subreddit crawls
`extract_concurrency`	`int`	`3`	Concurrent LLM extraction batches
`top_clusters`	`int`	`10`	Clusters to generate ideas for
`top_ideas`	`int`	`5`	Ideas to generate specs for

At least one of query, niche, subreddits, or forum_urls is required.
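
The defaults and the "at least one of" rule from the table above can be expressed as a small validation step. This is a hedged sketch, not Fabric's actual code: `DEFAULTS`, `SEED_KEYS`, and `resolve_inputs` are hypothetical names, and only the default values are taken from the table.

```python
import copy

# Defaults copied from the input table; parameters without a default are omitted.
DEFAULTS = {
    "subreddits": [],
    "forum_urls": [],
    "depth": 10,
    "reddit_discovery": False,
    "max_subreddits": 15,
    "reddit_concurrency": 3,
    "extract_concurrency": 3,
    "top_clusters": 10,
    "top_ideas": 5,
}

# At least one of these must be non-empty for the workflow to know what to search.
SEED_KEYS = ("query", "niche", "subreddits", "forum_urls")

def resolve_inputs(user_inputs):
    """Overlay user inputs on the defaults and enforce the seed requirement."""
    merged = {**copy.deepcopy(DEFAULTS), **user_inputs}
    if not any(merged.get(key) for key in SEED_KEYS):
        raise ValueError("Provide at least one of: " + ", ".join(SEED_KEYS))
    return merged

resolved = resolve_inputs({"niche": "developer tools", "depth": 25})
print(resolved["depth"], resolved["top_ideas"])
```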

Reddit Discovery Mode

When reddit_discovery=true, the adapter auto-discovers relevant subreddits, crawls each in parallel across hot/top/new sort orders, and fetches full comment threads. It uses rotating browser user agents.

fab-workflow research/problem-intelligence \
  --input niche="developer tools" \
  --input reddit_discovery=true \
  --input depth=25 \
  --input max_subreddits=20 \
  --input reddit_concurrency=5 \
  -o output/devtools-deep.json
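
To illustrate what a setting like reddit_concurrency controls, here is a minimal sketch of bounded parallel crawling with an asyncio semaphore. It assumes an async design, which the source does not confirm; `crawl_all` and `fake_crawl` are hypothetical names, and the fake crawler only sleeps instead of making requests.

```python
import asyncio

async def crawl_all(subreddits, crawl_one, concurrency=3):
    """Run crawl_one for every subreddit, at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(name):
        async with semaphore:
            return await crawl_one(name)

    # gather returns results in the same order as the input list.
    return await asyncio.gather(*(bounded(name) for name in subreddits))

# Stand-in crawler that records how many crawls overlap; a real one would do HTTP.
stats = {"active": 0, "peak": 0}

async def fake_crawl(name):
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1
    return f"{name}: ok"

results = asyncio.run(
    crawl_all(["SaaS", "startups", "webdev"], fake_crawl, concurrency=2)
)
print(results, "peak concurrency:", stats["peak"])
```

Raising the limit shortens wall-clock time but sends requests to Reddit more aggressively, so it trades speed against the risk of throttling.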

Scoring

All opportunity scores are deterministic (no LLM) and bounded to the range [0, 1]:

pain        = frustration*0.4 + urgency*0.3 + (1 - sentiment)*0.3
demand      = log(mentions+1)/log(max+2)*0.6 + log(users+1)/log(max+2)*0.4
growth      = sigmoid(velocity * 2)
solution_gap = dissatisfaction*0.7 + (1 - solution_density)*0.3
recency     = exp(-days_since_last / 7)

opportunity = pain*0.30 + demand*0.25 + growth*0.15 + solution_gap*0.20 + recency*0.10
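
The formulas above translate directly into code. The sketch below is an illustration under stated assumptions rather than the actual scorer: it assumes every signal (frustration, urgency, sentiment, dissatisfaction, solution_density) is already normalized to [0, 1], and that `max` in the demand formula means the largest mention count and the largest user count across all clusters, passed here as the hypothetical parameters `max_mentions` and `max_users`. The log base does not matter because only ratios of logarithms appear.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def opportunity_score(frustration, urgency, sentiment, mentions, users,
                      max_mentions, max_users, velocity,
                      dissatisfaction, solution_density, days_since_last):
    """Compute the five component scores and the weighted opportunity score."""
    pain = frustration * 0.4 + urgency * 0.3 + (1 - sentiment) * 0.3
    demand = (math.log(mentions + 1) / math.log(max_mentions + 2) * 0.6
              + math.log(users + 1) / math.log(max_users + 2) * 0.4)
    growth = sigmoid(velocity * 2)
    solution_gap = dissatisfaction * 0.7 + (1 - solution_density) * 0.3
    recency = math.exp(-days_since_last / 7)
    # The five weights sum to 1.0, so the result stays in [0, 1]
    # whenever every component does.
    opportunity = (pain * 0.30 + demand * 0.25 + growth * 0.15
                   + solution_gap * 0.20 + recency * 0.10)
    return {
        "pain": pain,
        "demand": demand,
        "growth": growth,
        "solution_gap": solution_gap,
        "recency": recency,
        "opportunity": opportunity,
    }

scores = opportunity_score(
    frustration=0.75, urgency=0.5, sentiment=0.2,
    mentions=12, users=8, max_mentions=12, max_users=8,
    velocity=0.0, dissatisfaction=0.6, solution_density=0.4,
    days_since_last=0,
)
print(round(scores["opportunity"], 3))
```

Because no step calls an LLM or uses randomness, the same inputs always produce the same score, which is what makes rankings reproducible between runs.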

Output

{
  "workflow": "research/problem-intelligence",
  "scope": "niche",
  "niche": "developer tools",
  "ranked_clusters": [
    {
      "id": "cluster_0",
      "canonical_problem": "...",
      "opportunity_score": 0.82,
      "mention_count": 12,
      "unique_users": 8,
      "avg_frustration": 0.75
    }
  ],
  "discovered_subreddits": [
    {"name": "webdev", "subscribers": 3214996}
  ],
  "source_documents": [],
  "problem_mentions": [],
  "clusters": [],
  "solution_mentions": [],
  "ideas": [],
  "specs": []
}
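
As a sketch of how the output might be consumed, the snippet below filters ranked_clusters by opportunity_score. The field names come from the sample above; the second cluster, the `summarize` helper, and the 0.7 cutoff are made up for illustration.

```python
import json

# Trimmed sample in the shape shown above; cluster_1 is invented for the example.
SAMPLE = """
{
  "workflow": "research/problem-intelligence",
  "ranked_clusters": [
    {"id": "cluster_0", "canonical_problem": "...", "opportunity_score": 0.82,
     "mention_count": 12, "unique_users": 8, "avg_frustration": 0.75},
    {"id": "cluster_1", "canonical_problem": "...", "opportunity_score": 0.64,
     "mention_count": 5, "unique_users": 4, "avg_frustration": 0.60}
  ],
  "ideas": [],
  "specs": []
}
"""

def summarize(result, min_score=0.7):
    """Return (id, score, mentions) for clusters at or above min_score, best first."""
    strong = [c for c in result.get("ranked_clusters", [])
              if c["opportunity_score"] >= min_score]
    strong.sort(key=lambda c: c["opportunity_score"], reverse=True)
    return [(c["id"], c["opportunity_score"], c["mention_count"]) for c in strong]

result = json.loads(SAMPLE)
for cluster_id, score, mentions in summarize(result):
    print(f"{cluster_id}: score={score:.2f}, mentions={mentions}")
```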

Idea Deep Dive

Workflow: research/idea-deep-dive

Broad market analysis for due diligence before committing to an idea. Runs 8 sequential LLM calls covering market validation, product variants, pricing, landing page, content strategy, go-to-market, and revenue projections.

Quick start

# From problem-intelligence output (pick top idea)
fab-workflow research/idea-deep-dive \
  --input-file output/devtools.json --pick "ideas.0" \
  -o output/deep-dive.json

# Pick a different idea
fab-workflow research/idea-deep-dive \
  --input-file output/devtools.json --pick "ideas.1" \
  -o output/deep-dive-idea2.json

# Manual input
fab-workflow research/idea-deep-dive \
  --input name="TaskFlow" \
  --input description="AI project management for solo devs" \
  --input target_persona="indie developers" \
  -o output/deep-dive.json

Pipeline

prepare_idea → validate_market → generate_product_variants →
research_pricing → design_landing_page → plan_content_strategy →
plan_go_to_market → project_revenue → format_output

What it produces

Section	Contents
`market_validation`	TAM/SAM/SOM, 5-10 competitors, timing, audience deep dive, demand signals
`product_variants`	All viable forms ranked (SaaS, mobile, extension, templates, course, community, etc.)
`pricing_research`	Willingness to pay, 3-4 tiers, pricing psychology, estimated ARPU
`landing_page`	Hero, problem, solution, features, social proof, FAQ — with actual copy
`content_strategy`	10 blog posts, 5 videos, 3 lead magnets, social plan, email sequence, SEO keywords
`go_to_market`	Pre-launch, launch week day-by-day, first 100 users, first 1000 users, partnerships
`revenue_projections`	3 scenarios with month-by-month MRR, breakeven timeline, LTV/CAC

Playbook Generation

Workflow: research/playbook

Generates a high-fidelity, execution-ready business playbook via 8 parallel LLM calls. The playbook includes naming, MVP strategy, a build spec, monetization, a 14-day social schedule across 6 platforms (Twitter, Reddit, LinkedIn, TikTok, Instagram, YouTube Shorts), a day-by-day execution plan, success metrics, and launch assets with ready-to-post content for every platform.

Quick start

# From problem-intelligence output (pick top idea)
fab-workflow research/playbook \
  --input-file output/devtools.json --pick "ideas.0" \
  -o output/playbook.json

# Pick a different idea
fab-workflow research/playbook \
  --input-file output/devtools.json --pick "ideas.2" \
  -o output/playbook-idea3.json

# From a deep Reddit crawl
fab-workflow research/playbook \
  --input-file output/devtools-deep.json --pick "ideas.0" \
  -o output/playbook-deep.json

# Manual input (no prior workflow needed)
fab-workflow research/playbook \
  --input name="TaskFlow" \
  --input description="AI project management for solo devs" \
  --input target_persona="indie developers" \
  -o output/playbook.json

What it produces

Section	Contents
`name_suggestions`	8-10 brandable product names
`opportunity`	Problem, persona, why now, market signals
`evidence`	Summary, real user quotes, pain signals
`product`	Positioning, value prop, core features, MVP scope, what NOT to build
`mvp_strategy`	Where to start, day-1 version, week-1 version, validation question
`build`	Complexity, time to MVP, stack, file structure, DB schema, API endpoints, unknowns
`monetization`	Model, price points, competitor pricing, when/how to charge, free tier strategy
`success_metrics`	Day-1/week-1/month-1 signals, activation metric, failure signals, revenue target
`distribution`	14-day social schedule across 6 platforms, email templates, DM scripts, Fabric automation
`execution_plan`	Day-by-day plan (day 1 through day 7, then weeks 2-4) with hourly time estimates
`extensions`	What to build next, automation opportunities, feature/platform expansion
`risks`	Product, market, execution risks with likelihood and mitigation
`assets`	Landing page (hero, features, FAQ), launch posts for Twitter/Reddit/LinkedIn/TikTok/Instagram/YouTube Shorts, pitch

Design principles

Grounded — all claims tie back to real problem data and user quotes
Opinionated — makes decisions, chooses positioning, recommends strategy
Executable — everything answers “can someone act on this immediately?”
MVP-disciplined — aggressively limits scope, includes explicit “what not to build”

Piping Workflows with --input-file

All research workflows accept --input-file to load a previous workflow’s output. Combined with --pick (dot-notation path) and --map (rename fields), this enables chaining without glue code.

Pick an idea and generate a playbook

fab-workflow research/problem-intelligence \
  --input niche="developer tools" -o output/devtools.json

fab-workflow research/playbook \
  --input-file output/devtools.json --pick "ideas.0" \
  -o output/playbook.json

Re-run with Reddit discovery

fab-workflow research/problem-intelligence \
  --input-file output/devtools.json \
  --input reddit_discovery=true \
  --input depth=25 \
  --input 'platforms=["reddit"]' \
  -o output/devtools-deep.json

Generate playbooks for all top ideas

for i in 0 1 2 3 4; do
  fab-workflow research/playbook \
    --input-file output/devtools.json --pick "ideas.$i" \
    -o "output/playbook-idea-$i.json"
done

How --pick works

Pick path	Extracts
`ideas.0`	First idea object
`ideas.2`	Third idea object
`ranked_clusters.0`	Top-ranked cluster
`specs.0`	First SaaS spec

The extracted value becomes the input for the next workflow. Any --input K=V flags merge on top.
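
The pick, map, and merge behavior described above can be approximated in a few lines. This is a sketch of the semantics as documented, not the CLI's implementation: `pick`, `apply_map`, and `build_input` are hypothetical helpers, and the idea fields (`name`, `description`) are assumed from the manual-input examples.

```python
def pick(data, path):
    """Follow a dot-notation path; numeric parts index into lists."""
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current

def apply_map(record, mapping):
    """Rename fields; mapping is {to_name: from_name}, mirroring --map TO=FROM."""
    renamed = dict(record)
    for to_key, from_key in mapping.items():
        if from_key in renamed:
            renamed[to_key] = renamed.pop(from_key)
    return renamed

def build_input(data, path=None, mapping=None, overrides=None):
    """Pick a sub-object, rename fields, then merge --input K=V values on top."""
    picked = pick(data, path) if path else data
    picked = apply_map(picked, mapping or {})
    return {**picked, **(overrides or {})}

previous = {
    "ideas": [
        {"name": "TaskFlow", "description": "AI project management for solo devs"},
        {"name": "OtherIdea", "description": "placeholder"},
    ]
}
next_input = build_input(
    previous, path="ideas.0", overrides={"target_persona": "indie developers"}
)
print(next_input)
```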

Choosing a Downstream Workflow

Use case	Workflow	Why
Start building today	`research/playbook`	Opinionated, concrete execution plan, launch assets
Validate before committing	`research/idea-deep-dive`	Market analysis, competitors, pricing, revenue
Both	Run both	They accept the same input format

Other Research Workflows

Deep Research

Workflow: research/deep_research

Multi-source fan-out research with parallel web, YouTube, Reddit, and RSS search.

fab-workflow research/deep_research \
  --input query="Why are developers switching to Rust" \
  -o output/rust-research.json

Hot Topics

Workflow: research/hot_topics

Scans trending topics across categories with content opportunity scoring.

fab-workflow research/hot_topics \
  --input categories='["tech", "business"]' \
  -o output/hot-topics.json

Competitive Intelligence

Workflow: research/competitive

Analyzes competitor content strategy to find gaps.

fab-workflow research/competitive \
  --input competitors='["@mkbhd", "@fireship"]' \
  --input niche="tech education" \
  -o output/competitive.json

Trend Research

Workflow: research/trends

Multi-signal trend analysis with content idea generation.

fab-workflow research/trends \
  --input niche="AI agents" \
  --input platform="TikTok" \
  -o output/trends.json

Content Discovery

Workflow: research/discovery

Finds high-performing content in a niche for repurposing.

fab-workflow research/discovery \
  --input niche="productivity" \
  --input platform="YouTube" \
  -o output/content-discovery.json

Platform Export

Workflow: video/export-platforms

Takes a finished 9:16 video and produces platform-ready exports with resize, duration trimming, and LLM-generated metadata.

# Export a video for multiple platforms
fab-workflow video/export-platforms \
  --input media_path=output/short.mp4 \
  --input topic="AI productivity tools" \
  --input 'platforms=["tiktok","instagram","youtube_shorts"]' \
  -o output/exports.json

Supported platforms

Platform	Aspect	Resolution	Max Duration	Metadata
`tiktok`	9:16	1080x1920	180s	Description (150 char), hashtags, sound suggestion
`instagram`	9:16	1080x1920	90s	Caption (2200 char), 20-30 hashtags, cover frame
`youtube_shorts`	9:16	1080x1920	60s	Title, description, tags, thumbnail text
`youtube`	16:9	1920x1080	unlimited	Title, description, 10-15 tags, thumbnail
`instagram_feed`	1:1	1080x1080	60s	Caption, 20-30 hashtags
`linkedin`	9:16	1080x1920	600s	Professional post text, 3-5 hashtags

Aspect ratio changes use blur-fill (blurred background, no black bars). Videos exceeding platform max duration are trimmed.
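
The table and the blur-fill rule can be sketched as an export planner. The resolutions and duration limits come from the table above; everything else is an assumption. In particular, the source does not say which tool performs the resize, so the ffmpeg filter graph below is just one common way to get a blurred background with the original video centered on top, and `PLATFORM_SPECS` and `plan_export` are hypothetical names.

```python
# (width, height, max duration in seconds or None for unlimited), from the table.
PLATFORM_SPECS = {
    "tiktok": (1080, 1920, 180),
    "instagram": (1080, 1920, 90),
    "youtube_shorts": (1080, 1920, 60),
    "youtube": (1920, 1080, None),
    "instagram_feed": (1080, 1080, 60),
    "linkedin": (1080, 1920, 600),
}

def plan_export(platform, source_duration):
    """Return target size, trimmed duration, and a blur-fill filter graph."""
    width, height, max_duration = PLATFORM_SPECS[platform]
    duration = (source_duration if max_duration is None
                else min(source_duration, max_duration))
    # Background: scale up to cover the frame, crop, and blur.
    # Foreground: scale down to fit inside the frame, then center it on top.
    filter_graph = (
        f"[0:v]scale={width}:{height}:force_original_aspect_ratio=increase,"
        f"crop={width}:{height},boxblur=20[bg];"
        f"[0:v]scale={width}:{height}:force_original_aspect_ratio=decrease[fg];"
        f"[bg][fg]overlay=(W-w)/2:(H-h)/2"
    )
    return {
        "width": width,
        "height": height,
        "duration": duration,
        "filter_complex": filter_graph,
    }

plan = plan_export("youtube_shorts", source_duration=75)
print(plan["duration"], plan["width"], plan["height"])
```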

Platform Adapters (Ingestion)

Platform	API	Auth Required	Rate Limit
Reddit	Public JSON API (old.reddit.com)	No	2s between calls
Hacker News	Algolia API	No	0.5s between calls
YouTube	yt-dlp + comments	No	—
YouTube Shorts	yt-dlp + `#shorts` filter (90s max)	No	—
TikTok	yt-dlp `tiktoksearch` or DuckDuckGo fallback	No	—
Instagram	Graph API or DuckDuckGo + Jina Reader fallback	`INSTAGRAM_ACCESS_TOKEN` for Graph API	2s between calls
Twitter	API v2 or DuckDuckGo fallback	`TWITTER_BEARER_TOKEN` for API v2	—
Forums	Jina Reader	No	—
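
A fixed delay between calls, as listed in the Rate Limit column, can be enforced with a small minimum-interval limiter. The following is a sketch only; `MinIntervalLimiter` is a hypothetical class, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time

class MinIntervalLimiter:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        if self.last_call is not None:
            remaining = self.interval - (self.clock() - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
        # Record the time after any sleep so the next call measures from here.
        self.last_call = self.clock()

# Intervals taken from the table; call limiter.wait() before each request.
reddit_limiter = MinIntervalLimiter(2.0)
hackernews_limiter = MinIntervalLimiter(0.5)
reddit_limiter.wait()  # the first call never sleeps
print("ready to request")
```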

Environment Variables

Variable	Required	Description
`GEMINI_API_KEY`	Yes	LLM extraction, ideation, spec generation, playbook, deep dive
`OPENAI_API_KEY`	No	Embeddings (falls back to sentence-transformers locally)
`TWITTER_BEARER_TOKEN`	No	Twitter API v2 (falls back to web search)
`INSTAGRAM_ACCESS_TOKEN`	No	Instagram Graph API (falls back to web search)
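
The fallback rules in the table amount to a simple selection based on which variables are set. The sketch below shows that logic; `select_backends` and the backend labels it returns are hypothetical and only mirror the table's descriptions.

```python
import os

def select_backends(env=None):
    """Choose a backend per capability based on which variables are set."""
    env = os.environ if env is None else env
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        # Optional keys fall back to the alternatives listed in the table.
        "embeddings": "openai" if env.get("OPENAI_API_KEY")
                      else "sentence-transformers",
        "twitter": "api_v2" if env.get("TWITTER_BEARER_TOKEN") else "web_search",
        "instagram": "graph_api" if env.get("INSTAGRAM_ACCESS_TOKEN")
                     else "web_search",
    }

# Passing a dict instead of os.environ keeps the example self-contained.
backends = select_backends({"GEMINI_API_KEY": "test-key"})
print(backends)
```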

CLI Flags

Flag	Description
`-o FILE`	Save output JSON to file
`-v` / `--verbose`	Print full JSON to terminal
`--input K=V`	Set input key-value (repeatable)
`--input-file FILE`	Load input from JSON file
`--pick PATH`	Extract nested field from input file
`--map TO=FROM`	Rename fields from input file
`--model MODEL`	Override LLM model
`--quality PROFILE`	Quality profile (local, cheap, premium)