Music Workflows
Music workflows generate AI instrumentals and package them into release bundles ready for manual upload to DistroKid (and onward to Spotify, Apple Music, etc.). Phase 1 covers passive instrumental niches — sleep, study, lo-fi, ambient, white noise, focus, meditation — using fabric’s existing stages/music.py (Stable Audio Open via FAL, or local fallback). Vocal song generation (ACE-Step 1.5) is planned for Phase 3 and not yet wired.
Commercial-license clearance
The release pipeline only accepts music models whose licenses permit monetized commercial use:
| Model | License | Status |
|---|---|---|
| fal-ai/stable-audio | Stable Audio Open via FAL | ✅ Default |
| stable-audio-open | Stability Community License (free for entities under $1M ARR) | ✅ Local fallback |
| musicgen-small/medium/large | CC-BY-NC 4.0 | ❌ Blocked when commercial_only=True |
| audioldm2, mustango, jasco | CC-BY-NC / research-only | ❌ Blocked |
The music/release workflow forces commercial_only=True on the underlying music stage, so non-commercial models cannot be selected even if requested. If you need a non-commercial model for non-monetized work, use stages/music.py directly.
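The gate described above can be pictured as a simple allowlist check. The following is a minimal sketch, not fabric's actual implementation: `COMMERCIAL_OK`, `LicenseError`, and `resolve_model` are hypothetical names, and the model identifiers are copied from the table.

```python
# Hypothetical allowlist mirroring the table above; not fabric's real data structure.
COMMERCIAL_OK = {"fal-ai/stable-audio", "stable-audio-open"}


class LicenseError(ValueError):
    """Raised when a model is not cleared for monetized release."""


def resolve_model(requested: str, commercial_only: bool = True) -> str:
    """Return the requested model name, or raise if the gate rejects it."""
    if commercial_only and requested not in COMMERCIAL_OK:
        raise LicenseError(
            f"{requested!r} is not on the commercial-clearance allowlist"
        )
    return requested


# The release workflow always passes commercial_only=True, so this raises:
try:
    resolve_model("musicgen-small")
except LicenseError as exc:
    print(f"rejected: {exc}")
```

With `commercial_only=False` (the assumed behavior when calling the music stage directly for non-monetized work), the same call passes the name through unchanged.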
Choosing a music backend
| Path | Hardware | Reliability | Cost | Recommended for |
|---|---|---|---|---|
| fal-ai/stable-audio (FAL hosted) | Any | ✅ Production | ~$0.04/30s clip | All Mac users; production runs on any host |
| stable-audio-open (local) | CUDA GPU | ✅ Reliable | GPU electricity | Linux/Windows with NVIDIA GPU |
| stable-audio-open (local) | Mac Apple-Silicon | ⚠️ Unstable | Free | Not recommended (see below) |
| ACE-Step 1.5 (Phase 3, deferred) | 6–24GB VRAM tiered | TBD | GPU electricity | Will replace stable-audio-open as the local path once Phase 3 ships |
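
To put the FAL price in the table into per-track terms, here is a rough estimate. It assumes (this is not stated by the source) that a long track is billed as a whole number of 30-second clips with no overlap between segments; `estimate_fal_cost` is a hypothetical helper, and the price is the approximate figure from the table.

```python
import math

FAL_PRICE_PER_CLIP_USD = 0.04  # approximate figure from the table above
CLIP_SECONDS = 30              # assumed billing unit


def estimate_fal_cost(track_seconds: float) -> float:
    """Estimate the FAL cost of one track, rounding up to whole clips."""
    clips = math.ceil(track_seconds / CLIP_SECONDS)
    return round(clips * FAL_PRICE_PER_CLIP_USD, 2)


# A 4-minute track is 8 clips of 30 s each.
print(estimate_fal_cost(240))  # 0.32
```

If segments overlap for crossfading, the real clip count (and cost) would be somewhat higher than this estimate.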
On Mac Apple-Silicon specifically, stable-audio-open’s CosineDPMSolverMultistepScheduler hits a numerical RecursionError inside torchsde’s BrownianInterval near the end of the diffusion loop: the 100-step pipeline runs through ~99 steps and then crashes (verified locally on M-series hardware). This is an upstream torchsde + diffusers compatibility issue in fp32/CPU mode; CUDA fp16 works fine.
The realistic path on a Mac dev machine in 2026:
- Music gen → FAL (production-quality, ~$0.04/30s, no setup hurdles)
- Cover art → local (SDXL/Flux via fabric’s prefer-local default — works fine on Apple-Silicon)
- Wait for Phase 3 (ACE-Step 1.5) for a fully-local Mac path. ACE-Step uses a different scheduler that’s verified on Apple-Silicon CPU and runs the full 4-min track in one call (no need for the multi-segment crossfade-concat scaffold).
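The recommendation above amounts to a small decision rule. A minimal sketch, assuming the caller already knows whether a CUDA GPU is available; `pick_audio_model` is a hypothetical helper, not part of fabric, and the returned names are the `audio_model` values from the table.

```python
import platform
from typing import Optional


def pick_audio_model(
    has_cuda: bool,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> str:
    """Pick an audio_model value following the guidance in this section."""
    system = system or platform.system()
    machine = machine or platform.machine()
    is_apple_silicon = system == "Darwin" and machine == "arm64"

    # Local Stable Audio Open is only reliable on CUDA hardware.
    if has_cuda and not is_apple_silicon:
        return "stable-audio-open"
    # Everything else, including all Apple-Silicon Macs, goes through FAL.
    return "fal-ai/stable-audio"


print(pick_audio_model(has_cuda=False, system="Darwin", machine="arm64"))
```

The `system` and `machine` arguments exist only so the rule can be exercised without running on the target hardware.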
Running fully local (CUDA only — Mac users see table above)
Override audio_model to stable-audio-open, the only local model on the commercial-clearance allowlist. Cover art is already local under fabric’s prefer-local default.
```bash
# One-off
fab-workflow --from-file examples/run-inputs/music-release-ambient.yaml \
  --input audio_model=stable-audio-open

# Or use the committed local example
fab-workflow --from-file examples/run-inputs/music-release-ambient-local.yaml
```

One-time prerequisites for the local music backend
Two upfront steps are required before Stable Audio Open will run locally. Without both, the music stage silently falls back to fal-ai/stable-audio, burning FAL credits and defeating the point of running local:
1. Install `torchsde`. Stable Audio Open’s `CosineDPMSolverMultistepScheduler` depends on it. Without it, you’ll see a misleading `cannot import name 'StableAudioPipeline'` error; that’s the upstream import cascade hiding the real missing dependency. Install with `pip install torchsde`, or preferably reinstall the SDK with the `local-music` extra: `pip install -e 'sdks/fabric-workflow-sdk[local-music]'`.
2. Accept the HF gate. Stable Audio Open is a gated model: Stability requires you to accept their Community License before download.
   - Visit `stabilityai/stable-audio-open-1.0` and click “Agree and access repository”. Free; usually instant for individuals.
   - Authenticate your local HF cache: run `huggingface-cli login` and paste a read-token from hf.co/settings/tokens. Or set `HF_TOKEN=hf_xxx` in your shell.
First run downloads ~5GB of Stable Audio Open weights into ~/.cache/huggingface/. Subsequent runs are fast. Cover-art weights (~7GB SDXL or similar) are downloaded on first cover-gen run; SDXL is not gated.
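Because a missing prerequisite causes a silent fallback to FAL, a preflight check before a local run can be useful. The sketch below is hypothetical (`local_music_preflight` is not a fabric function). It assumes the Hugging Face token is either in the `HF_TOKEN` environment variable or in a `token` file under the HF cache directory, and it cannot verify that the license gate itself has been accepted.

```python
import importlib.util
import os
from pathlib import Path


def local_music_preflight() -> list:
    """Return a list of human-readable problems; empty means ready to try."""
    problems = []

    # Prerequisite 1: torchsde must be importable.
    if importlib.util.find_spec("torchsde") is None:
        problems.append("torchsde is not installed (pip install torchsde)")

    # Prerequisite 2: some Hugging Face credential must be present.
    # Assumed locations: HF_TOKEN env var, or <HF_HOME>/token on disk.
    hf_home = Path(
        os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface")
    )
    if not os.environ.get("HF_TOKEN") and not (hf_home / "token").exists():
        problems.append("no Hugging Face token found (huggingface-cli login)")

    return problems


for problem in local_music_preflight():
    print(f"preflight: {problem}")
```

Running this before `fab-workflow` gives an explicit message instead of an unexpected FAL charge.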
Known limitation on Mac CPU: stable-audio-open’s torchsde-based scheduler hits a numerical RecursionError in fp32/CPU mode at the end of the diffusion loop. CUDA hardware works reliably; on Mac Apple-Silicon, FAL is the more reliable path until ACE-Step (Phase 3) replaces this backend.
Why no truly-ungated commercial-cleared local model exists
The commercial-clearance gate blocks every popular non-Stable-Audio open music model (MusicGen, AudioLDM2, Mustango, JASCO; all CC-BY-NC or research-only). Stable Audio Open is the only OSS music model in 2026 with both a commercial-friendly license and working production weights. The HF gate is the cost of admission. If a fully license-clean and ungated alternative emerges, fabric will add it to the allowlist.
Single release
Workflow: music/release
Generates one instrumental track, masters it to -14 LUFS (Spotify normalization target), generates square cover art, and writes a DistroKid-ready bundle to output_dir/{artist}_{title}/.
```bash
fab-workflow music/release \
  --input prompt="deep sleep ambient pad, slow rain underneath, 432Hz, no drums, no vocals" \
  --input title="Drift Past Midnight" \
  --input artist_name="Hollow Field" \
  --input genre="Ambient" \
  --input release_date="2026-05-21" \
  --input 'songwriter_splits=[{"name":"Aneyzberg, Ari","share":100,"role":"composer","pro":"ASCAP"}]'
```

```bash
curl -X POST "$FABRIC_URL/v1/workflows/run?name=music/release" \
  -H "Authorization: Bearer $FABRIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "deep sleep ambient pad, slow rain underneath, 432Hz, no drums, no vocals",
      "title": "Drift Past Midnight",
      "artist_name": "Hollow Field",
      "genre": "Ambient",
      "release_date": "2026-05-21",
      "songwriter_splits": [
        {"name": "Aneyzberg, Ari", "share": 100, "role": "composer", "pro": "ASCAP"}
      ]
    }
  }'
```

Pipeline
```
prepare_release → generate_track → generate_cover → assemble_bundle → collect_release_output
```

prepare_release enforces Phase-1 invariants (no vocals, AI disclosure, songwriter splits sum to 100, auto_submit=false). generate_track calls the existing music stage with commercial_only=True. generate_cover produces square album art. assemble_bundle masters to -14 LUFS, re-encodes the cover to a 3000×3000 sRGB JPEG ≤10MB, and writes the metadata CSV + JSON sidecar.
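The arithmetic behind mastering to a loudness target is simple: a gain of x dB shifts integrated loudness by x LU. The sketch below shows only that arithmetic under the assumption that the integrated loudness has already been measured; `gain_to_target` is a hypothetical helper, and it says nothing about how assemble_bundle actually masters audio.

```python
TARGET_LUFS = -14.0  # Spotify normalization target named in this document


def gain_to_target(measured_lufs: float, target_lufs: float = TARGET_LUFS):
    """Return (gain in dB, linear amplitude factor) needed to hit the target."""
    gain_db = target_lufs - measured_lufs
    linear = 10 ** (gain_db / 20)  # dB to amplitude ratio
    return gain_db, linear


# A track measured at -20 LUFS needs +6 dB, roughly doubling the amplitude.
gain_db, linear = gain_to_target(-20.0)
print(f"{gain_db:+.1f} dB, x{linear:.3f}")
```

A static gain like this can push peaks past full scale, so a real mastering chain would also need peak limiting after the gain stage.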
Hard invariants
- vocals must be false in Phase 1; vocal song generation via ACE-Step 1.5 is Phase 3 (deferred).
- ai_disclosure must be true. This is a workflow-level invariant, set as a per-run default.
- distribution.auto_submit must be false. Phase 1 produces bundles for manual upload only.
- songwriter_splits must include at least one entry, and shares must sum to 100. This is required for publishing royalties via your PRO (ASCAP/BMI/SESAC) and a publishing admin (Songtrust, Sentric).
- Mastering target defaults to -14 LUFS integrated; runs that exceed it fail validation rather than being silently attenuated by Spotify.
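The input-level invariants above can be expressed as a small validator. This is a sketch under stated assumptions, not the code inside prepare_release: `validate_release` is a hypothetical name, the field names follow the invariants listed here, and the defaults for missing fields (vocals off, disclosure on, auto-submit off) are assumed.

```python
def validate_release(inputs: dict) -> list:
    """Return a list of invariant violations; empty means the inputs pass."""
    errors = []

    if inputs.get("vocals", False):
        errors.append("vocals must be false in Phase 1")
    if not inputs.get("ai_disclosure", True):
        errors.append("ai_disclosure must be true")
    if inputs.get("distribution", {}).get("auto_submit", False):
        errors.append("distribution.auto_submit must be false")

    splits = inputs.get("songwriter_splits") or []
    if not splits:
        errors.append("songwriter_splits needs at least one entry")
    else:
        total = sum(entry.get("share", 0) for entry in splits)
        if abs(total - 100) > 1e-6:
            errors.append(f"songwriter shares sum to {total}, expected 100")

    return errors


example = {
    "songwriter_splits": [
        {"name": "Aneyzberg, Ari", "share": 100, "role": "composer", "pro": "ASCAP"}
    ]
}
print(validate_release(example))  # []
```

Collecting every violation in one pass, rather than raising on the first, lets a caller fix all problems before re-running.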
Output bundle
```
{output_dir}/{artist-slug}_{title-slug}/
├── audio.wav      # 16-bit / 44.1kHz stereo, mastered to -14 LUFS
├── cover.jpg      # 3000×3000 sRGB JPEG, ≤10MB
├── metadata.csv   # DistroKid bulk-upload row
└── release.json   # Machine-readable sidecar (full metadata + songwriter splits)
```

The human uploader logs into DistroKid, creates a new release, fills the form by reading the CSV, and uploads audio.wav + cover.jpg. Once Spotify ingests (24–48h), claim the artist via Spotify for Artists and pitch the track to editorial playlists at least 7 days before the release date.
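For scripts that need to locate a finished bundle, the directory name can be derived from the artist and title. The exact slug rule fabric uses is not documented here, so the sketch below assumes a common one (lowercase, runs of non-alphanumerics collapsed to a hyphen); `slugify` and `bundle_dir` are hypothetical helpers.

```python
import re
from pathlib import Path

# File names taken from the bundle layout in this section.
BUNDLE_FILES = ("audio.wav", "cover.jpg", "metadata.csv", "release.json")


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def bundle_dir(output_dir: str, artist: str, title: str) -> Path:
    """Build the expected bundle directory path for a release."""
    return Path(output_dir) / f"{slugify(artist)}_{slugify(title)}"


def missing_files(bundle: Path) -> list:
    """List expected bundle files that are not present on disk."""
    return [name for name in BUNDLE_FILES if not (bundle / name).exists()]


path = bundle_dir("out", "Hollow Field", "Drift Past Midnight")
print(path.name)  # hollow-field_drift-past-midnight
```

If the real slug rule differs (for example in how it treats accented characters), only `slugify` would need to change.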
Hard truths
- Spotify counts a play after 30s, so tracks shorter than 30s earn $0 regardless of plays.
- Sub-1,000-plays-per-year tracks earn $0 (2024 monetization threshold). Long-tail catalog of duds is worthless; quality over quantity.
- Artificial-streaming penalties — distributors fine $10/track and Spotify can claw back payouts. Bot-padded plays trigger this even when the artist is innocent.
- Metadata is locked at release — changing title/artist/genre after publication breaks Spotify’s algorithmic placement. Validate before submitting.
- Without playlist placement, even a perfect track earns near-zero. Pitch via Spotify-for-Artists ≥7 days pre-release.
See plan 082 for the full set of legal, anti-fraud, operational, and discoverability considerations.