audio.transcribe

Operation: audio.transcribe
Category: media
Tags: audio, whisper, transcription, speech-to-text

Transcribes audio or video to text with word-level timestamps using Whisper.

Type: Native (built-in)
Timeout: 600s
Retries: 2 (ExponentialWithJitter)
Memory: 4096 MB

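Inputs

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| audio_url | Asset | No | | URL or file:// path to an audio file |
| audio_path | String | No | | Local filesystem path to an audio file |
| video_path | String | No | | Local path to a video file (audio will be extracted) |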
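
Since all three inputs are optional, a caller has to decide which one to populate. The sketch below shows one way to route a source string to an input field. The `toTranscribeInput` helper, the `TranscribeInput` type, and the list of video extensions are hypothetical and not part of the SDK; it also assumes that exactly one input should be set per call, which the table above does not state.

```typescript
// Hypothetical helper: chooses one of the three documented input fields.
// Assumption: exactly one input is supplied per call.
type TranscribeInput =
  | { audio_url: string }
  | { audio_path: string }
  | { video_path: string };

// Illustrative list only; the formats the operation accepts are not documented here.
const VIDEO_EXTENSIONS = [".mp4", ".mov", ".mkv", ".webm"];

function toTranscribeInput(source: string): TranscribeInput {
  const lower = source.toLowerCase();

  // http(s):// and file:// sources map to audio_url, per its description.
  if (/^(https?|file):\/\//.test(lower)) {
    return { audio_url: source };
  }

  // Local paths with a video extension map to video_path.
  if (VIDEO_EXTENSIONS.some((ext) => lower.endsWith(ext))) {
    return { video_path: source };
  }

  // Everything else is treated as a local audio file.
  return { audio_path: source };
}
```

For example, `toTranscribeInput("recordings/standup.wav")` yields an object with only `audio_path` set.
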

Outputs

| Name | Type | Description |
|------|------|-------------|
| text | String | Full transcript text |
| segments | Array<JSON> | Array of {start, end, text} segment objects |
| words | Array<JSON> | Array of {start, end, word} word-level timestamp objects |
| language | String | Detected language code |
| duration_secs | Number | Audio duration in seconds |

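
A common use of the `segments` output is generating subtitles. The sketch below converts {start, end, text} segments into SRT text. It assumes `start` and `end` are numbers expressed in seconds, which the output description does not confirm; the `Segment` interface and both function names are illustrative rather than SDK types.

```typescript
// Assumed shape of one entry in the `segments` output (times in seconds).
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Formats a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Builds SRT cues numbered from 1, separated by a blank line.
function segmentsToSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

The same approach should work with the `words` output for word-level captions, reading the `word` field in place of `text`.
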
{
  "model": "large-v3",
  "prefer_local": true,
  "word_timestamps": true
}

import { WorkflowBuilder } from "@fabric-platform/sdk";

const workflow = new WorkflowBuilder("my-workflow")
  .node("audio-transcribe", "tool", (n) =>
    n.config({
      operation: "audio.transcribe",
      // ... node-specific config
    })
  )
  .build();