use case
Turn meetings into Markdown your agent can read.
Drop a recording. Get a clean transcript back. Let your agent summarize, extract decisions, or answer follow-ups directly.
the problem
Why this is a pain.
A 45-minute meeting recording lands in your agent's context. Native transcription isn't available — the agent sees a file it can't read. If you hand it raw audio bytes, you blow the context window before the first word is transcribed. If you transcribe it yourself with a DIY model, you're now maintaining a GPU pipeline to save 20 minutes of listening time.
Generic transcription tools return plain text that loses paragraph structure. You paste it into your agent and the agent has to guess where speaker turns happen, where topics shift, where the decisions landed. That guessing costs tokens and quality.
the workflow
How Frenchie handles it.
1. Drop the recording into your agent: drag a file in, paste a path, or run `/transcribe ./standup.mp3` in a Tier-A client like Claude Code or Codex. The Frenchie skill pack picks up the request and resolves the file path against your project directory, with no manual upload step.
2. Your agent calls `transcribe_to_markdown` with the resolved path. Frenchie inspects the file, estimates credits (2 per minute), confirms your balance, and routes between the sync and async paths automatically based on duration: meetings under ~10 minutes complete inline, and longer ones return a `jobId` immediately.
3. For long audio, Frenchie chunks the file at silence boundaries, transcribes the chunks in parallel, then stitches the result back together. Your agent isn't blocked: the smart-wait loop polls for up to 90 seconds, then hands control back so you can keep working in the same session.
4. When the job completes, the Markdown lands at `.frenchie/<filename>/result.md` next to the source file (stdio mode) or inline in the tool response (HTTP mode). Paragraph breaks are preserved so your agent can summarize, extract decisions, or answer follow-ups without re-reading raw audio.
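The credit estimate, the sync/async routing, and the smart-wait loop from steps 2 and 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Frenchie's implementation: the function names, the status strings, the 2-second poll interval, and rounding partial minutes up are all hypothetical. Only the 2-credits-per-minute rate, the roughly 10-minute cutoff, and the 90-second wait budget come from the description above.

```python
import math
import time
from typing import Callable

CREDITS_PER_MINUTE = 2     # rate stated in the workflow
SYNC_CUTOFF_MINUTES = 10   # "~10 minutes" in the text; the exact threshold is an assumption
SMART_WAIT_SECONDS = 90    # polling budget before control returns to the user


def estimate_credits(duration_seconds: float) -> int:
    """Estimate the cost, assuming partial minutes round up (an assumption)."""
    return math.ceil(duration_seconds / 60) * CREDITS_PER_MINUTE


def choose_path(duration_seconds: float) -> str:
    """Short recordings complete inline; longer ones become async jobs."""
    return "sync" if duration_seconds < SYNC_CUTOFF_MINUTES * 60 else "async"


def smart_wait(
    get_status: Callable[[], str],
    budget_seconds: float = SMART_WAIT_SECONDS,
    interval_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a hypothetical status callback until it reports "done" or the budget runs out.

    Returns "done" if the job finished in time, otherwise "pending" so the
    caller can hand control back and check again later.
    """
    deadline = clock() + budget_seconds
    while clock() < deadline:
        if get_status() == "done":
            return "done"
        sleep(interval_seconds)
    return "pending"
```

Under these assumptions, a 32-minute standup gives `estimate_credits(32 * 60) == 64` and `choose_path(32 * 60) == "async"`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.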
example
What it looks like in practice.
Your prompt
/transcribe ./standups/2026-04-16-standup.mp3
Then pull out every decision someone committed to, grouped by person.
Agent response
Transcribed 32-minute standup; clean Markdown saved to .frenchie/2026-04-16-standup/result.md.
Commitments:
- **Priya**: Push the billing migration to staging by Thursday.
- **Kai**: Own the spec for the Antigravity install flow, ready for review Monday.
- **Lee**: Review the PR for changelog automation before EOW.
- **Mia**: Draft the Q2 OKR rollup by Wednesday standup.
What lands in your workspace
$ frenchie transcribe customer-call.mp3
# Transcript

[00:00:12] Alex
Welcome, let's walk through what shipped last week.

[00:00:48] Sam
Auth milestone landed Monday, dashboard rewrite went out Thursday.

[00:01:24] Alex
Any blockers heading into next sprint?
tips
Things worth knowing.
- Longer recordings run as async jobs, so your agent doesn't block. For a 30-minute meeting, expect roughly 3 minutes of processing time.
- Transcription is 2 credits per minute. A one-hour meeting is 120 credits ($1.20, which implies $0.01 per credit), so a one-hour weekly standup all year is about 52 × $1.20 = $62.40.
- The result expires 30 minutes after first delivery. If your agent needs to revisit the Markdown later, ask for a fresh transcription or save the result to your own workspace.
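The yearly cost figure in the tips above can be reproduced with a short calculation. The sketch assumes a price of $0.01 per credit, which is implied by the $1.20-per-hour figure rather than stated outright; the function name is hypothetical.

```python
from decimal import Decimal

CREDITS_PER_MINUTE = 2
PRICE_PER_CREDIT = Decimal("0.01")  # assumption: implied by $1.20 for a 60-minute meeting


def yearly_cost(minutes_per_meeting: int, meetings_per_year: int) -> Decimal:
    """Return the yearly transcription cost in dollars."""
    credits_per_meeting = minutes_per_meeting * CREDITS_PER_MINUTE
    return credits_per_meeting * PRICE_PER_CREDIT * meetings_per_year


print(yearly_cost(60, 52))  # 52 one-hour meetings: prints 62.40
```

`Decimal` is used instead of `float` so the dollar amounts stay exact.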
questions
Common questions.
What audio formats work?
MP3, M4A, WAV, MP4, MOV, WEBM. If it plays, Frenchie can transcribe it. Max 2 GB per file.
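A pre-flight check along these lines could catch an unsupported file before it costs a tool call. It is only a sketch: the helper name is hypothetical, matching on the file extension is a simplification of whatever inspection Frenchie performs, and reading "2 GB" as 2 × 1024³ bytes is an assumption.

```python
from pathlib import Path
from typing import List, Optional

SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".mp4", ".mov", ".webm"}
MAX_BYTES = 2 * 1024**3  # assumption: "2 GB" interpreted as 2 GiB


def check_recording(path: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    size_bytes can be passed explicitly; otherwise the size is read from disk.
    """
    p = Path(path)
    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if size_bytes is None:
        size_bytes = p.stat().st_size
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 2 GB")
    return problems
```

For example, `check_recording("standup.MP3", 1000)` returns an empty list, while `check_recording("notes.txt", 10)` reports the unsupported extension.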
Does it handle speaker labels?
No — Frenchie returns a single Markdown transcript. Your agent can infer speakers from context, or you can ask it to guess from names mentioned. For enforced speaker diarization, use a dedicated transcription API.
What about non-English meetings?
Thai, Japanese, Chinese, Arabic, French, Spanish, and 50+ other languages work out of the box. Mixed-language meetings work too — no flags needed.
Can I transcribe Zoom / Google Meet recordings?
Yes. Download the recording and drop it into your agent. Both services export recordings in formats on the supported list, such as MP4, so no conversion is needed.
Try it with a real recording of yours.
100 free credits on signup. No card. Drop an audio or video file from your own workflow and see the transcript your agent gets back.