the basics

General

Originally from the homepage.

What can my agent do with Frenchie?

Three things: OCR (read PDFs and images → Markdown), transcription (audio and video → Markdown), and image generation (text prompts → image files). All through one MCP interface. One install, three capabilities.

What file types do you support?

PDF, PNG, JPG, WebP for OCR. MP3, M4A, WAV, MP4, MOV, WEBM for transcription. Text prompts for image generation. If it's a file or a prompt, we probably handle it.

How good is the OCR?

Good enough to read scanned contracts with tables, handwritten notes, and that one PDF your client exported from a fax machine in 2003. Seriously — try the free credits and compare.

How much does image generation cost?

20 credits per generated image. Since $1 = 100 credits, that's $0.20 per image. Flat rate, no model math, no quality tier pricing. Your AI agent calls the generate_image MCP tool and the image lands in your working directory.

What image sizes and formats does generate_image return?

PNG, JPEG, or WebP files at the size your agent requests — 1024×1024, 1536×1024, 1024×1536, or auto. The file is auto-saved next to your work in stdio mode, or returned as a short-lived presigned URL for hosted agents. No watermark, no attribution requirement.

Can I pick the image model on Frenchie?

No. Frenchie ships one curated production image model and handles upgrades behind the scenes. If you need to pick between many models (FLUX, SDXL, SD3, niche fine-tunes), a platform like Replicate or Fal.ai is the better fit — we compare them honestly on /compare.

Can my agent read PDFs natively? Why do I need this?

It can — but scanned PDFs come back empty, and large ones eat your tokens alive. Frenchie parses outside your agent and returns clean text. Fewer tokens, better results.

What happens to my files and generated images?

Processed and deleted. We don't store your files or generated images after delivery. The result payload expires 30 minutes after first retrieval.

Is there a free tier?

100 credits on your first signup, once per email, no credit card required. That's 100 pages of OCR, 50 minutes of transcription, or 5 generated images — mix and match however you want. Enough to know if it works for you.

I can build this myself.

You absolutely can. If you're parsing more than 5,000 pages a month or generating hundreds of images a day, you probably should. For everyone else, that's days of setup and ongoing maintenance for something that costs a dollar.

What agents does it work with?

Any MCP client that can spawn a local stdio server. Tier-A auto-install supports Claude Code, Cursor, Codex, Antigravity, and Claude Desktop. Other MCP clients can be wired up manually with the same skill pack. Hosted/web agents (Lovable, Manus, Claude.ai, ChatGPT.com) connect to https://mcp.getfrenchie.dev over HTTP instead.

What's coming next?

We keep the public promise tight: OCR for PDFs and images, transcription for audio and video, image generation from text prompts — all delivered over MCP. When we ship something new, it lands in the changelog first.

Does Frenchie work with non-English content?

Yes. OCR, transcription, and image generation all handle Thai, Japanese, Chinese, Arabic, French, Spanish, and 50+ other languages out of the box. Mixed-language documents and multilingual prompts work too. No flags, no config — just send it.

How does Frenchie compare to Marker, LlamaParse, Whisper, or Replicate?

They're all good tools. Marker is a library you run yourself. LlamaParse is a hosted parsing API built around the LlamaIndex stack. Whisper is a model you wire up yourself. Replicate is a model hub where you pick and glue the models. Frenchie is an MCP server: your agent calls it directly for OCR, transcription, and image generation, with no glue code and no ops. If you're running heavy volume, DIY wins on cost. If you want it to just work inside Claude Code or Cursor tomorrow, that's us.

What's the maximum file size?

2 GB per file. For PDFs that's thousands of pages. For audio and video that covers most recordings. Image generation has no file-size limit — it's prompt-in, image-out. If you need more, email support@getfrenchie.dev — we'll figure it out.
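
A pre-flight check along these lines can catch unsupported or oversized files before your agent spends a tool call. This is a sketch, not part of Frenchie: the extension lists and the 2 GB cap come from the answers above, while the function name, the `.jpeg` alias, and reading "2 GB" as 2 × 1024³ bytes are assumptions.

```python
from pathlib import Path

# Extensions from the supported-formats answer; ".jpeg" is an assumed alias for JPG.
OCR_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp"}
TRANSCRIPTION_EXTENSIONS = {".mp3", ".m4a", ".wav", ".mp4", ".mov", ".webm"}

# Assumption: "2 GB" is read as 2 GiB; the service may count decimal gigabytes instead.
MAX_FILE_BYTES = 2 * 1024**3


def classify_file(name: str, size_bytes: int) -> str:
    """Return "ocr" or "transcription", or raise ValueError if the file can't be sent."""
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"{name} exceeds the 2 GB per-file limit")
    suffix = Path(name).suffix.lower()
    if suffix in OCR_EXTENSIONS:
        return "ocr"
    if suffix in TRANSCRIPTION_EXTENSIONS:
        return "transcription"
    raise ValueError(f"{name} has an unsupported extension: {suffix or '(none)'}")
```

An agent could call `classify_file(path, os.path.getsize(path))` and route the file to the matching tool, or report the error to the user instead.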

Is Frenchie open source? Can I self-host?

No on both. Frenchie is a managed service — you send files or prompts, we return Markdown or images. No infrastructure to run, no models to host. The MCP integration code on your side is fully in your hands; the extraction and generation pipelines stay with us.

How long does a job take?

OCR is usually a few seconds per page. Transcription runs around a tenth of the audio length — a 30-minute meeting lands in about three minutes. Image generation is typically 20–60 seconds per image. Larger jobs process async: your agent kicks off the job, keeps working, and collects the result when it's ready.
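
For planning, the transcription rule of thumb above reduces to one division. A minimal sketch: the function name and the 10x default are illustrative, and real processing time will vary from job to job.

```python
def estimate_transcription_minutes(audio_minutes: float, speedup: float = 10.0) -> float:
    """Rough processing time, assuming jobs run at about `speedup` times real time."""
    if audio_minutes < 0 or speedup <= 0:
        raise ValueError("audio_minutes must be >= 0 and speedup must be > 0")
    return audio_minutes / speedup


print(estimate_transcription_minutes(30))  # 3.0, matching the 30-minute meeting example
```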

money

Pricing & billing

Originally from the pricing page.

How do I pay?

Top up from the dashboard with any major card — Visa, Mastercard, Amex. Minimum $1. You only pay for what you use.

Do I get a receipt or invoice?

Yes. Every top-up generates a receipt emailed to your account address. The dashboard keeps a full history you can export any time.

Do credits expire?

No. Top up once, use whenever. A credit you bought today still works a year from now.

How much does each capability cost?

One credit per OCR page ($0.01), 2 credits per transcription minute ($0.02), and 20 credits per generated image ($0.20). Same credit pool across all three — your AI agent can mix OCR, transcription, and image generation however it needs, billed from the same balance.
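
The shared credit pool makes mixed workloads easy to budget. The sketch below uses the rates quoted above; the function names are illustrative, and how partial transcription minutes are rounded isn't stated here, so the example takes whole minutes only.

```python
CREDITS_PER_DOLLAR = 100
CREDITS_PER_OCR_PAGE = 1
CREDITS_PER_TRANSCRIPTION_MINUTE = 2
CREDITS_PER_IMAGE = 20


def estimate_credits(ocr_pages: int = 0, transcription_minutes: int = 0, images: int = 0) -> int:
    """Total credits for a mixed job, using integer math to avoid float rounding."""
    return (
        ocr_pages * CREDITS_PER_OCR_PAGE
        + transcription_minutes * CREDITS_PER_TRANSCRIPTION_MINUTE
        + images * CREDITS_PER_IMAGE
    )


def credits_to_dollars(amount: int) -> str:
    """Format a credit amount as dollars at $1 = 100 credits."""
    return f"${amount // CREDITS_PER_DOLLAR}.{amount % CREDITS_PER_DOLLAR:02d}"
```

For example, 40 OCR pages, 20 transcription minutes, and one image come to 40 + 40 + 20 = 100 credits, or $1.00.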

How much does image generation cost on Frenchie?

20 credits per image, which is $0.20 per generated image at the flat $1 = 100 credits rate. No model-tier pricing, no quality surcharges — whatever prompt your agent sends, it's 20 credits.

What happens if a job fails?

Failed jobs don't bill. If an OCR, transcription, or image generation job fails partway through, the credits are refunded to your balance automatically — no support ticket needed.

Can I refund unused credits?

Credits are non-refundable once purchased. That's why we ship 100 free credits on signup — try before you top up. See /terms for the full policy.

Processing thousands of pages a month?

Email support@getfrenchie.dev with a rough volume estimate. We'll figure out the right setup for your workload — we're small enough to actually reply.

Is there a free tier?

Yes. 100 credits on your first signup, once per email, no card required. That's 100 OCR pages, 50 transcription minutes, or 5 generated images — mix and match however you want.

comparison

Frenchie vs Marker

Originally from the Marker comparison page.

Is Frenchie cheaper than running Marker?

At small-to-medium volume, Frenchie is cheaper once you count your time — you don't run infrastructure. At high volume (thousands of pages a day, predictable load), a well-tuned Marker deployment wins on unit economics.

Can I switch from Marker to Frenchie?

Yes. Frenchie has no lock-in on your side — you call an MCP tool, you get Markdown. If you later want to move off, your code just calls a different tool. We don't hold onto files or training data.

Does Frenchie give me the same structured output as Marker?

Close to it. Frenchie returns clean Markdown with preserved table structure, figures extracted as PNGs, and page breaks. For most downstream workflows (agents, RAG indexes, human review) the two are interchangeable on output shape.

Why not just use Marker?

Run it if your workflow is batch and you want full control. Call Frenchie if your workflow is agent-driven and you want zero ops. Those are different shapes — pick the one that matches yours.

comparison

Frenchie vs LlamaParse

Originally from the LlamaParse comparison page.

Is Frenchie cheaper than LlamaParse?

Frenchie is flat: $0.01 per OCR page. LlamaParse varies by accuracy tier. At base tier, pricing is in the same ballpark. At higher tiers, LlamaParse is more expensive per page but offers extraction features Frenchie doesn't.

Can I use Frenchie inside a LlamaIndex app?

Yes. Call Frenchie via its MCP tool or HTTP endpoint, then feed the Markdown into your existing LlamaIndex indexer. No custom loader required.

Does Frenchie do structured extraction like LlamaParse?

No. Frenchie returns clean Markdown. Structured extraction — pulling specific fields, applying schemas — is something your agent handles after reading the Markdown, using the LLM it already has.
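
Not every extraction step needs the LLM. When the field you want sits in a table, plain post-processing of the Markdown can be enough. The sketch below assumes the returned Markdown uses GitHub-style pipe tables, which isn't specified here, and the function name is illustrative. It does not handle escaped pipes inside cells.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Parse the first pipe table found in `markdown` into a list of row dicts."""
    rows: list[list[str]] = []
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("|") and line.endswith("|"):
            # Drop the outer pipes, then split the remaining text into cells.
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
        elif rows:
            break  # the first table has ended
    if len(rows) < 2:
        return []
    # rows[0] is the header; rows[1] is assumed to be the |---|---| separator.
    header, body = rows[0], rows[2:]
    return [dict(zip(header, row)) for row in body]
```

The resulting dictionaries can then be mapped onto whatever schema the downstream system expects.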

Why not just use LlamaParse?

Use it if your stack is LlamaIndex and you want parsing + retrieval coupled. Use Frenchie if your stack is MCP and you want parsing as a standalone tool call.

comparison

Frenchie vs Whisper

Originally from the Whisper comparison page.

Is Frenchie cheaper than running Whisper?

At low volume, Frenchie is cheaper — you don't run hardware. At high sustained volume, a well-tuned self-hosted Whisper wins on unit cost but requires ops investment.

Can I switch from Whisper to Frenchie?

Yes. You call an MCP tool or HTTP endpoint, you get Markdown. No lock-in on our side — files are deleted after delivery, no training data kept.

Does Frenchie handle long audio the way Whisper does?

Yes. Frenchie chunks long audio automatically, transcribes the chunks in parallel, and merges the result. The job runs async, so your agent doesn't block while it runs. Max 2 GB per file.

Why not just use Whisper?

Run it if you have GPUs and ops appetite. Call Frenchie if you want transcription as a zero-ops tool your agent can invoke directly.

comparison

Frenchie vs AssemblyAI

Originally from the AssemblyAI comparison page.

Does Frenchie have speaker diarization?

No. Frenchie delivers a single Markdown transcript. If your product needs labeled speakers, AssemblyAI or a similar audio-intelligence API is the right tool.

Is Frenchie cheaper than AssemblyAI?

At the base transcription rate, Frenchie is cheaper per minute. Once you turn on audio intelligence add-ons (sentiment, entities, moderation), AssemblyAI adds per-feature fees. Compare apples to apples on the features you actually use.

Can I get summaries from Frenchie?

Not directly — Frenchie returns Markdown, your agent summarizes. Since you're already paying an LLM for the agent, summarization is a prompt away.

Why not just use AssemblyAI?

Use AssemblyAI when audio intelligence is the product. Use Frenchie when clean transcription is a step inside a larger agent conversation.

comparison

Frenchie vs Deepgram

Originally from the Deepgram comparison page.

Does Frenchie do realtime transcription?

No. Frenchie is batch async — you submit a file, the job runs in the background, and the Markdown comes back when it's ready. For live streaming, Deepgram (or a similar streaming API) is the right choice.

Is Frenchie cheaper than Deepgram?

At Deepgram's standard per-minute rates, Frenchie is in a similar ballpark — the real difference is commit structure and add-ons. Frenchie is flat with no minimums; Deepgram gets cheaper at enterprise commit tiers.

Can I use Deepgram for live calls and Frenchie for recordings?

Yes, and it's a clean split. Stream the live call through Deepgram, save the recording, hand it to Frenchie for the agent-facing Markdown transcript after the call.

Why not just use Deepgram?

Use Deepgram when latency and streaming matter. Use Frenchie when the workflow is batch and your agent is the caller.

comparison

Frenchie vs Replicate

Originally from the Replicate comparison page.

Can my agent call Replicate through Frenchie?

No. Frenchie is its own MCP tool — call `generate_image` and you get an image back. If you want Replicate-specific models, use Replicate's API directly (or wrap it in your own MCP server).

Is Frenchie cheaper than Replicate?

Frenchie is flat: $0.20 per image. Replicate varies by model and GPU-seconds. For a fast model on a cheap GPU, Replicate can be cheaper per image; for a premium model on an H100, Frenchie is often cheaper and simpler.
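
The flat-versus-metered trade-off comes down to one multiplication. A minimal sketch: the $0.20 figure is Frenchie's rate from above, while the GPU-second prices and runtimes you feed in are your own inputs, and the ones in the usage note are hypothetical, not Replicate's actual prices.

```python
from decimal import Decimal

FLAT_PRICE_PER_IMAGE = Decimal("0.20")  # Frenchie's flat rate


def metered_cost_per_image(price_per_gpu_second: Decimal, seconds_per_image: Decimal) -> Decimal:
    """Per-image cost on a platform that bills by GPU-second."""
    return price_per_gpu_second * seconds_per_image


def cheaper_option(price_per_gpu_second: Decimal, seconds_per_image: Decimal) -> str:
    """Return "metered", "flat", or "tie" for one image at the given rate and runtime."""
    cost = metered_cost_per_image(price_per_gpu_second, seconds_per_image)
    if cost < FLAT_PRICE_PER_IMAGE:
        return "metered"
    if cost > FLAT_PRICE_PER_IMAGE:
        return "flat"
    return "tie"
```

With a hypothetical $0.001 per GPU-second and a 10-second run, the metered cost is $0.01 and metered wins. At a hypothetical $0.01 per GPU-second and a 30-second run, it is $0.30 and the flat rate wins.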

Can I pick the image model on Frenchie?

No. Frenchie ships one production image model and handles upgrades behind the scenes. If model choice is the feature you're building on, Replicate is the better fit.

Why not just use Replicate?

Use Replicate when model variety is a feature of your product and you have engineering time to integrate. Use Frenchie when you want your agent to generate an image as a single MCP tool call with zero setup.

comparison

Frenchie vs Fal.ai

Originally from the Fal.ai comparison page.

Is Frenchie as fast as Fal.ai?

No. Fal is latency-optimized; Frenchie isn't. A Frenchie image typically comes back in 20–60 seconds, which is fine for agent workflows but not competitive with Fal for real-time consumer UX.

Is Frenchie cheaper than Fal.ai?

Frenchie is flat $0.20 per image. Fal varies by model and tier — some models are cheaper, premium models cost more. Compare on your exact model before deciding.

Can I use both Fal and Frenchie?

Yes — they're complementary. Fal for your consumer product's real-time generation, Frenchie for your agent's ad-hoc image calls inside Claude Code, Cursor, etc.

Why not just use Fal.ai?

Use Fal when inference latency is a user-facing feature. Use Frenchie when you want your agent to generate images as part of a broader tool-calling workflow with zero integration code.

use case

Meeting transcription for AI agents

Originally from the Meeting transcription for AI agents page.

What audio formats work?

MP3, M4A, WAV, MP4, MOV, WEBM. If it plays, Frenchie can transcribe it. Max 2 GB per file.

Does it handle speaker labels?

No. Frenchie returns a single Markdown transcript. Your agent can infer speakers from context, such as names mentioned in the conversation. For enforced speaker diarization, use a dedicated transcription API.

What about non-English meetings?

Thai, Japanese, Chinese, Arabic, French, Spanish, and 50+ other languages work out of the box. Mixed-language meetings work too — no flags needed.

Can I transcribe Zoom / Google Meet recordings?

Yes. Download the recording and hand the file to your agent. Any of the supported formats (MP3, M4A, WAV, MP4, MOV, WEBM) works.

use case

Sales call transcription inside your CRM workflow

Originally from the Sales call transcription inside your CRM workflow page.

How is this different from using Gong or Chorus?

Gong and Chorus are full sales-intelligence platforms with recording, scoring, and management dashboards. Frenchie is just the transcription layer — you plug the Markdown into whatever your agent already does (prep, follow-up, CRM updates). Complementary for some teams, replacement for others.

Can I pipe transcripts straight into Salesforce / HubSpot?

Yes, your agent does the piping. Frenchie returns Markdown, and your agent uses your CRM's API to attach the transcript to the opportunity, extract fields into custom properties, or do whatever else your workflow needs.

Is the recording stored anywhere?

No. Frenchie processes the file and deletes it. The transcript result expires 30 minutes after first delivery. Save the Markdown where you need it long-term — we don't keep copies.

Does it work for international calls?

Yes. 50+ languages, mixed-language calls work too. If your team sells globally, you don't need separate setups per region.

use case

Podcast transcription for show notes, SEO, and accessibility

Originally from the Podcast transcription for show notes, SEO, and accessibility page.

Can I transcribe episodes with multiple hosts?

Yes. Frenchie returns a single transcript — your agent can usually guess speaker boundaries from content. For strict speaker labels, use a dedicated transcription API and feed its output into your agent instead.

What about episodes with music and sound effects?

Frenchie transcribes the spoken content and ignores non-speech audio. A stinger or sound effect mid-dialogue won't show up in the transcript, which is usually what you want.

How long does a 90-minute episode take to transcribe?

Roughly 8–10 minutes of processing, consistent with transcription running at about a tenth of the audio length. The job runs async, so your agent can keep generating show notes for earlier episodes while the new one processes.

Can I automate this with a script?

Yes, either via the MCP tool (if your scheduler runs in an MCP-compatible agent) or via the HTTP endpoint at https://mcp.getfrenchie.dev. Cron the export, drop the file, and your agent does the rest.

use case

Research paper parsing — text, tables, and figures

Originally from the Research paper parsing — text, tables, and figures page.

Does it work on scanned/older papers?

Yes. Scanned PDFs that come back empty from an agent's native reader usually work well through Frenchie. The OCR pipeline handles scanned text, old layouts, and unusual column structures.

What about papers with equations?

Equations come through as LaTeX where the source was typeset cleanly. For scanned equations, expect Markdown-style approximations with some symbol loss — still readable, not publication-grade.

Can I batch process a whole folder of papers?

Yes. Your agent scripts the loop — it calls ocr_to_markdown once per file, tracks job IDs, collects results. Frenchie handles parallelism on the server side.
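
The loop described above can be sketched as submit-all, then poll. Everything about the call shape here is an assumption: `submit` and `fetch` are hypothetical stand-ins for however your agent invokes `ocr_to_markdown` and retrieves a finished job, since the tool's real parameters and result mechanism aren't documented on this page.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def batch_ocr(
    paths: Iterable[str],
    submit: Callable[[str], str],           # hypothetical: path -> job ID
    fetch: Callable[[str], Optional[str]],  # hypothetical: job ID -> Markdown, or None if not ready
    poll_seconds: float = 2.0,
    max_polls: int = 100,
) -> Dict[str, str]:
    """Submit every file up front, then poll until each job returns Markdown."""
    pending = {submit(path): path for path in paths}  # job ID -> source path
    results: Dict[str, str] = {}
    for _ in range(max_polls):
        for job_id in list(pending):
            markdown = fetch(job_id)
            if markdown is not None:
                results[pending.pop(job_id)] = markdown
        if not pending:
            break
        time.sleep(poll_seconds)
    return results  # files still pending after max_polls are left out
```

Submitting everything before polling lets the server-side parallelism do its work. Since results expire 30 minutes after first retrieval, write each Markdown result to disk as soon as it arrives.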

How big can a paper be?

2 GB per file. For research papers, that's effectively unlimited — a 2 GB PDF is usually thousands of pages.

use case

Invoice OCR and receipt extraction for bookkeeping agents

Originally from the Invoice OCR and receipt extraction for bookkeeping agents page.

Does it work on photos of receipts?

Yes. JPG, PNG, and WebP all work. Skewed, wrinkled, or low-light photos are still readable in most cases, though accuracy drops on badly damaged or very dark images.

Can I extract to a specific accounting schema?

Your agent does the schema mapping. Frenchie returns clean text; the agent pulls named fields via prompt. Works with any accounting tool that has a documented API — Xero, QuickBooks, FreshBooks, Wave.

What about multi-currency invoices?

Currencies come through as they appear in the invoice. Your agent handles conversion if your accounting tool needs it.

How does this compare to dedicated invoice-capture tools?

Dedicated tools like Dext or AutoEntry map invoices onto their own fixed schema and are great if they support your vendor templates. Frenchie is more flexible: any invoice shape works, but your agent does the extraction logic. Use Frenchie if your vendors are weird or your schema changes often.

use case

Handwritten notes to searchable Markdown

Originally from the Handwritten notes to searchable Markdown page.

How accurate is handwriting recognition?

For neat print, very accurate. For cursive or abbreviated scrawl, usable but not perfect. Treat the output as a first-pass digital copy, not a verbatim transcript.

Can I OCR diagrams and sketches?

Frenchie extracts handwritten text. Sketches and diagrams come back as extracted figures (PNG files). Your agent can describe them via its vision capability or you can view them directly.

Does it handle multiple languages?

Yes — Thai, Japanese, Chinese, Arabic, French, Spanish, and many others. Mixed-language notes work.

Can I make my whole notebook searchable?

Yes — your agent takes the Markdown and drops it into whatever search system you use (Obsidian, Notion, a local ripgrep workflow). Frenchie is the digitization step; indexing is downstream.

use case

Product mockups for AI agents

Originally from the Product mockups for AI agents page.

What image sizes and formats do you support?

Standard 1024×1024 PNG by default. Other sizes and aspect ratios are supported — your agent passes them as parameters on the generate_image call.

Can I iterate on a prompt?

Yes — your agent can call generate_image repeatedly with refined prompts. Each call is 20 credits. Frenchie doesn't store prompts or images beyond 30 minutes after delivery.

Is it safe for client work?

Generated images are deleted from our storage 30 minutes after your agent retrieves them. We don't train on your prompts. Check your downstream licensing for the specific provider behind the generation.

use case

Blog covers and social images for AI agents

Originally from the Blog covers and social images for AI agents page.

Can I match my brand style?

Yes — include brand colors, typography cues, and composition in the prompt. Image generation responds to specific style direction better than vague aesthetic adjectives.

Can my agent generate social variants (landscape, square, vertical)?

Yes. Pass the size as a parameter on generate_image: 1536×1024 for landscape, 1024×1024 for square, 1024×1536 for vertical. Your agent can loop through a set of sizes for one post.
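
A loop over sizes might look like the sketch below. The three sizes are the ones listed for generate_image in the General section. The argument names `prompt` and `size` and the `WIDTHxHEIGHT` string format are assumptions, so check the tool's actual schema in your MCP client before relying on them.

```python
CREDITS_PER_IMAGE = 20

# Sizes from the generate_image answer; the "WIDTHxHEIGHT" string format is assumed.
SOCIAL_SIZES = {
    "square": "1024x1024",
    "landscape": "1536x1024",
    "vertical": "1024x1536",
}


def build_variant_requests(prompt: str, variants=("landscape", "square", "vertical")) -> list[dict[str, str]]:
    """Build one hypothetical generate_image argument dict per requested variant."""
    return [{"prompt": prompt, "size": SOCIAL_SIZES[name]} for name in variants]


requests = build_variant_requests("Flat illustration of a bulldog reading a contract")
total_credits = len(requests) * CREDITS_PER_IMAGE  # 3 variants x 20 credits = 60
```

Each dict would be passed as the arguments of a separate generate_image call, so a three-variant post costs 60 credits.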

Does this work for OpenGraph and blog platforms?

Yes — the image saves as a standard PNG file. Your agent can drop the path into your frontmatter, upload it to your CMS, or reference it inline in Markdown.