use case

Extract text and figures from papers your agent should actually read.

Frenchie parses PDF research papers into clean Markdown, pulling figures out as separate PNG files so your agent can cite specific content instead of summarizing the whole thing.

the problem

Why this is a pain.

You point your agent at a 20-page paper. It reads the PDF via native attachments, burns 30K tokens on the raw bytes, and still misses half the figures because they come back as low-res image data. You wanted to ask a specific question about Table 3 and Figure 5. The agent gives you a vague paragraph instead.

Native PDF reading in most agents handles text roughly and figures badly. Figures show up as inline image attachments, not standalone assets your agent can point at. Tables lose their structure. Equations often turn into garbled characters. You end up downloading the paper yourself and reading it the old way.

the workflow

How Frenchie handles it.

  1. Drop the paper into your agent.

  2. Your agent calls ocr_to_markdown via Frenchie. The extraction pipeline handles text, preserves table structure, and extracts figures as separate PNG files.

  3. The returned Markdown is clean text with inline references to figure files, like ![Figure 3](./figures/fig-3.png).

  4. Your agent can now answer specific questions about tables, cite figures by number, or pass individual figure files to a vision model for deeper analysis.
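Once the Markdown comes back, pulling the figure references out of it is a one-regex job. A minimal sketch — the sample Markdown and figure paths below are illustrative, not real Frenchie output:

```python
import re

# Illustrative sample of the Markdown returned in step 3.
markdown = """# 3. Model Architecture

The encoder stacks six identical layers.

![Figure 1](./figures/fig-1.png)

![Figure 2](./figures/fig-2.png)
"""

# Pull out (caption, path) pairs so the agent can cite a figure
# by number or hand the PNG to a vision model.
FIGURE_RE = re.compile(r"!\[([^\]]*)\]\(([^)]+)\)")
figures = FIGURE_RE.findall(markdown)

for caption, path in figures:
    print(f"{caption}: {path}")
```

The same pattern works for any standard Markdown image reference, so it doesn't depend on Frenchie's exact file naming.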

example

What it looks like in practice.

Your prompt

/ocr ./papers/attention-is-all-you-need.pdf

What's the architecture described in Figure 1, and how does the per-layer complexity compare to recurrent networks according to Table 1?

Agent response

Paper parsed — 16 pages, 3 figures extracted as PNGs, Markdown saved to .frenchie/attention-is-all-you-need/result.md.

**Figure 1**: The Transformer architecture — encoder on the left, decoder on the right. Each layer uses multi-head self-attention plus a position-wise feed-forward network. No recurrence, no convolution. Positional encodings are added to the input embeddings.

**Per-layer complexity (Table 1)**:
- Self-attention: O(n² · d) — quadratic in sequence length, linear in dimension.
- Recurrent: O(n · d²) — linear in sequence length, quadratic in dimension.
- For typical model dimensions (d > n), self-attention is actually cheaper per layer. The paper argues this flips when sequences get long enough, which is why later work explored sparse attention variants.
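The Table 1 comparison is easy to sanity-check numerically. A quick sketch using d = 512 (the base model's dimension) and an illustrative sequence length of n = 100; these numbers are for intuition, not a benchmark:

```python
def self_attention_ops(n: int, d: int) -> int:
    """Per-layer cost of self-attention: O(n^2 * d)."""
    return n * n * d

def recurrent_ops(n: int, d: int) -> int:
    """Per-layer cost of a recurrent layer: O(n * d^2)."""
    return n * d * d

n, d = 100, 512
print(self_attention_ops(n, d))  # 5,120,000
print(recurrent_ops(n, d))       # 26,214,400
```

With d > n, self-attention does about 5x fewer operations per layer here; push n past d and the comparison flips, which is the long-sequence regime the paper flags.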

What lands in your workspace

$frenchie ocr methodology.pdf

✓ 14 pages · 14 credits · saved → result.md

# Section 4 — Results

The collected dataset confirms the hypothesis across all three runs.

| Run | Accuracy | Latency |
| --- | -------- | ------- |
|  1  |   94.2%  |  118ms  |
|  2  |   95.0%  |  121ms  |
|  3  |   94.7%  |  119ms  |

*Figure 3.* Distribution shifts across the validation cohort.

tips

Things worth knowing.

  • Figures come back as PNG files, one per figure — not inline images in the Markdown. Your agent references them by filename.
  • Tables keep their cell structure as Markdown tables. Equations are preserved as LaTeX where detectable.
  • Cost is predictable: 1 credit per page. A typical 15-page conference paper runs $0.15. A 100-page thesis runs $1.
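The examples above imply 1 credit = $0.01, so estimating cost is one line. A sketch under that assumption:

```python
def ocr_cost_usd(pages: int) -> float:
    """1 credit per page; 1 credit = $0.01 (implied by the pricing examples)."""
    return pages / 100

print(ocr_cost_usd(15))   # typical conference paper
print(ocr_cost_usd(100))  # thesis
```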

questions

Common questions.

Does it work on scanned/older papers?

Yes. Scanned PDFs that come back empty from native agent readers usually work well through Frenchie: the OCR pipeline handles scanned text, old layouts, and unusual column structures.

What about papers with equations?

Equations come through as LaTeX where the source was typeset cleanly. For scanned equations, expect Markdown-style approximations with some symbol loss — still readable, not publication-grade.

Can I batch process a whole folder of papers?

Yes. Your agent scripts the loop — it calls ocr_to_markdown once per file, tracks job IDs, collects results. Frenchie handles parallelism on the server side.
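That loop can be sketched in a few lines. Here `ocr_to_markdown` is a stand-in stub, since the real tool-call shape and its return fields depend on your agent framework:

```python
from pathlib import Path

def ocr_to_markdown(pdf_path: Path) -> dict:
    """Stand-in stub for the real Frenchie tool call (illustrative only)."""
    return {
        "job_id": f"job-{pdf_path.stem}",
        "output": f".frenchie/{pdf_path.stem}/result.md",
    }

def batch_ocr(folder: str) -> dict[str, str]:
    """Submit every PDF in a folder; map each job ID to its result path."""
    jobs = {}
    for pdf in sorted(Path(folder).glob("*.pdf")):
        result = ocr_to_markdown(pdf)
        jobs[result["job_id"]] = result["output"]
    return jobs
```

The loop itself is sequential; per the answer above, parallelism happens server-side, so firing submissions one after another is fine.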

How big can a paper be?

2 GB per file. For research papers, that's effectively unlimited — a 2 GB PDF is usually thousands of pages.

Try it with a real file of yours.

100 free credits on signup. No card. Drop a PDF or image from your own workflow and see the Markdown your agent gets back.