use case

Read contracts without the copy-paste tax.

Frenchie parses legal PDFs — scanned or native — into clean Markdown so your agent can flag clauses, extract dates, and diff redlines without you scrubbing formatting by hand.

the problem

Why this is a pain.

You get a 38-page master services agreement as a scanned PDF. Your agent can read maybe half of it through native attachment. You need to flag unusual liability language, extract payment terms, and compare it against your standard template. Doing that on malformed PDF text means your agent misses clauses, confuses sections, and gives you back a summary you can't trust.

Most contracts arrive as scanned PDFs, photocopies of signed originals, or exports from systems that flatten the structure. On files like these, native PDF reading often returns empty text or garbled tables. You can't review what your agent can't read.

the workflow

How Frenchie handles it.

  1. Drop the contract PDF (scanned or native) into your agent.

  2. Your agent calls ocr_to_markdown via Frenchie. The pipeline handles scanned documents, preserves section numbering, and keeps table structure intact for exhibits and schedules.

  3. Clean Markdown lands in your agent's context: every section, every clause, every exhibit, addressable by heading.

  4. Your agent now reviews clauses, extracts dates, compares against your template, and flags risks clause by clause.
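To make "addressable by heading" concrete, here is a minimal Python sketch of looking up a clause by its heading once the Markdown is in hand. The function name `index_sections` and the sample text are hypothetical, and the exact heading format Frenchie emits is an assumption; this only illustrates the idea.

```python
import re

# ATX-style Markdown heading: one to six '#' characters, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def index_sections(markdown: str) -> dict[str, str]:
    """Map each heading title to the text beneath it, up to the next heading."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


# Hypothetical excerpt; real output depends on the contract.
sample = """# 9. Limitation of Liability

## 9.2 Cap

Liability is capped at 12 months of fees.

# 10. Indemnification

## 10.4 Scope

Provider indemnifies Customer for IP infringement.
"""

sections = index_sections(sample)
print(sections["9.2 Cap"])
```

This sketch treats any line starting with `#` as a heading, so it would misread `#` lines inside fenced code blocks. Contracts rarely contain those, but a full Markdown parser is the safer choice if yours do.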

example

What it looks like in practice.

Your prompt

/ocr ./contracts/acme-msa-draft-v3.pdf

Flag any liability caps, indemnification language, or IP assignment clauses that differ from our standard template. Extract the payment terms and renewal mechanics.

Agent response

Contract parsed — 38 pages, 4 exhibits, saved to .frenchie/acme-msa-draft-v3/result.md.

**Flags vs template**:
- **Liability cap (Section 9.2)**: Capped at 12 months of fees. Our standard is 24 months. Worth pushing back.
- **IP assignment (Section 11.1)**: Assigns foreground IP to customer. Our standard reserves foreground IP for us and grants a license. Material difference.
- **Indemnification (Section 10.4)**: Customer-favorable — we indemnify them for IP infringement and confidentiality breaches. Standard structure otherwise.

**Payment terms and renewal**: Net 60, annual commit, auto-renewal with 90-day notice to cancel, price increase capped at CPI + 3% per year.

**Risks**: 12-month liability cap is below standard. IP assignment in Section 11.1 is a real concern — flag to legal before counter-signing.

What lands in your workspace

$ frenchie ocr methodology.pdf

✓ 14 pages · 14 credits · saved → result.md

# Section 4 — Results

The collected dataset confirms
the hypothesis across all three
runs.

| Run | Accuracy | Latency |
| --- | -------- | ------- |
|  1  |   94.2%  |  118ms  |
|  2  |   95.0%  |  121ms  |
|  3  |   94.7%  |  119ms  |

*Figure 3.* Distribution shifts
across the validation cohort.

tips

Things worth knowing.

  • Scanned contracts often have lower OCR confidence on handwritten initials, signatures, and margin notes. Don't rely on those fields without human review.
  • For redlines (two versions of the same contract), parse both through Frenchie and let your agent diff them. Markdown diffs cleanly.
  • Privacy: files are processed and then deleted, not stored. Results expire 30 minutes after first delivery. For highly sensitive contracts, save the Markdown to your own workspace, then let the hosted result expire.
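If you want to see the redline yourself rather than asking your agent, a line-level diff of the two Markdown files is a few lines of Python with the standard library's `difflib`. This is a sketch only: the file names and sample clauses are hypothetical placeholders, not Frenchie output.

```python
import difflib


def markdown_diff(old: str, new: str,
                  old_name: str = "v2.md", new_name: str = "v3.md") -> str:
    """Return a unified diff between two Markdown strings."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # inputs carry no newlines, so headers should not either
    )
    return "\n".join(lines)


# Hypothetical clause text from two versions of the same contract.
old = "## 9.2 Cap\n\nLiability is capped at 24 months of fees.\n"
new = "## 9.2 Cap\n\nLiability is capped at 12 months of fees.\n"

diff_text = markdown_diff(old, new)
print(diff_text)
```

Lines prefixed with `-` come from the older version and lines prefixed with `+` from the newer one, which is usually enough to spot a changed number or a dropped sentence.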

questions

Common questions.

Does it preserve section numbering?

Yes. Section hierarchy (1.2.3, Exhibit A, Schedule 2) comes through as Markdown headings your agent can navigate by reference.

What about handwritten signatures or initials?

Signature blocks are transcribed on a best-effort basis. For signature validation, use a dedicated e-signature tool. Frenchie is for reading the text, not verifying execution.

Can I compare two contract versions?

Yes. Parse both and have your agent do a clause-level diff. Markdown diffs cleanly, and your agent can summarize material changes.
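A clause-level comparison can also be sketched in plain Python by keying each clause on its heading and reporting what was added, removed, or changed. The helper names and sample text below are hypothetical, and this assumes both versions use the same heading titles for matching clauses.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")


def split_clauses(markdown: str) -> dict[str, str]:
    """Map heading title -> clause text with whitespace normalized."""
    clauses: dict[str, list[str]] = {}
    title = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            title = match.group(1)
            clauses[title] = []
        elif title is not None:
            clauses[title].append(line)
    # Collapse whitespace so re-wrapped lines do not count as changes.
    return {t: " ".join(" ".join(body).split()) for t, body in clauses.items()}


def clause_changes(old_md: str, new_md: str) -> dict[str, list[str]]:
    """List clause headings that were added, removed, or reworded."""
    old, new = split_clauses(old_md), split_clauses(new_md)
    return {
        "added": [t for t in new if t not in old],
        "removed": [t for t in old if t not in new],
        "changed": [t for t in old if t in new and old[t] != new[t]],
    }


# Hypothetical excerpts from two versions of the same contract.
old_md = (
    "## 9.2 Cap\n\nCapped at 24 months of fees.\n\n"
    "## 12.1 Notices\n\nNotices must be in writing.\n"
)
new_md = (
    "## 9.2 Cap\n\nCapped at 12 months of fees.\n\n"
    "## 12.1 Notices\n\nNotices must be\nin writing.\n\n"
    "## 13.1 Audit\n\nCustomer may audit annually.\n"
)

changes = clause_changes(old_md, new_md)
print(changes)
```

Because clauses are matched by heading title, a renumbered section shows up as one removal plus one addition. That is a limitation of this sketch, and a case where your agent's judgment is more useful than a mechanical comparison.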

Is this acceptable for privileged documents?

Files are processed and deleted — not stored, not used for training. Results expire 30 minutes after first delivery. Consult your firm's data handling policy for privileged documents; Frenchie's posture is no-retention, but only your counsel can confirm that's sufficient for your matter.

Try it with a real file of yours.

100 free credits on signup. No card. Drop a PDF or image from your own workflow and see the Markdown your agent gets back.