Benchmarking the Most Reliable Document Parsing API

Most benchmarks measure text similarity. We measured what breaks in production.

The TL;DR: Traditional metrics don't predict whether your RAG pipeline will work or your automation will fail. So we tested what matters: can downstream systems actually use the output?

Our Results:

  • 91.7% F1 on structured JSON extraction (best in class)
    We extract roughly 92 of every 100 fields correctly; competitors miss 5+ critical fields per 20 documents (how we compute F1 and TEDS is sketched after this list)

  • 86.79% TEDS on table parsing, measured on OmniDocBench (best in class)
    Complex multi-page tables keep their rows and columns intact, even when competitors collapse them into unusable text

  • 56.2% TEDS on document reading, measured on OCRBench v2 (best in class)
    We preserve document structure better than anyone; tables stay tables, reading order stays logical
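
For readers who want the metrics unpacked: field-level F1 is precision/recall over (field, value) pairs, and TEDS (Tree Edit Distance based Similarity) scores a predicted table against the reference as 1 - (tree edit distance) / max(tree sizes), so structural errors like merged rows directly lower the score. Below is a minimal, illustrative sketch of field-level F1 in Python; it is not our exact evaluation harness, and the invoice fields are invented for the example.

```python
def field_f1(predicted: dict, gold: dict) -> float:
    """Field-level F1: precision/recall over exact (field, value) matches.

    Illustrative only -- real harnesses normalize values (dates, currency,
    whitespace) before comparing.
    """
    pred_pairs = {(k, str(v).strip().lower()) for k, v in predicted.items()}
    gold_pairs = {(k, str(v).strip().lower()) for k, v in gold.items()}
    if not pred_pairs or not gold_pairs:
        return 0.0

    true_pos = len(pred_pairs & gold_pairs)
    precision = true_pos / len(pred_pairs)
    recall = true_pos / len(gold_pairs)
    return 0.0 if true_pos == 0 else 2 * precision * recall / (precision + recall)


# Hypothetical invoice -- field names are assumptions, not a real schema.
gold = {"invoice_number": "INV-1042", "total": "1,250.00", "due_date": "2025-03-01"}
pred = {"invoice_number": "INV-1042", "total": "1,250.00", "due_date": "2025-04-01"}
print(round(field_f1(pred, gold), 3))  # 0.667 -- two of three fields match
```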

What this means in production: An insurance processor handling 10,000 docs/month needs roughly 45% fewer manual reviews with Tensorlake than with competitors at similar headline accuracy.

How we stack up (back-of-envelope cost math follows the list):

  • Tensorlake: 91.7% F1 | $10 per 1k pages

  • Gemini: 89% F1 | $30 per 1k pages

  • AWS Textract: 88.4% F1 | $15 per 1k pages

  • Azure: 88.1% F1 | $10 per 1k pages

  • Open-source (Docling/Marker): 68.9% F1 | Free (+ correction costs)
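
To make the price column concrete, here is a rough cost sketch. The per-1k-page prices come from the list above; the volume (10,000 docs/month at an assumed 5 pages per document) is illustrative, not part of the benchmark.

```python
# Back-of-envelope monthly spend. Prices are the per-1k-page figures above;
# the 5 pages/doc average is an assumption for illustration only.
DOCS_PER_MONTH = 10_000
PAGES_PER_DOC = 5
pages = DOCS_PER_MONTH * PAGES_PER_DOC  # 50,000 pages/month

price_per_1k_pages = {
    "Tensorlake": 10,
    "Gemini": 30,
    "AWS Textract": 15,
    "Azure": 10,
}

for vendor, price in price_per_1k_pages.items():
    print(f"{vendor:>12}: ${pages / 1_000 * price:,.0f}/month")
# Tensorlake/Azure: $500, Textract: $750, Gemini: $1,500 -- at this volume the
# accuracy gap, not list price, dominates total cost once manual review is counted.
```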

Why Tensorlake wins: 

✅ Preserves complex reading order
✅ Maintains table structure on multi-page tables
✅ Captures charts, figures, and visual content
✅ Delivers structured JSON for automation (illustrative example below)
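
Because structured JSON is the payoff, here is a hypothetical sketch of what consuming that output downstream can look like. The schema, field names, and confidence threshold are invented for illustration; they are not Tensorlake's actual response format.

```python
import json

# Hypothetical parser output -- shape and field names are invented for
# illustration, not Tensorlake's actual response schema.
parsed = json.loads("""
{
  "document_type": "insurance_claim",
  "fields": {"claim_number": "CLM-8831", "claimant": "J. Doe", "amount": 4200.0},
  "tables": [{"name": "line_items", "rows": [["Windshield", 1, 4200.0]]}],
  "confidence": 0.94
}
""")

# The automation step that brittle parsing breaks: route low-confidence docs
# to a human, push the rest straight into the claims system.
REVIEW_THRESHOLD = 0.90  # assumption
queue = "manual_review" if parsed["confidence"] < REVIEW_THRESHOLD else "auto_adjudication"
print(queue, parsed["fields"]["claim_number"])  # auto_adjudication CLM-8831
```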

Read the full analysis with methodology, datasets, and visual comparisons:
👉 tlake.link/n-benchmarks