Tensorlake February Updates: Smarter Forms, More Reliable VLMs, Better OCR

For engineers building production document workflows, this release is about robustness.

We’ve hardened our VLM parsing stack to handle real-world failure modes, improved OCR structure and spatial grounding, and introduced agentic key-value extraction for template-free form processing.

This release includes:

  • Agentic key-value extraction for forms

  • Production-grade mitigation for runaway VLM generation

  • Improved Gemini-3 OCR with bounding boxes

  • Smarter table merging

  • Native multi-page TIFF support

  • Cloud UI inspection improvements

If you haven't tried Tensorlake yet, sign up for free and get in touch to discuss your document processing use case.

🎯 The Highlights

🤖 Agentic Key/Value Extraction (No Templates Required)

The problem: Every loan application, medical intake form, or compliance questionnaire looks different. Traditional extraction requires building custom templates for each variation — expensive, brittle, and impossible to scale.

The solution: Our new Agentic Key/Value Extraction automatically detects form structure and extracts field data without templates or hardcoded rules. Point it at any form and get structured JSON back.

Why it matters:

Process thousands of form variations with a single integration. No template maintenance, and nothing breaks when formats change.

Turn it on with key_value_extraction=True in the SDK or API.

⚙️ Stopping Runaway VLM Generation (Production-Grade Robustness)

The problem: Vision-language models can enter infinite repetition loops on sparse tables and unusual layouts. They generate the same text over and over, push latency from 5 seconds to more than 60, and drop content entirely.

The solution: We built a production-grade recovery system that:

  • Detects runaway generation in real time using streaming analysis

  • Stops the model early before it wastes tokens

  • Recovers missing content through iterative image masking

Production impact:

  • Worst-case latency dropped from 60s+ to 10-20s

  • 100% content recovery on previously problematic documents

  • Zero performance impact on normal pages

This system runs in production today and handles edge cases that would otherwise require model retraining.
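
To make the detect-and-stop steps above concrete, here is a minimal sketch of one way a streaming repetition check could work. It is not Tensorlake's implementation: the function names (`loop_period`, `consume_with_guard`), the thresholds, and the idea of treating "a short block of tokens repeated back to back" as the runaway signal are all assumptions made for illustration. The masking-based recovery step is not shown.

```python
import itertools
from typing import Iterable


def loop_period(history: list[str], max_period: int = 16, min_repeats: int = 4) -> int:
    """Return the smallest period p (<= max_period) such that the tail of
    `history` is one p-token block repeated `min_repeats` times, else 0."""
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if len(history) < span:
            break  # longer periods need even more tokens, so stop here
        tail = history[-span:]
        block = tail[:period]
        if all(tail[i] == block[i % period] for i in range(span)):
            return period
    return 0


def consume_with_guard(
    tokens: Iterable[str], max_period: int = 16, min_repeats: int = 4
) -> tuple[list[str], bool]:
    """Read tokens from a (possibly endless) stream and stop as soon as a
    repetition loop is detected. Returns (kept_tokens, stopped_early)."""
    history: list[str] = []
    for token in tokens:
        history.append(token)
        period = loop_period(history, max_period, min_repeats)
        if period:
            # Keep one copy of the repeated block and drop the other copies.
            del history[len(history) - period * (min_repeats - 1):]
            return history, True
    return history, False


# Hypothetical stream: two normal tokens, then the model repeats "N/A" forever.
stream = itertools.chain(["Total", ":"], itertools.repeat("N/A"))
kept, stopped = consume_with_guard(stream)
print(kept, stopped)  # ['Total', ':', 'N/A'] True
```

Because the check only reads the tail of the token history, the guard returns from an endless stream after a handful of repeated tokens. A real system would need thresholds tuned so that legitimately repetitive content, such as rows of empty table cells, is not cut off.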

🔧 Product Updates

  • Gemini-3 OCR Improvements with Bounding Boxes:

    Stronger structure and spatial grounding on complex layouts.

    • Bounding boxes for every layout element

    • More stable layout classification

    • Improved dense table extraction

    Cleaner structure. Better downstream validation.

  • Smarter table merging:

    Reconstruct complex, variably split tables more reliably.

    • Filters inter-table noise

    • Merges by semantic continuity — not just geometry

    • Handles multi-page and multi-column splits

    Better retrieval. Better reasoning.

  • Native multi-page TIFF support:

    End-to-end multi-page TIFF processing.

    • Automatic page splitting

    • No preprocessing required

    • Fully compatible with OCR and VLM pipelines

    Ideal for legacy scanning workflows.

  • Improved Output Panel in Cloud UI:

    New tabs for faster inspection:

    • Document Markdown: Full consolidated output

    • Page Markdown: Per-page split view

    Toggle between rendered preview and raw markdown instantly.

    Faster validation. Easier debugging.
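
As an illustration of the "Smarter table merging" item above, the following sketch shows what merging by content rather than geometry could look like in its simplest form. It is an assumption-laden stand-in, not the actual merging logic: the `TableFragment` shape, the `continues` rule (matching normalized headers, or a headerless fragment with the same column count), and `merge_fragments` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TableFragment:
    """Hypothetical shape for one piece of a table found on a page."""
    page: int
    header: Optional[list[str]]  # None when the fragment has no header row
    rows: list[list[str]] = field(default_factory=list)


def _norm(header: list[str]) -> list[str]:
    """Normalize header cells so ' item ' and 'Item' compare equal."""
    return [cell.strip().lower() for cell in header]


def continues(prev: TableFragment, nxt: TableFragment) -> bool:
    """Decide whether `nxt` looks like a continuation of `prev`."""
    if nxt.header is not None:
        # A repeated header (common on page breaks) signals the same table.
        return prev.header is not None and _norm(prev.header) == _norm(nxt.header)
    # Headerless fragment: require the same column count in every row.
    width = len(prev.header) if prev.header else (len(prev.rows[0]) if prev.rows else 0)
    return bool(nxt.rows) and all(len(row) == width for row in nxt.rows)


def merge_fragments(fragments: list[TableFragment]) -> list[TableFragment]:
    """Merge consecutive fragments (ordered by page) that continue one another."""
    merged: list[TableFragment] = []
    for frag in sorted(fragments, key=lambda f: f.page):
        if merged and continues(merged[-1], frag):
            merged[-1].rows.extend(frag.rows)
        else:
            merged.append(TableFragment(frag.page, frag.header, list(frag.rows)))
    return merged


fragments = [
    TableFragment(1, ["Item", "Qty"], [["Bolt", "4"]]),
    TableFragment(2, [" item ", "QTY"], [["Nut", "9"]]),   # repeated header
    TableFragment(3, None, [["Washer", "2"]]),              # headerless tail
    TableFragment(3, ["Name", "Date", "Amount"], [["A", "B", "C"]]),
]
tables = merge_fragments(fragments)
print(len(tables), tables[0].rows)
```

In this sketch the first three fragments collapse into one table and the fourth stays separate because its header differs. Header text and column count are only a rough proxy for semantic continuity, and noise filtering between fragments is not modeled here.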

💡 Get Started

Processing forms, tables, or complex layouts? Sign up or schedule a demo to see how these updates can simplify your document workflows.

Thanks for being part of the Tensorlake community. We are here to help you build agentic applications and document processing pipelines that work for your company!

Cheers,
The Tensorlake Team