Tensorlake February Updates: Smarter Forms, More Reliable VLMs, Better OCR
For engineers building production document workflows, this release is about robustness.
We’ve hardened our VLM parsing stack to handle real-world failure modes, improved OCR structure and spatial grounding, and introduced agentic key-value extraction for template-free form processing.
This release includes:
Agentic key-value extraction for forms
Production-grade mitigation for runaway VLM generation
Improved Gemini-3 OCR with bounding boxes
Smarter table merging
Native multi-page TIFF support
Cloud UI inspection improvements
If you haven't tried Tensorlake yet, sign up for free and get in touch to discuss your document processing use case.
🎯 The Highlights
🤖 Agentic Key/Value Extraction (No Templates Required)
The problem: Every loan application, medical intake form, or compliance questionnaire looks different. Traditional extraction requires building custom templates for each variation — expensive, brittle, and impossible to scale.
The solution: Our new Agentic Key/Value Extraction automatically detects form structure and extracts field data without templates or hardcoded rules. Point it at any form and get structured JSON back.
Why it matters:
Process thousands of form variations with a single integration. No template maintenance, no breaking when formats change.
Turn it on with key_value_extraction=True in the SDK or API.
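As a rough illustration of working with the structured JSON once it comes back, the sketch below indexes extracted pairs by a normalized field name so that differently worded labels land on predictable keys. The response shape (`key_values`, `key`, `value`, `page`) and the helper names are assumptions made for this example, not the documented Tensorlake schema; only the `key_value_extraction=True` flag comes from this post.

```python
import json
import re

# Hypothetical response shape; the real Tensorlake output schema may differ.
SAMPLE_RESPONSE = json.loads("""
{
  "key_values": [
    {"key": "Applicant Name:", "value": "Jane Doe", "page": 1},
    {"key": "Date of Birth", "value": "1990-04-12", "page": 1},
    {"key": "Loan Amount ($)", "value": "25,000", "page": 2}
  ]
}
""")


def normalize_key(label: str) -> str:
    """Lowercase a form label and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def index_fields(response: dict) -> dict:
    """Map normalized field names to extracted values (first occurrence wins)."""
    fields = {}
    for pair in response.get("key_values", []):
        fields.setdefault(normalize_key(pair["key"]), pair["value"])
    return fields


fields = index_fields(SAMPLE_RESPONSE)
print(fields)
```

Normalizing labels on the consumer side is one way to keep downstream code stable when the same field is printed as "Applicant Name:" on one form and "applicant name" on another.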
⚙️ Stopping Runaway VLM Generation (Production-Grade Robustness)
The problem: Vision-language models can enter infinite repetition loops on sparse tables and unusual layouts — generating the same text over and over, blowing up latency from 5 seconds to 60+, and dropping content entirely.
The solution: We built a production-grade recovery system that:
Detects runaway generation in real-time using streaming analysis
Stops the model early before it wastes tokens
Recovers missing content through iterative image masking
Production impact:
Worst-case latency dropped from 60s+ to 10-20s
100% content recovery on previously problematic documents
Zero performance impact on normal pages
This system runs in production today and handles edge cases that would otherwise require model retraining.
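To make the streaming detection idea concrete, here is a minimal sketch of one way to spot a repetition loop while tokens are still arriving and stop early. It is not Tensorlake's implementation: the function names, the periodicity check, and the thresholds (`max_period`, `min_repeats`) are all assumptions chosen for illustration, and the image-masking recovery step is not shown.

```python
from collections import deque
from typing import Iterable, List, Sequence, Tuple


def is_looping(tail: Sequence[str], max_period: int, min_repeats: int) -> bool:
    """Return True if the end of `tail` repeats a short pattern `min_repeats` times."""
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if len(tail) < span:
            break  # longer periods need even more tokens, so stop checking
        recent = tail[-span:]
        # Periodic with this period if every token equals the one `period` slots earlier.
        if all(recent[i] == recent[i % period] for i in range(span)):
            return True
    return False


def consume_stream(
    tokens: Iterable[str], max_period: int = 8, min_repeats: int = 4
) -> Tuple[List[str], bool]:
    """Collect streamed tokens, stopping as soon as a repetition loop is detected.

    Returns the tokens kept so far and a flag saying whether we stopped early.
    """
    kept: List[str] = []
    tail = deque(maxlen=max_period * min_repeats)  # only the recent window is inspected
    for token in tokens:
        kept.append(token)
        tail.append(token)
        if is_looping(list(tail), max_period, min_repeats):
            return kept, True
    return kept, False


normal = "the quick brown fox jumps over the lazy dog".split()
runaway = ["Total", "|"] + ["0", "|"] * 50

print(consume_stream(normal)[1])
print(consume_stream(runaway)[1])
```

Sparse tables legitimately contain repeated cell separators, so thresholds like these would need tuning against real documents to avoid cutting off valid output.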
🔧 Product Updates
Gemini-3 OCR Improvements with Bounding Boxes:
Stronger structure and spatial grounding on complex layouts.
Bounding boxes for every layout element
More stable layout classification
Improved dense table extraction
Cleaner structure. Better downstream validation.
Smarter table merging:
Reconstruct complex, variably split tables more reliably.
Filters inter-table noise
Merges by semantic continuity — not just geometry
Handles multi-page and multi-column splits
Better retrieval. Better reasoning.
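As a simplified picture of merging split tables on content rather than position, the sketch below joins consecutive fragments when the later one has the same column count and either no header or a matching header. Header matching is only a stand-in for the semantic continuity check described above; the `TableFragment` type and both functions are hypothetical and do not reflect Tensorlake's internals.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TableFragment:
    """One piece of a table as it might come out of per-page parsing (hypothetical)."""
    header: Optional[List[str]]
    rows: List[List[str]] = field(default_factory=list)


def _norm(header: List[str]) -> List[str]:
    return [cell.strip().lower() for cell in header]


def continues(prev: TableFragment, nxt: TableFragment) -> bool:
    """Decide whether `nxt` looks like a continuation of `prev`."""
    if prev.header is None or not nxt.rows:
        return False
    width = len(prev.header)
    if any(len(row) != width for row in nxt.rows):
        return False  # a different column count means a different table
    # A continuation either repeats the header or omits it entirely.
    return nxt.header is None or _norm(nxt.header) == _norm(prev.header)


def merge_fragments(fragments: List[TableFragment]) -> List[TableFragment]:
    """Fold consecutive continuation fragments into the table they extend."""
    merged: List[TableFragment] = []
    for frag in fragments:
        if merged and continues(merged[-1], frag):
            merged[-1].rows.extend(frag.rows)
        else:
            merged.append(TableFragment(frag.header, list(frag.rows)))
    return merged


fragments = [
    TableFragment(["Item", "Qty"], [["A", "1"]]),
    TableFragment(None, [["B", "2"]]),               # page break, header dropped
    TableFragment([" item ", "QTY"], [["C", "3"]]),  # header repeated on next page
    TableFragment(["Name", "Email", "Phone"], [["x", "y", "z"]]),
]
tables = merge_fragments(fragments)
print(len(tables), len(tables[0].rows))
```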
Native multi-page TIFF support:
End-to-end multi-page TIFF processing.
Automatic page splitting
No preprocessing required
Fully compatible with OCR and VLM pipelines
Ideal for legacy scanning workflows.
Improved Output Panel in Cloud UI:
New tabs for faster inspection:
Document Markdown: Full consolidated output
Page Markdown: Per-page split view
Toggle between rendered preview and raw markdown instantly.
Faster validation. Easier debugging.
💡 Get Started
Processing forms, tables, or complex layouts? Sign up or schedule a demo to see how these updates can simplify your document workflows.
Thanks for being a part of the Tensorlake community. We are here to help you build agentic applications and document processing pipelines that work for your company!
Cheers,
The Tensorlake Team