Tensorlake Updates: Barcode detection, Gemini 3 OCR, and DOCX Tracked Changes

Engineers building production document workflows - we shipped some game changers.

Diptanu Gon Choudhury & Shanshan
December 12, 2025

Hello AI Engineers! 👋

We're excited to share some powerful new capabilities: Gemini 3 is now available as an OCR model for complex visual reasoning, we've solved the tracked changes problem for legal teams working with DOCX files, and barcodes are now automatically detected and decoded in your document pipelines. Plus, we've added support for new file formats and improved processing speed.

If you haven't tried Tensorlake yet, sign up for free and get in touch to discuss your document processing use case.

🎯 The Highlights

Gemini 3 Now Available for Document Parsing: Google's latest model is now integrated as an OCR engine in Tensorlake. Gemini 3 excels at complex visual reasoning, such as counting symbols on blueprints, understanding semi-wireless tables, and correlating chart legends to data. Use it when standard OCR falls short on non-trivial layouts. Tensorlake handles rate limits and page chunking automatically. Read the blog →
DOCX Tracked Changes with Full Spatial Metadata: Legal tech teams, this one's for you. Parse Word documents while preserving insertions, deletions, comments, and bounding boxes in a single API call. Build contract intelligence systems that know both what changed and exactly where. No more choosing between revision history and spatial precision. Read the blog →
Barcode Detection & Decoding: Barcodes are everywhere, from shipping labels and lab reports to insurance docs. Tensorlake now automatically detects and decodes them as part of standard parsing. Get barcode type, decoded value, and bounding boxes alongside your text and tables with no extra tooling. Read the changelog → | Try the notebook →
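
To make "what changed and exactly where" concrete, here is a minimal Python sketch of how you might filter tracked changes by page region once you have them in hand. The `TrackedChange` record, its field names, the coordinate convention, and the sample values are all assumptions made up for illustration; they are not the Tensorlake response schema, so check the docs for the real field names.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed convention: (x0, y0, x1, y1) in page coordinates.
BBox = Tuple[float, float, float, float]


@dataclass
class TrackedChange:
    """Hypothetical record for one revision; not the actual API schema."""
    kind: str   # "insertion", "deletion", or "comment"
    text: str
    page: int
    bbox: BBox


def overlaps(a: BBox, b: BBox) -> bool:
    """Return True when two axis-aligned boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def changes_in_region(changes: List[TrackedChange], page: int, region: BBox) -> List[TrackedChange]:
    """Keep only the changes on `page` whose box intersects `region`."""
    return [c for c in changes if c.page == page and overlaps(c.bbox, region)]


# Made-up sample data standing in for a parsed contract.
changes = [
    TrackedChange("insertion", "thirty (30) days", 1, (72, 200, 300, 215)),
    TrackedChange("deletion", "sixty (60) days", 1, (72, 200, 290, 215)),
    TrackedChange("comment", "Confirm with counsel", 2, (400, 100, 550, 130)),
]

# Region covering a single clause on page 1.
clause_region = (50, 180, 560, 230)
hits = changes_in_region(changes, 1, clause_region)
print([(c.kind, c.text) for c in hits])
```

The same overlap check would apply to any element that carries a bounding box, decoded barcodes included.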

🔧 Product Updates

New File Format Support: Added parsing support for .ppt (legacy binary PowerPoint) and .rtf (Rich Text Format) files. Broader pipeline compatibility with fewer preprocessing steps.
Embedded Images in Output Markdown: Images embedded in your source documents are now preserved in the output markdown. Better context preservation for documents with diagrams, signatures, and visual elements.
Faster On-Prem Processing: Improved Model03 speed for on-prem deployments. Same accuracy, faster throughput.

💡 When to Use What

Quick guide on our new OCR options:

Use Case	Recommended Model
Complex visual reasoning (blueprints, charts, symbols)	Gemini 3
Business documents, layout-aware parsing with accurate bounding boxes	Model03
Tracked changes with spatial data	Standard DOCX parsing
Barcode extraction	Model03 with `barcode_detection=true`
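
If you route documents programmatically, the table above can be encoded as a small lookup. This is a rough Python sketch, not SDK code: the `UseCase` enum, the `ocr_model` key, and the model identifier strings are hypothetical placeholders. Only the `barcode_detection` flag name comes from the table itself.

```python
from enum import Enum


class UseCase(Enum):
    VISUAL_REASONING = "complex visual reasoning"
    BUSINESS_DOCS = "business documents"
    TRACKED_CHANGES = "tracked changes with spatial data"
    BARCODES = "barcode extraction"


# Hypothetical option names and values mirroring the table above;
# consult the Tensorlake docs for the real parameter names.
RECOMMENDATIONS = {
    UseCase.VISUAL_REASONING: {"ocr_model": "gemini3"},
    UseCase.BUSINESS_DOCS: {"ocr_model": "model03"},
    UseCase.TRACKED_CHANGES: {},  # standard DOCX parsing, no special model
    UseCase.BARCODES: {"ocr_model": "model03", "barcode_detection": True},
}


def recommended_options(use_case: UseCase) -> dict:
    """Return a copy of the suggested options so callers can modify it safely."""
    return dict(RECOMMENDATIONS[use_case])


print(recommended_options(UseCase.BARCODES))
```

Returning a copy keeps the shared table from being mutated when a caller adds its own per-request options.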

Try These Updates

Try Tensorlake Free — Test new features with your documents
Schedule a Demo — Questions on implementation? Get hands-on help
Join our Slack — Connect with the community

Thank you for being part of the Tensorlake community. We're here to help you build document pipelines that actually work in production.

Cheers,
The Tensorlake Team