- The Document Digest by Tensorlake
Tensorlake Updates: Barcode detection, Gemini 3 OCR, and DOCX Tracked Changes
Engineers building production document workflows: we shipped some game changers.
Hello AI Engineers! 👋
We're excited to share some powerful new capabilities: Gemini 3 is now available as an OCR model for complex visual reasoning, we've solved the tracked changes problem for legal teams working with DOCX files, and barcodes are now automatically detected and decoded in your document pipelines. Plus, we've added support for new file formats and improved processing speed.
If you haven't tried Tensorlake yet, sign up for free and get in touch to discuss your document processing use case.
🎯 The Highlights
Gemini 3 Now Available for Document Parsing: Google's latest model is now integrated as an OCR engine in Tensorlake. Gemini 3 excels at complex visual reasoning: counting symbols on blueprints, understanding semi-wireless tables, and correlating chart legends to data. Use it when standard OCR falls short on non-trivial layouts. Tensorlake handles rate limits and page chunking automatically. Read the blog →
DOCX Tracked Changes with Full Spatial Metadata: Legal tech teams, this one's for you. Parse Word documents while preserving insertions, deletions, comments, and bounding boxes in a single API call. Build contract intelligence systems that know both what changed and exactly where. No more choosing between revision history and spatial precision. Read the blog →
Barcode Detection & Decoding: Shipping labels, lab reports, insurance docs: barcodes are everywhere. Tensorlake now automatically detects and decodes them as part of standard parsing. Get the barcode type, decoded value, and bounding box for each barcode alongside your text and tables, with no extra tooling. Read the changelog → | Try the notebook →
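To make the barcode output concrete, here is a minimal sketch of how a pipeline might pull barcode records out of parsed page fragments. It does not use the Tensorlake SDK: the fragment layout (`type`, `barcode_type`, `value`, `bbox`, `page`) and the `extract_barcodes` helper are hypothetical names chosen for illustration, so check the changelog and notebook for the actual response schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Barcode:
    """One decoded barcode with its location on the page."""
    page: int
    symbology: str  # barcode type, e.g. "CODE_128" or "QR_CODE"
    value: str      # decoded payload
    bbox: tuple[float, float, float, float]  # (x1, y1, x2, y2)


def extract_barcodes(fragments: list[dict[str, Any]]) -> list[Barcode]:
    """Keep only barcode fragments and normalize them into Barcode records.

    The dict keys used here are assumptions, not the real Tensorlake schema.
    """
    barcodes = []
    for frag in fragments:
        if frag.get("type") != "barcode":
            continue  # skip text, tables, figures, and so on
        box = frag["bbox"]
        barcodes.append(
            Barcode(
                page=frag["page"],
                symbology=frag["barcode_type"],
                value=frag["value"],
                bbox=(box["x1"], box["y1"], box["x2"], box["y2"]),
            )
        )
    return barcodes


# Hypothetical parse output for a one-page shipping label.
sample_fragments = [
    {"type": "text", "page": 1, "content": "Ship to: 221B Baker Street"},
    {
        "type": "barcode",
        "page": 1,
        "barcode_type": "CODE_128",
        "value": "1Z999AA10123456784",
        "bbox": {"x1": 40.0, "y1": 700.0, "x2": 260.0, "y2": 760.0},
    },
]

for code in extract_barcodes(sample_fragments):
    print(code.page, code.symbology, code.value, code.bbox)
```

Keeping the bounding box next to the decoded value lets a downstream step associate each barcode with nearby text, such as a tracking number printed beside it on a label.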
🔧 Product Updates
New File Format Support: Added parsing support for `.ppt` (legacy binary PowerPoint) and `.rtf` (Rich Text Format) files. Broader pipeline compatibility with fewer preprocessing steps.
Embedded Images in Output Markdown: Images embedded in your source documents are now preserved in the output markdown. Better context preservation for documents with diagrams, signatures, and visual elements.
Faster On-Prem Processing: Improved `model03` speed for on-prem deployments. Same accuracy, faster throughput.
💡 When to Use What
Quick guide on our new OCR options:
| Use Case | Recommended Model |
|---|---|
| Complex visual reasoning (blueprints, charts, symbols) | Gemini 3 |
| Business documents, layout-aware parsing with accurate bounding boxes | Model03 |
| Tracked changes with spatial data | Standard DOCX parsing |
| Barcode extraction | Model03 with |
Try These Updates
Try Tensorlake Free — Test new features with your documents
Schedule a Demo — Questions on implementation? Get hands-on help
Join our Slack — Connect with the community
Thank you for being part of the Tensorlake community. We're here to help you build document pipelines that actually work in production.
Cheers,
The Tensorlake Team