ABBYY introduces FineReader Engine integration with DocLang

ABBYY is releasing FineReader Engine 12.8.0, which exports to DocLang and gives developers a unified, AI-readable format for representing documents for language model and agentic AI consumption. According to the company, this saves developers time and increases document processing performance.

ABBYY recently demonstrated FineReader Engine processing documents at 2,160,000 pages per hour at its ABBYY Ascend event, a speed the company describes as unprecedented. Additionally, in a side-by-side benchmark, ABBYY compared the processing of the same document represented as PDF and as DocLang. In the controlled experiment, the document, the complex task, and the AI model and its configuration were identical. The only variable was the document representation, PDF versus DocLang, according to the company.
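
For a sense of scale, the 2,160,000 pages per hour figure can be converted to other units with simple arithmetic. The short Python sketch below does that conversion. It assumes the rate is sustained, which the announcement does not state, and the helper names and the one-billion-page workload are illustrative only.

```python
# Figure quoted by ABBYY for the Ascend demonstration.
PAGES_PER_HOUR = 2_160_000

SECONDS_PER_HOUR = 3600


def pages_per_second(pages_per_hour: int) -> float:
    """Convert an hourly page rate to pages per second."""
    return pages_per_hour / SECONDS_PER_HOUR


def hours_to_process(total_pages: int, pages_per_hour: int) -> float:
    """Hours needed for total_pages, assuming the rate is sustained."""
    return total_pages / pages_per_hour


if __name__ == "__main__":
    print(pages_per_second(PAGES_PER_HOUR))  # 600.0
    # Hypothetical workload of one billion pages, for illustration only.
    print(round(hours_to_process(1_000_000_000, PAGES_PER_HOUR), 1))
```

At the quoted rate, that works out to 600 pages every second.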

FineReader Engine with DocLang significantly improved output quality, increased structural accuracy, decreased token usage, and reduced latency, ABBYY said.

The controlled experiment processed an annual report, a clinical study, and a vendor contract, which represent the wide variety of enterprise documents that are rich in information created for human understanding but challenging for machines to parse.

“ABBYY FineReader Engine is already used by thousands of organizations processing billions of documents every year,” said Max Vermeir, VP of AI strategy at ABBYY. “Now with DocLang as an AI native format, more companies will be able to accelerate innovation and have faster access to their business data to make smarter, more impactful decisions.”

ABBYY, IBM, HumanSignal, NVIDIA, and Red Hat formed the DocLang working group to revolutionize AI document parsing. Current document formats such as PDF, HTML, and Markdown were designed for human consumption rather than for AI interpretation. The result is a patchwork of partial solutions that requires custom parsing at every integration point, which burdens developers, is prone to hallucinations, and complicates regulatory compliance.

“DocLang is specifically engineered to address industry challenges with a minimal, standardized, and AI-native method for representing document structure, meaning, layout, and governance. FineReader Engine with DocLang support was designed for efficient machine processing and a predictable structure optimized for modern AI tokenization and modeling techniques. Organizations will see a significant difference with more reliable interpretation, increased accuracy, and lower computational costs,” Vermeir concluded.

More information about the DocLang working group can be found at https://github.com/doclang-project

For more information about this news, visit www.abbyy.com.
