BFO adds text extraction to PDF Library

From: BFO
Published: Thu Oct 27 2005

London, England, 27 October 2005 - BFO (Big Faceless Organization), a global supplier of Java reporting solutions, strengthens the acclaimed Big Faceless PDF Library with the addition of text and image extraction.

The 2.6.2 release adds the ability to extract text and bitmap images from PDF documents, and to index PDFs using the Apache Lucene search engine. The library extracts and indexes Unicode text from form fields, annotations and document metadata as well as the document body, at roughly 50 pages a second for large documents.

The speed and accuracy of text extraction, coupled with the existing features of the PDF Library, make it a wise choice for developers involved in data mining, content management systems and form processing environments, as well as in any setting that requires searching or extracting text from large numbers of PDF files.

Text and image extraction requires the Big Faceless PDF Library Extended Edition plus Viewer license, which can be downloaded from BFO’s website.

About BFO: Founded in 1998, BFO is a leading global provider of Java-based reporting solutions. It produces a stable of robust Java components for the international B2B market, including Report Generator, Graph and PDF Library. Report Generator incorporates both the Graph and PDF libraries and converts XML to PDF documents. Using JSP, ASP or similar technology, it is possible to create dynamic PDF reports as quickly and easily as HTML.
Company: BFO
Contact Name: Daniel Wilson
Contact Email:
Contact Phone: 442077549107
