News Archive

Release of the SetaPDF-Extractor component
2015-02-06

After several months of work and research, we have finally released the initial version of the SetaPDF-Extractor component.

This component allows PHP developers to extract text from PDF documents. It also provides detailed access to words or glyphs, including their positions and bounding boxes on a PDF page.

The component is written completely in PHP and backed by the SetaPDF-Core component, and we're very proud to release it to the public. Any feedback or questions are welcome! Just send an email to support@setasign.com.

The full product details are available here.
For a full user manual, including API documentation, see here.

Just give it a try

This demo extracts simple plain text from a single page:

Select a File

Or upload a file

Password for authentication

If the PDF is protected with a password, you can authenticate with this password.

You may also check out the additional demos:

Extract Plain Text

Extract plain text from a PDF document.

Get Words

Get words and their bounding boxes from PDF documents.

Mark Words

Mark or highlight all words on a specific PDF page.

Extract Words By a Specific Location

Use a rectangle filter to limit the result to a specific area.

Phrase Search

Create a phrase search with the SetaPDF-Extractor component.

Count Words

Count words in a PDF document with PHP.