Extract text, glyphs, words and metrics from PDF documents with PHP

SetaPDF-Extractor

Downloads and Changelogs of the SetaPDF-Extractor

The following table lists all changelogs and available downloads for the SetaPDF-Extractor component. A full overview of your licenses is available in your personal Pickup Depot.

Version 2.42.0.1871

Release date: 2023-08-29
SetaPDF-Extractor Component
Feature
  • Add $ignoreFaultyStreams parameter to SetaPDF_Extractor::__construct() to allow processing of faulty documents.
  • Add setIgnoreFaultyStreams() method in all strategies.
Bugfix
  • Fixed font-size calculation in FontSize filter for rotated text items.
  • Be a bit less restrictive in font-size comparison for mode "equals" in FontSize filter.
  • Fixed WordGroup strategy behavior with rotated text.
  • Fixed ContentStreamCleaner with regard to the handling of inline images.
Tweak
  • Make use of SetaPDF_Core::isZero() and isNotZero() methods.
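The isZero() and isNotZero() tweak above is the kind of change that usually replaces direct float comparisons with a tolerance-based check. The following is a minimal sketch of that idea only: the function names, the constant and the tolerance value are hypothetical, and the actual SetaPDF_Core implementation is not described in this changelog.

```php
<?php
// Hypothetical tolerance; the value SetaPDF_Core uses is not stated here.
const FLOAT_TOLERANCE_SKETCH = 1e-9;

// Treats a value as zero if its magnitude is below the tolerance.
function isZeroSketch(float $value, float $tolerance = FLOAT_TOLERANCE_SKETCH): bool
{
    return abs($value) < $tolerance;
}

// Logical complement of isZeroSketch().
function isNotZeroSketch(float $value, float $tolerance = FLOAT_TOLERANCE_SKETCH): bool
{
    return !isZeroSketch($value, $tolerance);
}

// Rounding leaves a tiny residue here, so a direct comparison
// with 0.0 and the tolerance-based check disagree.
$residue = 0.1 + 0.2 - 0.3;
var_dump($residue == 0.0);
var_dump(isZeroSketch($residue));
```

Checks of this kind matter for values such as rotation angles or font sizes derived from transformation matrices, where small rounding errors are common.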
SetaPDF-Core Component
Feature
  • Added getAll() method to CMAP classes of TTF parser.
Bugfix
  • Added SetaPDF_Core_Document_Destination::getIndirectObject() method.
  • Only allow specific data types as keys in Tree data structures.
  • Check for explicit class types instead of "null" in various locations.
Tweak
  • Use window size of 31 in decompression fallback for FlateDecode filter.
  • Record object offset of cross-reference streams during initial scan.
  • Ignore invalid value for "additional actions" entry in catalog dictionary.
  • Fixed that Autoload.php required the main directory to be named "SetaPDF".
  • Added check for fseek() call in StreamReader class.
  • Marked methods with typos in their names as deprecated.
  • Code style, doc-block optimizations and cleanup.
  • Use mb_str_split() in SetaPDF_Core_Encoding::strSplit() if available.
  • Implemented and made use of SetaPDF_Core::isZero() and isNotZero() methods.
  • Handle/Ignore invalid key types in tree structures.
  • Check for allowed values in various setter methods in SetaPDF_Core_Document_Catalog_ViewerPreferences.
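The "if available" pattern mentioned for SetaPDF_Core_Encoding::strSplit() in the list above can be illustrated with a feature check and a fallback. This is a sketch under assumptions, not the library's code: splitCharactersSketch() is a hypothetical helper, it assumes valid UTF-8 input, and the fallback shown here (preg_split() with the "u" modifier) is only one possible choice.

```php
<?php
// Hypothetical helper; not the actual SetaPDF_Core_Encoding::strSplit() code.
// Splits a UTF-8 string into an array of single characters.
function splitCharactersSketch(string $text): array
{
    if (function_exists('mb_str_split')) {
        // Native function, available as of PHP 7.4 with the mbstring extension.
        return mb_str_split($text, 1, 'UTF-8');
    }

    // Fallback for older runtimes or builds without mbstring:
    // an empty pattern with the "u" modifier splits between characters.
    $parts = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $parts;
}

// Usage: a string mixing one-, two- and three-byte characters.
print_r(splitCharactersSketch("a\u{00e9}\u{20ac}"));
```

Checking with function_exists() at call time keeps one code path for all supported PHP versions while letting newer runtimes use the native function.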