This is an automated email from the ASF dual-hosted git repository.
tballison pushed a change to branch haystack-poi-embedded-filenames
in repository https://gitbox.apache.org/repos/asf/tika.git
from 1d6a8d79b6 embedded file names and pagination in hslf
add 89095247d7 improve pagination metadata
No new revisions were added by this update.
Summary of changes:
.../main/java/org/apache/tika/metadata/Office.java | 30 +++++
.../org/apache/tika/metadata/PageAnchoring.java | 139 +++++++++++++++++++++
.../org/apache/tika/metadata/TikaPagedText.java | 39 +++++-
.../apache/tika/metadata/TestPageAnchoring.java | 118 +++++++++++++++++
.../tika/parser/microsoft/ExcelExtractor.java | 115 +++++++++++++++--
.../tika/parser/microsoft/HSLFExtractor.java | 7 +-
.../microsoft/ooxml/AbstractOOXMLExtractor.java | 38 +++++-
.../ooxml/SXSLFPowerPointExtractorDecorator.java | 70 +++++++++++
.../ooxml/XSSFExcelExtractorDecorator.java | 82 +++++++++++-
.../tika/parser/microsoft/ExcelParserTest.java | 63 ++++++++++
.../parser/microsoft/PowerPointParserTest.java | 53 ++++++++
.../apache/tika/parser/odf/OpenDocumentParser.java | 103 ++++++++++++++-
.../org/apache/tika/parser/odf/ODFParserTest.java | 87 +++++++++++++
13 files changed, 922 insertions(+), 22 deletions(-)
create mode 100644 tika-core/src/main/java/org/apache/tika/metadata/PageAnchoring.java
create mode 100644 tika-core/src/test/java/org/apache/tika/metadata/TestPageAnchoring.java