https://bz.apache.org/bugzilla/show_bug.cgi?id=60519

            Bug ID: 60519
           Summary: Extractor for *SSF embeddings
           Product: POI
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: SS Common
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Created attachment 34555
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34555&action=edit
embedded extractor - changes not related to common ss

Please find attached an extractor for various embeddings in Excel files.

This is based on the work for [1] and [2].
Apart from evaluating the ClassIDs of Ole10Native objects, this also finds PDFs
hidden in EMFs, which seems to be a specialty of Mac Excel 2011.
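
To illustrate the "PDF hidden in an EMF" case, here is a minimal sketch of one
way such a payload could be located: scan the raw picture bytes for the PDF
header marker "%PDF-" and the trailer marker "%%EOF" and copy out everything in
between. This is an assumption about the approach, not the attached patch; the
class name PdfInEmfSketch and its methods are hypothetical, and it uses only
the JDK, not any POI API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of POI: carves a PDF out of arbitrary bytes.
class PdfInEmfSketch {
    private static final byte[] PDF_START = "%PDF-".getBytes(StandardCharsets.ISO_8859_1);
    private static final byte[] PDF_END = "%%EOF".getBytes(StandardCharsets.ISO_8859_1);

    /** Returns the embedded PDF bytes, or null if no complete PDF is found. */
    static byte[] extractPdf(byte[] pictureBytes) {
        int start = indexOf(pictureBytes, PDF_START, 0);
        if (start < 0) {
            return null;
        }
        // Use the last "%%EOF": incrementally updated PDFs contain several.
        int end = lastIndexOf(pictureBytes, PDF_END, start + PDF_START.length);
        if (end < 0) {
            return null;
        }
        return Arrays.copyOfRange(pictureBytes, start, end + PDF_END.length);
    }

    /** First position >= from where pattern occurs in data, or -1. */
    static int indexOf(byte[] data, byte[] pattern, int from) {
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            if (matchesAt(data, pattern, i)) {
                return i;
            }
        }
        return -1;
    }

    /** Last position >= notBefore where pattern occurs in data, or -1. */
    static int lastIndexOf(byte[] data, byte[] pattern, int notBefore) {
        for (int i = data.length - pattern.length; i >= notBefore; i--) {
            if (matchesAt(data, pattern, i)) {
                return i;
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, byte[] pattern, int pos) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[pos + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass in the raw bytes of the EMF picture and treat a null result
as "no PDF inside". A byte scan like this is tolerant of whatever EMF record
wraps the payload, at the cost of not validating the EMF structure at all.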

I'm not sure whether the extraction part in
org.apache.poi.ss.extractor.EmbeddedExtractor should be part of POI or maybe
Tika - but for other types of extraction helpers we didn't make this
distinction either.

The code depends on changes to Common SS, which I will document in a separate
issue, but I need to commit both together.

I'll commit the code on 30.12.2016 if no one objects before then ...


[1] http://stackoverflow.com/questions/41101012
[2] http://stackoverflow.com/questions/27011634

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
