[ https://issues.apache.org/jira/browse/TIKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated TIKA-988:
------------------------------------

    Attachment: bug31373.xls

Example doc. We do correctly extract from the embedded docs, we just don't leave a placeholder.

> We don't extract a placeholder for a Word document embedded in an Excel document
> --------------------------------------------------------------------------------
>
>                 Key: TIKA-988
>                 URL: https://issues.apache.org/jira/browse/TIKA-988
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 1.3
>
>         Attachments: bug31373.xls
>
>
> In TIKA-956 we fixed the Word parser so that at the point where an embedded
> document appears, we output a <div class="embedded" id="_XXX"/> tag.
> It would be nice to do this for documents embedded in Excel too.
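
For illustration only, here is a rough sketch of what emitting such a placeholder as SAX events could look like. It uses just the JDK's org.xml.sax classes, not Tika's actual parser code; the class EmbeddedPlaceholder, the helpers writePlaceholder and render, and the id value "_1234" are all hypothetical names made up for this example.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class EmbeddedPlaceholder {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Hypothetical helper: emits <div class="embedded" id="..."/> as SAX
    // events at whatever point the caller has reached in the output.
    static void writePlaceholder(ContentHandler handler, String id) throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "class", "class", "CDATA", "embedded");
        attrs.addAttribute("", "id", "id", "CDATA", id);
        handler.startElement(XHTML, "div", "div", attrs);
        handler.endElement(XHTML, "div", "div");
    }

    // Renders the placeholder events as text so the result can be inspected.
    // The tiny handler below is only for demonstration; it does no escaping.
    static String render(String id) {
        StringBuilder out = new StringBuilder();
        ContentHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes a) {
                out.append('<').append(qName);
                for (int i = 0; i < a.getLength(); i++) {
                    out.append(' ').append(a.getQName(i))
                       .append("=\"").append(a.getValue(i)).append('"');
                }
                out.append('>');
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                out.append("</").append(qName).append('>');
            }
        };
        try {
            writePlaceholder(handler, id);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "_1234" is a made-up id standing in for the "_XXX" in the issue text.
        System.out.println(render("_1234"));
    }
}
```

The idea would be for the Excel parser to make a call like writePlaceholder at the position of the embedded object, mirroring what TIKA-956 describes for Word. Where exactly that call belongs in the Excel parsing code is not something this sketch tries to answer.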