[ https://issues.apache.org/jira/browse/TIKA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135103#comment-14135103 ]
Nick Burch commented on TIKA-1415:
----------------------------------

Your unit tests don't enable recursion, so I'm not sure they're ever going to work. See http://wiki.apache.org/tika/RecursiveMetadata or the Tika Examples for how to select which embedded documents get recursed into and processed.

There may be a second problem though - these Word documents don't seem to be placed directly into the PowerPoint file, but are additionally wrapped in an OLE10 Native wrapper first. This may cause issues, but we can look at that if the problem remains once you have enabled recursion / extraction correctly.

> PowerPoint2003 embedded with word. The embedded file can not be detected.
> -------------------------------------------------------------------------
>
> Key: TIKA-1415
> URL: https://issues.apache.org/jira/browse/TIKA-1415
> Project: Tika
> Issue Type: Bug
> Components: detector, parser
> Affects Versions: 1.5
> Environment: window7
> Reporter: sunxingzhe
> Labels: Tika, poi
> Attachments: PowerPointParserTest.java, word2003.ppt, word2007.ppt
>
>
> A Word 2003 or Word 2007 document is inserted into a PowerPoint 2003 file as an embedded file.
> The embedded file's type cannot be detected.
> The embedded file's content cannot be parsed.
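For reference, the recursion that the comment on this issue asks for is normally enabled by registering a Parser in the ParseContext, which is the approach the RecursiveMetadata wiki page describes. What follows is only a minimal sketch, assuming Tika 1.x with tika-parsers on the classpath. The names RecursiveExtractSketch, recursiveContext and extractText are made up for illustration and are not taken from the attached PowerPointParserTest.java.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class RecursiveExtractSketch {

    // Registering a Parser in the context is what allows embedded
    // documents to be handed back to a parser instead of being skipped.
    static ParseContext recursiveContext(Parser parser) {
        ParseContext context = new ParseContext();
        context.set(Parser.class, parser);
        return context;
    }

    // Parses the stream and returns the body text collected by the handler,
    // including any text produced by parsing embedded documents.
    static String extractText(InputStream stream)
            throws IOException, SAXException, TikaException {
        Parser parser = new AutoDetectParser();
        // -1 disables the handler's default write limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        parser.parse(stream, handler, new Metadata(), recursiveContext(parser));
        return handler.toString();
    }
}
```

With this in place, calling extractText on a stream opened from word2003.ppt would be expected to include text from the embedded Word document, unless the OLE10 Native wrapping mentioned in the comment turns out to be a separate problem.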