[ 
https://issues.apache.org/jira/browse/TIKA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135103#comment-14135103
 ] 

Nick Burch commented on TIKA-1415:
----------------------------------

Your unit tests don't enable recursion, so I'm not sure they're ever going to 
work. See http://wiki.apache.org/tika/RecursiveMetadata or the Tika Examples 
for how to select which embedded documents get recursed into and processed.
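
As a rough illustration of why the tests see nothing without recursion, here 
is a toy model in plain Java. It is not Tika's code: every Toy* name below is 
made up for this sketch. It only mirrors the general idea that a container 
parser looks in its context for a parser to hand embedded documents to, and 
skips them when none is registered. In Tika itself the corresponding step is, 
as far as I know, registering the parser on the ParseContext with 
context.set(Parser.class, parser) before calling parse.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RecursionSketch {

    /** Toy stand-in for a parser: appends the text it extracts to 'out'. */
    interface ToyParser {
        void parse(ToyDocument doc, List<String> out, ToyContext context);
    }

    /** Toy stand-in for a class-keyed parse context (hypothetical, not Tika's class). */
    static class ToyContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Returns null when nothing was registered under this key.
            return key.cast(entries.get(key));
        }
    }

    /** A document with its own text plus zero or more embedded documents. */
    static class ToyDocument {
        final String text;
        final List<ToyDocument> embedded = new ArrayList<>();

        ToyDocument(String text) {
            this.text = text;
        }

        ToyDocument embed(ToyDocument child) {
            embedded.add(child);
            return this;
        }
    }

    /** Container parser: recurses only if the context names a parser for embedded documents. */
    static class ToyContainerParser implements ToyParser {
        @Override
        public void parse(ToyDocument doc, List<String> out, ToyContext context) {
            out.add(doc.text);
            ToyParser embeddedParser = context.get(ToyParser.class);
            if (embeddedParser == null) {
                return; // recursion not enabled: embedded documents are skipped
            }
            for (ToyDocument child : doc.embedded) {
                embeddedParser.parse(child, out, context);
            }
        }
    }

    /** Parses 'doc', registering the parser for embedded documents only when 'recurse' is true. */
    static List<String> extract(ToyDocument doc, boolean recurse) {
        ToyParser parser = new ToyContainerParser();
        ToyContext context = new ToyContext();
        if (recurse) {
            context.set(ToyParser.class, parser);
        }
        List<String> out = new ArrayList<>();
        parser.parse(doc, out, context);
        return out;
    }

    public static void main(String[] args) {
        ToyDocument ppt = new ToyDocument("slide text")
                .embed(new ToyDocument("embedded Word text"));
        System.out.println(extract(ppt, false));
        System.out.println(extract(ppt, true));
    }
}
```

With recurse set to false only the slide text comes back. With it set to true 
the embedded document's text is extracted as well, which is the behaviour the 
unit tests would need in order to see the embedded Word file at all.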

There may be a second problem though: these Word documents don't seem to be 
placed directly into the PowerPoint file, but are additionally wrapped in an 
OLE10 Native wrapper first. This may cause issues, but we can look at that if 
the problem remains once you enable recursion / extraction correctly.

> PowerPoint2003 embedded with word. The embedded file can not be detected.
> -------------------------------------------------------------------------
>
>                 Key: TIKA-1415
>                 URL: https://issues.apache.org/jira/browse/TIKA-1415
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>    Affects Versions: 1.5
>         Environment: window7
>            Reporter: sunxingzhe
>              Labels: Tika, poi
>         Attachments: PowerPointParserTest.java, word2003.ppt, word2007.ppt
>
>
> Word 2003 or Word 2007 documents are inserted into PowerPoint 2003 as embedded files.
> The embedded file's type cannot be detected.
> The embedded file's content cannot be parsed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
