[ https://issues.apache.org/jira/browse/TIKA-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-1297.
-------------------------------
    Resolution: Fixed
    Fix Version/s: 1.6

> Images not being extracted from PDFs
> ------------------------------------
>
>                 Key: TIKA-1297
>                 URL: https://issues.apache.org/jira/browse/TIKA-1297
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>            Reporter: James Baker
>             Fix For: 1.6
>
>
> Images embedded within PDF documents are not being extracted by Tika. I have
> tested this via the command line (where the -z option fails to extract any
> images), and by inspecting the XHTML version of the PDF produced by Tika
> (where the image tags are not included in the output).
> The images are extractable by PDFBox, so Tika should be able to extract them
> and include them in the XHTML output.