[ https://issues.apache.org/jira/browse/TIKA-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133795#comment-14133795 ]
James Baker commented on TIKA-1396:
-----------------------------------

I don't believe this issue has been fully fixed. Since upgrading to Tika 1.6, I am still unable to extract images from PDF files. I've set the PDF Parser Config, but still no images are present.

Could someone please provide a full example of how to do image extraction from PDFs with Tika 1.6 so I can check that it isn't my code?

I have also tried Tika App with --extract, which was likewise unable to detect the images.

Related SO discussion: http://stackoverflow.com/questions/25783212/extract-images-from-pdf-with-apache-tika

> Embedded images in PDF documents
> --------------------------------
>
>                 Key: TIKA-1396
>                 URL: https://issues.apache.org/jira/browse/TIKA-1396
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>         Environment: *OS:*
> Ubuntu 14.04.1 LTS
> *KERNEL:*
> 3.13.0-33-generic
> gcc version 4.8.2
> *JAVA:*
> java version "1.8.0_11"
> Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)
>            Reporter: Damiano
>            Priority: Critical
>             Fix For: 1.6
>
>
> Hello!
> I just found a problem with PDF documents that have embedded images.
> Doing:
> java -jar tika-app-1.5.jar --extract tika.pdf
> Tika cannot find the image.
> Is this a PDF-related problem? Because if I do the same operation with a DOC
> document, Tika finds the image correctly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)