[ https://issues.apache.org/jira/browse/TIKA-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146558#comment-14146558 ]
James Baker commented on TIKA-1396:
-----------------------------------

That will affect my processing, yes. My use case is splitting a document into separate documents based on a delimiter in the text. If we don't know where the image is on the page, we don't know which document it belongs in! Any ideas how that could be worked around?

> Embedded images in PDF documents
> --------------------------------
>
>                 Key: TIKA-1396
>                 URL: https://issues.apache.org/jira/browse/TIKA-1396
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>         Environment: *OS:*
> Ubuntu 14.04.1 LTS
> *KERNEL:*
> 3.13.0-33-generic
> gcc version 4.8.2
> *JAVA:*
> java version "1.8.0_11"
> Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)
>            Reporter: Damiano
>            Priority: Critical
>             Fix For: 1.6
>
>         Attachments: tika_images.pdf
>
>
> Hello!
> I just found a problem with PDF documents that have embedded images.
> Doing:
> java -jar tika-app-1.5.jar --extract tika.pdf
> Tika cannot find the image.
> Is this a PDF-related problem? Because if I do the same operation with a DOC
> document, Tika finds the image correctly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)