[ 
https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225283#comment-14225283
 ] 

Tilman Hausherr commented on TIKA-1442:
---------------------------------------

[~talli...@apache.org] I'm really wondering why you'd get any extracted text 
from e.g. 717226.pdf, because it has no extract permission. The permissions in 
PDF files are enforced only by the application (here, PDFBox); the text 
information isn't stored separately in encrypted form.
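
To illustrate why enforcement is left to the application: in the standard 
security handler, the permissions are a 32-bit integer (the /P entry of the 
encryption dictionary), and bit 5 (1-based) is the "copy or extract text and 
graphics" flag. The following is a minimal sketch of how a reader might test 
that flag. The class and method names are hypothetical and are not PDFBox API, 
and the /P values are made up for illustration.

```java
class PdfPermissionBits {
    // Bit positions are 1-based in the PDF specification;
    // bit 5 is "copy or otherwise extract text and graphics".
    static final int EXTRACT_BIT = 5;

    // Returns true if the given 1-based bit is set in the /P value.
    static boolean isBitSet(int permissions, int bit) {
        return (permissions & (1 << (bit - 1))) != 0;
    }

    // A well-behaved reader calls something like this before extracting
    // text; nothing in the file itself prevents a reader from skipping it.
    static boolean canExtractContent(int permissions) {
        return isBitSet(permissions, EXTRACT_BIT);
    }

    public static void main(String[] args) {
        // Hypothetical /P value: every permission bit set, bits 1-2 zero.
        int allAllowed = -4;
        // Same value with only the extract bit cleared.
        int noExtract = allAllowed & ~(1 << (EXTRACT_BIT - 1));

        System.out.println(canExtractContent(allAllowed));
        System.out.println(canExtractContent(noExtract));
    }
}
```

Once the document has been decrypted, the page content is readable whatever 
this flag says, so honoring it is a decision the extracting code makes.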

> Upgrade to PDFBox 1.8.8
> -----------------------
>
>                 Key: TIKA-1442
>                 URL: https://issues.apache.org/jira/browse/TIKA-1442
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>             Fix For: 1.8
>
>         Attachments: PDFBox_1_8_6VPDFBox_1_8_8-b145.xlsx, 
> PDFBox_1_8_6VPDFBox_1_8_8-b145.zip, 
> PDFBox_1_8_8-ClassicVPDFBox_1_8_8-NonSeq.xlsx, 
> pdfbox_1_8_6V1_8_8-SNAPSHOT.xlsx, pdfbox_1_8_6V1_8_8-SNAPSHOTb.xlsx, 
> pdfbox_1_8_6V1_8_8-SNAPSHOTc.xlsx, pdfbox_1_8_6V1_8_8-SNAPSHOTc.zip
>
>
> Given the regressions we identified in PDFBox 1.8.7, we should upgrade to 
> 1.8.8 as soon as it is ready.  I'm tempted to call this a blocker on Tika 
> 1.7.  Let's use this issue to carry on the discussion of regression testing 
> (if any further discussion is necessary) or any other prep that needs to 
> happen before 1.8.8's release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
