[ 
https://issues.apache.org/jira/browse/PDFBOX-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720223#comment-17720223
 ] 

Andreas Lehmkühler commented on PDFBOX-5595:
--------------------------------------------

[~tilman] Thanks for double-checking.

> Slight regression on corrupt bug tracker file
> ---------------------------------------------
>
>                 Key: PDFBOX-5595
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5595
>             Project: PDFBox
>          Issue Type: Task
>          Components: Parsing
>    Affects Versions: 2.0.28, 3.0.0 PDFBox
>            Reporter: Tim Allison
>            Assignee: Andreas Lehmkühler
>            Priority: Trivial
>             Fix For: 2.0.29, 3.0.0 PDFBox
>
>
> I'm not sure this is a regression, and apologies if you already dealt with 
> this before the release of 2.0.28.  Also, as a warning, this file is corrupt.
>  
> We used to get more text out of this file in 2.0.27 than we do now in 2.0.28: 
> [https://corpora.tika.apache.org/base/docs/bug_trackers/evince/evince-395-0.zip-0.pdf]
>  
> This file was derived from the evince bug tracker, which now ultimately links to 
> this issue:
> [https://gitlab.freedesktop.org/poppler/poppler/-/issues/323]
>  
> This image from the poppler issue shows what we get with PDFBox 2.0.28 on the 
> left, and 2.0.27 on the right.
>  
> If the decision is "the file is corrupt -> not going to fix", I completely 
> understand.
> !https://gitlab.gnome.org/GNOME/evince/uploads/0bc2302dbafc0bbc2110f0d42951428e/evince.JPG!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

