[ https://issues.apache.org/jira/browse/TIKA-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871744#comment-17871744 ]
Hudson commented on TIKA-4296:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1734 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1734/])
TIKA-4296: revert last commit (tilman: [https://github.com/apache/tika/commit/1f9707a994bff8551b030770a7145899bcbf6948])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java


> "Parameter must be 1-based, but is -1" when using Tika with PDFBox 2.0.32
> -------------------------------------------------------------------------
>
>                 Key: TIKA-4296
>                 URL: https://issues.apache.org/jira/browse/TIKA-4296
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.9.2
>            Reporter: Thomas Mortagne
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 3.0.0, 2.9.3
>
>         Attachments: pdf.pdf
>
>
> I just upgraded my pdfbox dependency to 2.0.32 and any Tika#parseToString of
> a pdf file seems to produce the following warning:
> {noformat}
> WARN o.apache.pdfbox.text.PDFTextStripper - Parameter must be 1-based, but is -1
> {noformat}
> The behavior is the same as with 2.0.31, it's just that pdfbox is apparently
> not too happy anymore with the way it's used by Tika.
> This new warning was apparently introduced by PDFBOX-5822.
> Just in case it's not actually any file, here is one with which I reproduce:
> [^pdf.pdf]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)