[ https://issues.apache.org/jira/browse/PDFBOX-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-5879.
-------------------------------------
    Fix Version/s: 2.0.33
                   3.0.4 PDFBox
                   4.0.0
         Assignee: Tilman Hausherr
       Resolution: Fixed

Thank you. It's not the commit, it's poor programming that got exposed because of the commit.

> Regression from PDFBOX-5841: Text extraction with rotation magic fails for PDF with multiple content streams in a page
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5879
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5879
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.32, 3.0.3 PDFBox
>            Reporter: Gábor Stefanik
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.33, 3.0.4 PDFBox, 4.0.0
>
>         Attachments: MVM_Aram_augusztus.pdf
>
>
> {code:java}
> java -jar pdfbox-app-3.0.3.jar export:text -console -rotationMagic -i="MVM_Aram_augusztus.pdf" {code}
> fails with the following error:
> {code:java}
> java.lang.ClassCastException: class org.apache.pdfbox.cos.COSObject cannot be cast to class org.apache.pdfbox.cos.COSArray (org.apache.pdfbox.cos.COSObject and org.apache.pdfbox.cos.COSArray are in unnamed module of loader 'app')
>       at org.apache.pdfbox.tools.ExtractText.extractPages(ExtractText.java:336)
>       at org.apache.pdfbox.tools.ExtractText.call(ExtractText.java:225)
>       at org.apache.pdfbox.tools.ExtractText.call(ExtractText.java:62)
>       at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
>       at picocli.CommandLine.access$1500(CommandLine.java:148)
>       at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
>       at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
>       at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
>       at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
>       at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
>       at picocli.CommandLine.execute(CommandLine.java:2174)
>       at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:76) {code}
> The same command succeeds in 3.0.2.
> The triggering PDF can be downloaded from [https://nagykorosiallatmentok.hu/wp-content/uploads/2023/09/MVM_Aram_augusztus.pdf], and is also attached.
> The root cause appears to be this change: [https://github.com/apache/pdfbox/commit/b03d12d56dd74e5c52d80cf0b80c5bfb1f3209b2] from PDFBOX-5841
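
For readers unfamiliar with the failure mode in the stack trace: a page's /Contents entry may be a single stream or an array of streams, and either may be reached through an indirect reference. Casting the raw entry straight to an array type then fails in the way the exception describes. The sketch below illustrates the general idea of resolving the reference before checking the type. The types Base, Stream, Array and Ref and the method names are hypothetical stand-ins invented for this illustration; they are not the PDFBox COS classes, and this is not the change that was committed for PDFBOX-5879.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a PDF object model; NOT the PDFBox COS classes.
public class IndirectContentsSketch {
    interface Base {}

    // A content stream, identified here only by a label.
    static final class Stream implements Base {
        final String label;
        Stream(String label) { this.label = label; }
    }

    // An array of objects, each of which may itself be an indirect reference.
    static final class Array implements Base {
        final List<Base> items = new ArrayList<>();
    }

    // An indirect reference wrapping the object it points to.
    static final class Ref implements Base {
        final Base target;
        Ref(Base target) { this.target = target; }
    }

    // Follows indirect references until a direct object (or null) is reached.
    static Base resolve(Base object) {
        while (object instanceof Ref) {
            object = ((Ref) object).target;
        }
        return object;
    }

    // Collects the streams of a contents entry that may be a stream,
    // an array of streams, or a reference to either.
    static List<Stream> contentStreams(Base contents) {
        List<Stream> result = new ArrayList<>();
        Base direct = resolve(contents);
        if (direct instanceof Stream) {
            result.add((Stream) direct);
        } else if (direct instanceof Array) {
            for (Base item : ((Array) direct).items) {
                Base element = resolve(item);
                if (element instanceof Stream) {
                    result.add((Stream) element);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Array array = new Array();
        array.items.add(new Ref(new Stream("first")));
        array.items.add(new Ref(new Stream("second")));
        Base contents = new Ref(array);
        // An unchecked cast such as (Array) contents would throw
        // ClassCastException here, because contents is a Ref.
        System.out.println(contentStreams(contents).size()); // prints 2
    }
}
```

Checking the type only after resolution keeps the single-stream, array, and indirect cases in one code path, so no cast is attempted on an unresolved reference.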