Hi PDFBox team,

I'm trying to convert PDF pages to images so that I can run OCR on them. I've used PDFBox for splitting for many years and I'm happy with it. This year a PDF file with a problematic annotation/signature caused trouble (see the stack trace below; same effect on 2.0.26, 2.0.29, 3.0.0RC and also the new 3.0.0).
The effect of this problem is that the whole page comes back as null. Every other library or utility I've tested handles this error in a different way, but each of them at least converts the rest of the page to an image. Is there a way to get the same behaviour in PDFBox?

I've done a test hack to get past those annotations: in showAnnotation (PDFStreamEngine), the call to processAnnotation is caught and logged with a message instead of aborting. This works as a test, but since I don't know the theory behind COS streams I can't judge whether it could cause side effects, and I try to use only official libraries without customisations.

Thank you for any hint.

Best regards,
David

FATAL [Thread-20] (PdfBoxPage.java:78) - Error: Function must be a Dictionary, but is (null)
java.io.IOException: Error: Function must be a Dictionary, but is (null)
    at org.apache.pdfbox.pdmodel.common.function.PDFunction.create(PDFunction.java:130)
    at org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.<init>(PDSeparation.java:87)
    at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:192)
    at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:223)
    at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:193)
    at org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:56)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:892)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:530)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processAnnotation(PDFStreamEngine.java:352)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.showAnnotation(PDFStreamEngine.java:445)
    at org.apache.pdfbox.rendering.PageDrawer.showAnnotation(PageDrawer.java:1522)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:286)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:344)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:261)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:247)
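
To make the behaviour I'm asking for concrete, here is a minimal sketch of the pattern behind the test hack: each annotation is drawn inside its own try/catch, so a failure in one is logged and the rest of the page still renders. This is not PDFBox code. The class name, the Annotation interface and renderPage are hypothetical stand-ins that only illustrate the control flow, and a list of strings plays the role of the rendered image.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TolerantRenderSketch {

    /** Hypothetical stand-in for a drawable annotation; not a PDFBox type. */
    interface Annotation {
        void draw(List<String> canvas) throws IOException;
    }

    /**
     * Draws the page content, then each annotation in turn.
     * An IOException from one annotation is recorded as a warning
     * and the loop continues with the next annotation.
     */
    static List<String> renderPage(List<Annotation> annotations, List<String> warnings) {
        List<String> canvas = new ArrayList<>();
        canvas.add("page content");
        for (Annotation annotation : annotations) {
            try {
                annotation.draw(canvas);
            } catch (IOException e) {
                // Log and skip instead of letting the exception abort the page.
                warnings.add("Skipped annotation: " + e.getMessage());
            }
        }
        return canvas;
    }

    public static void main(String[] args) {
        List<Annotation> annotations = new ArrayList<>();
        annotations.add(c -> c.add("stamp"));
        annotations.add(c -> {
            // Simulates the broken Separation colour space from the stack trace.
            throw new IOException("Error: Function must be a Dictionary, but is (null)");
        });
        annotations.add(c -> c.add("link"));

        List<String> warnings = new ArrayList<>();
        List<String> canvas = renderPage(annotations, warnings);
        System.out.println("Rendered: " + canvas);
        System.out.println("Warnings: " + warnings);
    }
}
```

In this sketch the broken annotation is simply missing from the output while the page content and the other annotations survive, which is how the other tools I tested behave.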