Hi PDFBox team,

I’m trying to convert PDF pages to images in order to run OCR on them.
I’ve been using PDFBox for this splitting for many years and I’m happy with it.
This year a PDF file with a problematic annotation/signature caused 
problems (see the stack trace below; same effect on 2.0.26, 2.0.29, 3.0.0RC and 
also the new 3.0.0).

The effect of this problem is that rendering fails for the whole page and I get 
null instead of an image. Every other library or utility I’ve tested handles this 
error in a different way, but at least the rest of the page is converted to an image.
Is there a way to get the same behaviour in PDFBox?

As a test I hacked the code so that such annotations no longer abort rendering: 
in PDFStreamEngine.showAnnotation, the exception thrown by processAnnotation is 
caught and logged as a message instead of propagating. This works as a test, but 
since I don’t know the theory behind COS streams I can’t judge whether it could 
cause side effects, and I try to use only official libraries without customisations.
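
To make the idea behind the hack concrete, here is a minimal sketch in plain Java with no PDFBox types: each annotation is painted inside its own try/catch, so one broken annotation is recorded and skipped while the rest of the page is still drawn. The names LenientAnnotationLoop, AnnotationPainter and drawAll are hypothetical stand-ins for illustration only, not PDFBox API, and the sketch says nothing about whether skipping is safe for PDFBox’s internal graphics state.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LenientAnnotationLoop {

    /** Hypothetical stand-in for "draw one annotation"; not a PDFBox type. */
    interface AnnotationPainter {
        void paint() throws IOException;
    }

    /**
     * Paints every annotation in order. A failure in one annotation is
     * recorded and skipped instead of aborting the whole page.
     * Returns the messages of the exceptions that were swallowed.
     */
    static List<String> drawAll(List<AnnotationPainter> annotations) {
        List<String> skipped = new ArrayList<>();
        for (AnnotationPainter annotation : annotations) {
            try {
                annotation.paint();
            } catch (IOException e) {
                // Log and continue, mirroring "caught with a message instead of breaking".
                skipped.add(e.getMessage());
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> painted = new ArrayList<>();
        List<AnnotationPainter> annotations = new ArrayList<>();
        annotations.add(() -> painted.add("link"));
        annotations.add(() -> {
            throw new IOException("Error: Function must be a Dictionary, but is (null)");
        });
        annotations.add(() -> painted.add("stamp"));

        List<String> skipped = drawAll(annotations);
        System.out.println("painted=" + painted + " skipped=" + skipped);
    }
}
```

In this model the broken second annotation ends up in the skipped list while the first and third are still painted, which is the behaviour I would like for the page image.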

Thank you for any hints,
                          Best regards, David

FATAL [Thread-20] (PdfBoxPage.java:78) - Error: Function must be a Dictionary, but is (null)
java.io.IOException: Error: Function must be a Dictionary, but is (null)
        at org.apache.pdfbox.pdmodel.common.function.PDFunction.create(PDFunction.java:130)
        at org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.<init>(PDSeparation.java:87)
        at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:192)
        at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:223)
        at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:193)
        at org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:56)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:892)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:530)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processAnnotation(PDFStreamEngine.java:352)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.showAnnotation(PDFStreamEngine.java:445)
        at org.apache.pdfbox.rendering.PageDrawer.showAnnotation(PageDrawer.java:1522)
        at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:286)
        at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:344)
        at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:261)
        at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:247)
