Willy T. Koch created TIKA-4756:
-----------------------------------

             Summary: Detecting Signatures in PDFs with AcroForm
                 Key: TIKA-4756
                 URL: https://issues.apache.org/jira/browse/TIKA-4756
             Project: Tika
          Issue Type: Improvement
          Components: metadata
            Reporter: Willy T. Koch
         Attachments: sigflags_sample.pdf

We see that PDFs whose AcroForm contains a signature (/Sig) field are not 
reported as signed by the /meta analysis. It detects the AcroForm with 
"pdf:hasAcroFormFields": "true", but returns nothing about the /Sig part. The 
files are created directly in Adobe Acrobat, which is also possible in the free 
version.

It would be very useful to also return "hasSignature": "true" (or some other 
signature: property) for these kinds of files, so we can handle it on our end. 
We use this to exclude PDFs with digital signatures from being reconverted to 
PDF/A.
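
To illustrate the kind of check we mean, here is a rough sketch of a 
byte-level heuristic in Python. It is not how Tika or PDFBox works, and the 
function name and returned keys are made up for this example. It assumes the 
AcroForm and field dictionaries are stored uncompressed, so it will miss 
fields held in compressed object streams, and it assumes /SigFlags is a direct 
integer. In the PDF spec, /SigFlags bit 1 means SignaturesExist and bit 2 
means AppendOnly. A /Sig field can also exist without having been signed yet.

```python
import re

# Heuristic markers only. The names below are illustrative, not a Tika API.
_SIG_FIELD = re.compile(rb"/FT\s*/Sig\b")        # field type "signature"
_SIG_FLAGS = re.compile(rb"/SigFlags\s+(\d+)")   # AcroForm signature flags


def scan_signature_markers(pdf_bytes: bytes) -> dict:
    """Look for signature markers in the raw, uncompressed bytes of a PDF."""
    flags = _SIG_FLAGS.search(pdf_bytes)
    return {
        "hasSignatureField": _SIG_FIELD.search(pdf_bytes) is not None,
        "sigFlags": int(flags.group(1)) if flags else None,
    }
```

A caller could read the file with open(path, "rb").read() and skip the PDF/A 
conversion whenever hasSignatureField is true or sigFlags has bit 1 set.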

 

When I run the file through OCRmyPDF, it flags it as digitally signed and 
exits, which is how I first noticed the issue.

_ocrmypdf sigflags_sample.pdf sigflags_sample_ocrmypdf.pdf_

_DigitalSignatureError: Input PDF has a digital signature. OCR would alter the_
_document, invalidating the signature._

 

I've attached a small sample PDF (sigflags_sample.pdf) with an AcroForm and a 
signature field to reproduce the issue.

 

Willy T. Koch

Technical Product manager,

Public 360°

Norway



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
