[ 
https://issues.apache.org/jira/browse/TIKA-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18088454#comment-18088454
 ] 

Willy T. Koch commented on TIKA-4756:
-------------------------------------

Thanks for the insight!

I tested PDFBox and got the navigator to open the file, but an OS font issue 
made it unusable. Instead I used Claude Code, which did the same analysis with 
a Python script it wrote.

It confirms what you both said: all signature fields are empty, so in that 
sense Tika is correct to report hasSignature = false.

A useful addition to flag these signature fields could be a property such as 
signature:containsFields:true, or something similar.
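
A minimal sketch of how the two flags could relate, assuming each AcroForm 
field is available as a plain mapping of its PDF keys. The function name 
signature_flags and the sample data are hypothetical, and 
signature:containsFields is only the name suggested above, not an existing 
Tika property.

```python
from typing import Any, Dict, Iterable, Mapping


def signature_flags(fields: Iterable[Mapping[str, Any]]) -> Dict[str, bool]:
    """Derive two flags from AcroForm field dictionaries.

    Each field is modelled as a plain mapping of PDF keys ("/FT", "/V").
    """
    sig_fields = [f for f in fields if f.get("/FT") == "/Sig"]
    return {
        # True only when at least one /Sig field carries a value.
        "hasSignature": any(f.get("/V") is not None for f in sig_fields),
        # Hypothetical property name from this comment, not a Tika key.
        "signature:containsFields": bool(sig_fields),
    }


# Stand-in for the sample PDF: seven /Sig fields, none with a /V entry.
sample = [{"/FT": "/Sig", "/T": f"field {i}"} for i in range(1, 8)]
flags = signature_flags(sample)
```

With this split, a file like the sample would keep hasSignature = false while 
still being flagged as containing signature fields.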

But as you've spent plenty of time, you can also decide to just close it and 
we'll add some custom code to flag these, as first suggested.
h3. /Sig Fields — 7 total, all unsigned
||#||Field name||Status||
|1|{{2 - E-Signature of applicant}}|Empty (not signed)|
|2|{{6 - 8 E-Signature Head of Training type rating or NPCT extension to SPO}}|Empty|
|3|{{9 - 10 E-signature}}|Empty|
|4|{{12 - 3 E-Signature of applicant}}|Empty|
|5|{{13 - 3 E-signature of TRI}}|Empty|
|6|{{16 - 6 E-Signature}}|Empty|
|7|{{17 - 3 E-signature}}|Empty|

*None of the signature fields have been signed.* Each {{/V}} entry is absent: 
the fields are present in the AcroForm structure but contain no cryptographic 
signature data (no {{/Filter}}, {{/SubFilter}}, {{/ByteRange}}, 
{{/Contents}}, etc.).
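
The check described above can be sketched roughly as follows. This is not the 
script that was actually run: classify_sig_field is a hypothetical helper, 
fields are modelled as plain mappings, and treating a field as signed only 
when {{/V}} holds both {{/Contents}} and {{/ByteRange}} is a simplifying 
assumption.

```python
from typing import Any, Mapping


def classify_sig_field(field: Mapping[str, Any]) -> str:
    """Return 'not a signature field', 'empty' or 'signed' for one field."""
    if field.get("/FT") != "/Sig":
        return "not a signature field"
    value = field.get("/V")
    if not isinstance(value, Mapping):
        # No /V entry: the field exists but was never signed.
        return "empty"
    # Assumption: count the field as signed only if signature data is present.
    if value.get("/Contents") and value.get("/ByteRange"):
        return "signed"
    return "empty"


# A field shaped like the ones in the table above: /FT=/Sig and no /V.
status = classify_sig_field({"/FT": "/Sig", "/T": "9 - 10 E-signature"})
```

With a PDF library that exposes fields as dict-like objects, the same function 
could be applied to each field in turn.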

The blue signature shown in the signature field on page 4 is just an inserted 
image.

*{{9 - 10 E-signature}} ({{/FT=/Sig}}, {{/V=None}})*: this is the 
designated cryptographic e-signature field for the examiner, and it is 
*empty/unsigned*, exactly like all the other {{/Sig}} fields.

In short: *yes, it's just an embedded image* — a picture of a handwritten 
signature baked into the page content, with no cryptographic binding 
whatsoever. The actual {{/Sig}} field {{9 - 10 E-signature}} sitting right next 
to it in the same row is still unsigned. The image carries zero legal/technical 
signature value from a PDF signature standpoint.

> Detecting Signatures in PDFs with AcroForm
> ------------------------------------------
>
>                 Key: TIKA-4756
>                 URL: https://issues.apache.org/jira/browse/TIKA-4756
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Willy T. Koch
>            Priority: Minor
>              Labels: Signature
>         Attachments: image-2026-06-11-18-05-01-275.png, sigflags_sample.pdf, 
> signature.png
>
>
> We see that PDFs whose AcroForm contains signature (/Sig) fields 
> aren't detected by the /meta analysis. It detects the AcroForm with 
> "pdf:hasAcroFormFields": "true", but reports nothing on the /Sig part. The 
> fields are created directly in Adobe Acrobat, which is also possible in the 
> free version.
> It would be very useful to also return "hasSignature": "true" (or some 
> other signature: property) for these kinds of files, so we can handle it on 
> our end. We use this to exclude PDFs with digital signatures from being 
> reconverted to PDF/A.
>  
> When I run it through OCRmyPDF, it flags the file as digitally signed and 
> exits, which is how I first noticed it.
> _ocrmypdf sigflags_sample.pdf sigflags_sample_ocrmypdf.pdf_
> _DigitalSignatureError: Input PDF has a digital signature. OCR would alter 
> the document,_
> _invalidating the signature._
>  
> I've attached a small sample PDF with AcroForm and Signature to reproduce the 
> issue.
>  
> Willy T. Koch
> Technical Product manager,
> Public 360°
> Norway



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
