Hi Tilman, thank you for your answer. The PDF is a real document, so I can't share it, but I can give you an extract:
These are the first 1044 bytes of the document.

--------------------------------------------------------------
*PK ¹Js: ¼àð3£ 3£ < CAACT-00-00-08 document.pdf*%PDF-1.6
%âãÏÓ
3582 0 obj
<</Linearized 1/L 697139/O 3585/E 118808/N 42/T 625450/H [ 1000 1986]>>
endobj
xref
3582 34
0000000016 00000 n
0000003154 00000 n
0000003481 00000 n
0000003680 00000 n
0000004019 00000 n
0000004048 00000 n
0000004265 00000 n
0000004495 00000 n
0000004765 00000 n
0000004950 00000 n
0000006189 00000 n
0000007372 00000 n
0000007629 00000 n
0000060752 00000 n
0000061525 00000 n
0000062245 00000 n
0000062284 00000 n
0000062509 00000 n
0000062740 00000 n
0000062819 00000 n
0000064540 00000 n
0000064945 00000 n
0000065082 00000 n
0000065306 00000 n
0000065606 00000 n
0000072471 00000 n
0000075166 00000 n
0000078960 00000 n
0000079194 00000 n
0000079411 00000 n
0000118645 00000 n
0000118722 00000 n
0000002986 00000 n
0000001000 00000 n
trailer
<</Size 3616/Prev 625437/XRefStm 2986/Root 3583 0 R/Info 3580 0 R/ID[<A71F76F2A24FB6D888EDCB04CB86B815><6CCE97BD63E74F479ED22F39881647F0>]>>
startxref
0
%%EOF
.....
--------------------------------------------------------------

I would like to draw your attention to the first 60 bytes. Those bytes are stripped out by the *COSParser* parser, which skips them as garbage.

The method that skips those bytes is:

COSParser.parseHeader(PDF_HEADER, PDF_DEFAULT_VERSION)
....
private static final String PDF_HEADER = "%PDF-";

I've noticed that I must also manually skip those 60 bytes from the *pdfInputStream* before calling the method

signature.getSignedContent(*pdfInputStream*)

That way, the hash computed over the returned byte array and the hash inside the signature match.

Andrea

On Wed, Jun 8, 2016 at 6:06 PM, Tilman Hausherr <[email protected]> wrote:

> On 08.06.2016 at 13:27, Andrea Canu wrote:
>
>> Hi guys
>>
>> I want to ask you about the correct way to get the signed-content from the
>> signature.
>> Until now I've used the PDSignature class's method:
>>
>> signature.getSignedContent(*pdfInputStream*)
>>
>> With this method I'm able to extract from the *pdfInputStream* the
>> byte-array of the signed-content based on the signature's ByteRange.
>>
>> I've noticed that if I try to verify the signature based on that
>> byte-array, the verification sometimes unexpectedly fails!
>>
>
> Hello Andrea,
>
> Can you share the PDF (upload it)?
>
> I doubt your theory re: bug in COSParser. I'd rather search if there is a
> bug in COSFilterInputStream.
>
> If you can't share the PDF, then please download the bytes "the hard way":
>
>     // download the signed content, described in the /ByteRange COSArray:
>     // [offset1 len1 offset2 len2]
>     int[] byteRange = sig.getByteRange();
>     byte[] buf = new byte[byteRange[1] + byteRange[3]];
>     RandomAccessFile raf = new RandomAccessFile(infile, "r");
>     // first range: file offset byteRange[0], copied to the start of buf
>     raf.seek(byteRange[0]);
>     raf.readFully(buf, 0, byteRange[1]);
>     // second range: appended right after the first one in buf
>     raf.seek(byteRange[2]);
>     raf.readFully(buf, byteRange[1], byteRange[3]);
>     raf.close();
>
> This code is not fully correct, because /ByteRange might have more than 4
> elements. So have a look at it to be sure.
>
> Then compare the byte array "buf" with the one from getSignedContent.
>
> Another possibility that it fails might be that there are different
> signature methods. See the code at
>
> https://svn.apache.org/viewvc/pdfbox/branches/2.0/examples/src/main/java/org/apache/pdfbox/examples/signature/ShowSignature.java?view=markup
>
> I didn't use getSignedContent() there but I think I should. So I'd be very
> interested to find out if there is a bug there.
>
> Tilman
>
>
>> Now, looking at the COSParser class I've found this method:
>>
>> COSParser.parseHeader
>>
>>
>> This method, trying to find the correct document's header, is able to skip
>> some garbage in the PDF document looking for the markers "%PDF-" and
>> "%FDF-".
>>
>> So, I've noticed that the signature verification succeeds if I skip that
>> garbage during the signed-content extraction.
>>
>> My question is:
>> Why is this garbage handling not also present in the getSignedContent
>> code?
>>
>> The workaround I found is to skip that garbage manually from the
>> *pdfInputStream*, but now the problem is the correct way to calculate the
>> offset for the *pdfInputStream*.
>>
>> Any suggestion?
>>
>> Kind regards
>> Andrea.
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
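
A minimal sketch of the workaround described at the top of this message, in plain Java without PDFBox. It looks for the first "%PDF-" marker to get the number of leading bytes to skip, then concatenates the /ByteRange pairs relative to that position, accepting any even number of elements rather than exactly four. The class and method names (SignedContentSketch, findHeaderOffset, readSignedContent) are hypothetical and not part of PDFBox, and treating the ByteRange offsets as relative to the "%PDF-" marker is an assumption based on what was observed with this one file, not a general rule.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of PDFBox.
final class SignedContentSketch {

    private static final byte[] PDF_HEADER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns the index of the first "%PDF-" marker, or -1 if none is found. */
    static int findHeaderOffset(byte[] data) {
        for (int i = 0; i + PDF_HEADER.length <= data.length; i++) {
            int j = 0;
            while (j < PDF_HEADER.length && data[i + j] == PDF_HEADER[j]) {
                j++;
            }
            if (j == PDF_HEADER.length) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Concatenates the (offset, length) pairs in byteRange. Each offset is
     * shifted by headerOffset (assumption: offsets count from "%PDF-").
     */
    static byte[] readSignedContent(byte[] file, int headerOffset, int[] byteRange) {
        if (byteRange.length % 2 != 0) {
            throw new IllegalArgumentException("ByteRange must hold (offset, length) pairs");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < byteRange.length; i += 2) {
            out.write(file, headerOffset + byteRange[i], byteRange[i + 1]);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Toy input: 4 junk bytes, then a fake "PDF" with a signature hole.
        byte[] file = "JUNK%PDF-1.6 AAAA<sig>BBBB".getBytes(StandardCharsets.US_ASCII);
        int offset = findHeaderOffset(file);
        byte[] content = readSignedContent(file, offset, new int[] {0, 13, 18, 4});
        System.out.println(offset + " " + new String(content, StandardCharsets.US_ASCII));
    }
}
```

With the real document, the value returned by findHeaderOffset would be the number of bytes to skip from the *pdfInputStream* before calling getSignedContent, instead of a hard-coded 60.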

