On 7/19/22 2:33 AM, Aki Tuomi wrote:
Jul 18 21:28:23 mx-test tika[18970]: DEBUG [qtp977522995-24] 21:28:23,264 
org.apache.tika.parser.pdf.PDFParser File: 
/tmp/apache-tika-9115808773791090696.tmp, length: 104932, md5: 
092bf24b2cac33fac27965549c99613a

You can check whether this matches your PDF file. After that, though, it 
complains that the PDF is corrupted. So I think the first step would be to 
verify that the length and MD5 sum match your input data.
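
A minimal sketch of that check, assuming the original attachment has been saved to disk. The function names and the example path are hypothetical; the expected length and MD5 are copied from the Tika log line above.

```python
import hashlib
import os

# Values reported by org.apache.tika.parser.pdf.PDFParser in the log above.
TIKA_LENGTH = 104932
TIKA_MD5 = "092bf24b2cac33fac27965549c99613a"


def file_length_and_md5(path, chunk_size=65536):
    """Return (size in bytes, hex MD5 digest) of the file at `path`."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large attachments are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()


def matches_tika_log(path):
    """True if the file has the same length and MD5 that Tika logged."""
    return file_length_and_md5(path) == (TIKA_LENGTH, TIKA_MD5)
```

Calling `matches_tika_log("/path/to/original.pdf")` (hypothetical path) returning False would suggest that Tika received something other than the original file bytes.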

working on it.

managed to run a verbose/DEBUG tika instance under jdb, at receipt of the 
submit from dovecot

        https://lists.apache.org/thread/pwoc3f4o3gh51y3jhz2x44g4mn51wbbj

but, as yet, have not successfully captured the file at the PDFParser breakpoint

question -- what is *intended* for dovecot fts-tika to submit to the tika 
backend?  'should' it be submitting the received email's complete/unmodified 
attachment? or some modification of it?
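
One way to narrow this down empirically, sketched under the assumption that the raw message is available as bytes (for example read from a maildir file): decode each attachment with Python's standard `email` package and print its length and MD5, then compare against the values in the Tika log. The function name is hypothetical, and this says nothing about what fts-tika is intended to send; it only shows what the unmodified, transfer-decoded attachment looks like.

```python
import hashlib
from email import message_from_bytes, policy


def attachment_digests(raw_message: bytes):
    """Return [(filename, length, md5 hex)] for each attachment in a raw message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    results = []
    for part in msg.walk():
        # Only consider parts explicitly marked as attachments.
        if part.get_content_disposition() != "attachment":
            continue
        # decode=True undoes the Content-Transfer-Encoding (e.g. base64);
        # it returns None for multipart containers, hence the fallback.
        payload = part.get_payload(decode=True) or b""
        results.append(
            (part.get_filename(), len(payload), hashlib.md5(payload).hexdigest())
        )
    return results
```

If none of the decoded attachments matches the length and MD5 that Tika logged, then what reaches the backend differs from the decoded attachment in some way.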
