RE: extract text from a password protected PDF

2023-10-03 Thread David Francescato
Tesseract/Tess4J is a good OCR combo; Tess4J uses PDFBox to convert PDF pages to images.

-----Original Message-----
From: Tilman Hausherr
Sent: Tuesday, 3 October 2023 10:05
To: users@pdfbox.apache.org
Subject: Re: extract text from a password protected PDF

Well yes, OCR, obviously. You could also look

Re: extract text from a password protected PDF

2023-10-03 Thread Tilman Hausherr
Well yes, OCR, obviously. You could also look at the source code of ExtractText and decide how you want to handle the permissions 😂

Tilman

On 02.10.2023 19:37, Robert Rodini wrote:
Hi, I have had great success with PDFBOX Extract. That is until the supplier of the PDF decided to password
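
For anyone wondering what "handle the permissions" comes down to: a password-protected PDF carries a set of permission flags (the /P entry of the encryption dictionary), and one of those bits says whether copying or extracting text is allowed. The sketch below only decodes those bits as the PDF specification defines them. It is not the ExtractText source, and the class and method names are made up for illustration. As far as I know, PDFBox exposes the same information through AccessPermission (for example canExtractContent()), so in real code you would ask that class instead of decoding the integer yourself.

```java
// Sketch only: decodes the user-access permission flags of an encrypted PDF
// (the /P entry of the encryption dictionary) following the bit layout in
// the PDF specification. PermissionBits and its methods are hypothetical
// names, not part of PDFBox.
final class PermissionBits {
    // Bit positions are 1-based, as in the PDF specification.
    static final int EXTRACT_BIT = 5;                    // copy or extract text and graphics
    static final int EXTRACT_FOR_ACCESSIBILITY_BIT = 10; // extract for accessibility tools

    private static boolean isSet(int flags, int bit) {
        return (flags & (1 << (bit - 1))) != 0;
    }

    static boolean canExtract(int flags) {
        return isSet(flags, EXTRACT_BIT);
    }

    static boolean canExtractForAccessibility(int flags) {
        return isSet(flags, EXTRACT_FOR_ACCESSIBILITY_BIT);
    }

    public static void main(String[] args) {
        // Assumed example value: a restrictive flag set with the extract bits cleared.
        int flags = -3904;
        System.out.println("extract allowed: " + canExtract(flags));
        System.out.println("accessibility extract allowed: "
                + canExtractForAccessibility(flags));
    }
}
```

What you do when the extract bit is cleared (refuse, warn, or fall back to OCR as suggested above) is the decision Tilman is pointing at; the sketch deliberately leaves that policy out.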