Well yes, OCR, obviously.
You could also look at the source code of ExtractText and decide how you
want to handle the permissions 😂
Tilman
On 02.10.2023 19:37, Robert Rodini wrote:
Hi,
I have had great success with PDFBox Extract. That is until the supplier of
the PDF decided to password
Tesseract/Tess4J is a good OCR combo; Tess4J uses PDFBox for pdf2imgs
-----Original Message-----
From: Tilman Hausherr
Sent: Tuesday, 3 October 2023 10:05
To: users@pdfbox.apache.org
Subject: Re: extract text from a password protected PDF
Well yes, OCR, obviously.
You could also look a
Hi, here is the repository with the test/reproduction code:
https://github.com/padisah/pdfboxtests
Here I am reproducing a character displacement problem: text that includes
a '-' sign is shifted out of position.
More cases, involving missing content, will be added.
On Tue, Sep 26, 2023 at 3:04 PM