Text sequence of ExtractText utility

2023-05-19 Thread Robert Rodini
Hi, I have successfully used the PDFBox ExtractText utility to process PDFs produced by a third party. The text comes out of a multicolumn PDF column by column, left to right, with each column read from top to bottom. I now have to process PDFs produced by another third party which also produces a multicolumn

Re: Text sequence of ExtractText utility

2023-05-23 Thread Robert Rodini
To: users@pdfbox.apache.org Subject: Re: Text sequence of ExtractText utility Hi, You can try the "-sort" option. Sometimes this helps. Tilman On 19.05.2023 15:17, Robert Rodini wrote: Hi, I have successfully used PDFBox ExtractText utility

extract text from a password protected PDF

2023-10-02 Thread Robert Rodini
Hi, I have had great success with PDFBox Extract. That is, until the supplier of the PDF decided to password-protect the file against extraction. Does anyone know of any tools or techniques (e.g. OCR) that might help me extract the text? Thanks, Bob R

split a password protected file

2024-03-20 Thread Robert Rodini
Can PDFSplit split up a password-protected file? It seems that it cannot, but there is no error message. P.S. I am using v. 2.x of PDFBox. I will upgrade soon. Bob Rodini

Re: split a password protected file

2024-03-21 Thread Robert Rodini
Does this mean that splitting a password protected PDF effectively disables password protection? From: Tilman Hausherr Sent: Wednesday, March 20, 2024 11:32 AM To: users@pdfbox.apache.org Subject: Re: split a password protected file On 20.03.2024 16:24, Robert

Re: split a password protected file

2024-03-21 Thread Robert Rodini
Actually, that behavior works fine for me. --Bob From: Tilman Hausherr Sent: Thursday, March 21, 2024 2:10 PM To: users@pdfbox.apache.org Subject: Re: split a password protected file On 21.03.2024 18:59, Robert Rodini wrote: > Does this mean that splittin

Thank you!

2024-12-25 Thread Robert Rodini
I want to thank Tilman for his dedication to the PDF Box community. Great work! Merry Christmas and Happy New Year, Bob Rodini PDF Box user

detection of column breaks and page breaks in PDF document

2025-05-23 Thread Robert Rodini
This question is informational. I use PDFBox utilities to extract text from a large PDF file. The pages of the PDF always use a three-column format. The PDFBox CLI utility is wonderful since it processes the columns from top to bottom and left to right. Is there a way to use Apache PDF Box t