PDF files are read by the PDFBox library. You may want to look into that area 
as well.
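
To see which layer the failure comes from, it can help to print the whole cause chain rather than only the top of the stack trace. The following is a minimal sketch using only the JDK; the class name CauseChain and the method describe() are hypothetical helpers, not part of tess4j, JNA, or PDFBox. The exception in main() is simulated to have the same shape as the one in the report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of tess4j or PDFBox) that flattens a cause chain. */
public class CauseChain {

    /** Returns one line per throwable in the chain, outermost first. */
    public static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<>();
        int depth = 0;
        // The depth limit guards against a cause chain that loops back on itself.
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            StackTraceElement[] stack = cur.getStackTrace();
            String where = stack.length > 0 ? stack[0].toString() : "(no stack frames)";
            lines.add(cur.getClass().getName() + ": " + cur.getMessage() + " at " + where);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Simulated failure only; a real program would pass the Throwable it caught.
        Throwable simulated = new ExceptionInInitializerError(
                new IllegalStateException("zip file closed"));
        describe(simulated).forEach(System.out::println);
    }
}
```

In the catch block around the OCR call, passing the caught Throwable to describe() would list every nested cause along with its first stack frame, which shows whether the root cause sits in your own code, tess4j, JNA, or the JDK.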

On Wednesday, September 28, 2022 at 10:52:15 PM UTC-5 Quan Nguyen wrote:

> The source of tess4j is available; you can trace through the code to see 
> what threw the exception.
>
> Nevertheless, "throwable while reading PDF" seems to point to the part of 
> the code that reads in the PDF file. Was that something you wrote, or is it 
> from tess4j itself?
>
> On Sunday, September 25, 2022 at 11:02:35 AM UTC-5 rcja...@gmail.com 
> wrote:
>
>> I'm using Tess4j in a Java program to access Tesseract and OCR PDFs that 
>> are read with PDFBox. I've been using Java 8, and things are running. The 
>> program is not commercial; I provide it to non-profits doing pro bono legal 
>> work in my state. In Java 8, using the command line and Eclipse, the program 
>> runs fine; running from the command line in either Java 11 or Java 17 
>> causes an error at the point where the program calls Tesseract.doOCR().
>>
>> I've dumped class loading information and see that the last class loaded 
>> before the fatal exception is com.sun.jna.Platform; it would be used, for 
>> instance, to determine the platform on which the program is running. I 
>> haven't been able to find the source for the 5.2 version I downloaded from 
>> UB Mannheim; that would be useful since the stack trace has line numbers.
>>
>> The following is a snippet showing log messages, System.out.println 
>> messages, stacktraces, and class loading messages near the point of failure:
>>
>> pdfRenderer created buffered Image
>> set a couple of tesseract vars
>> [14.960s][info][class,load] net.sourceforge.tess4j.util.ImageIOHelper source: rsrc:tess4j-5.4.0.jar
>> [14.961s][info][class,load] javax.imageio.IIOParam source: jrt:/java.desktop
>> [14.961s][info][class,load] javax.imageio.ImageWriteParam source: jrt:/java.desktop
>> [14.962s][info][class,load] com.github.jaiimageio.plugins.tiff.TIFFImageWriteParam source: rsrc:jai-imageio-core-1.4.0.jar
>> [14.963s][info][class,load] javax.imageio.IIOImage source: jrt:/java.desktop
>> [14.964s][info][class,load] com.sun.jna.Library source: rsrc:jna-5.12.1.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.ITessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.TessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.966s][info][class,load] net.sourceforge.tess4j.util.LoadLibs source: rsrc:tess4j-5.4.0.jar
>> [14.969s][info][class,load] com.sun.jna.Platform source: rsrc:jna-5.12.1.jar
>> [14.973s][info][class,load] java.lang.ExceptionInInitializerError source: jrt:/java.base
>> throwable while reading PDF
>> [14.973s][info][class,load] java.lang.Throwable$PrintStreamOrWriter source: jrt:/java.base
>> [14.974s][info][class,load] java.lang.Throwable$WrappedPrintStream source: jrt:/java.base
>> java.lang.ExceptionInInitializerError
>>         at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:442)
>>         at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:326)
>>         at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:309)
>>         at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:290)
>>         at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:274)
>>         at drivingrecordtool.file.DrivingRecordPDFTextReader.getOCRText(DrivingRecordPDFTextReader.java:152)
>>         at drivingrecordtool.file.DrivingRecordPDFTextReader.getText(DrivingRecordPDFTextReader.java:46)
>>         at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:78)
>>         at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:1)
>>         at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
>>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>         at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
>>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>         at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.lang.IllegalStateException: zip file closed
>>         at java.base/java.util.zip.ZipFile.ensureOpen(ZipFile.java:913)
>>         at java.base/java.util.zip.ZipFile.getEntry(ZipFile.java:348)
>>
>> If I uninstall Java and install Java 8, the program works fine.
>>
>> If I uninstall Java and install Java 11 or Java 17, it fails in this 
>> fashion.
>>
>> Can anyone help me understand what the difference might be between the 
>> versions of Java so I can fix this?
>>
>>
>>
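
When comparing a working Java 8 run with a failing Java 11 or 17 run, one low-cost check is to print which class loader and which location each class comes from under each runtime. The sketch below uses only the JDK; RuntimeInfo and locate() are hypothetical names chosen for illustration. The "rsrc:" sources in the log above suggest the jars are nested inside a single runnable jar, so the loader reported for the tess4j and JNA classes may be worth comparing between the two runs. That reading of the log is an assumption, not something the thread confirms.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of tess4j, JNA, or PDFBox. */
public class RuntimeInfo {

    /** Describes which class loader and code source a class was loaded from. */
    public static String locate(Class<?> c) {
        ClassLoader loader = c.getClassLoader();                  // null means the bootstrap loader
        CodeSource src = c.getProtectionDomain().getCodeSource(); // may be null for JDK classes
        return c.getName()
                + " | loader=" + (loader == null ? "bootstrap" : loader.getClass().getName())
                + " | source=" + (src == null ? "(none)" : String.valueOf(src.getLocation()));
    }

    public static void main(String[] args) {
        System.out.println("java.version=" + System.getProperty("java.version"));
        System.out.println(locate(String.class));
        // In the real program, pass the classes of interest instead,
        // for example the class that calls Tesseract.doOCR().
        System.out.println(locate(RuntimeInfo.class));
    }
}
```

Running the same calls under Java 8 and under Java 11 or 17, and comparing the loader and source columns, would show whether the classes are resolved differently between the two runtimes.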

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/ca4b42c0-e933-4357-ae17-dc77b92ee9a8n%40googlegroups.com.
