PDF files are read by the PDFBox library. You may want to look into that area as well.
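The trace quoted below stops two frames into the "Caused by" section, so it may help to print the whole cause chain at the point where your code catches the throwable. Here is a rough sketch using only the JDK. `CauseChain`, `chain` and `rootCause` are names I made up for illustration, they are not part of tess4j, and the `main` method just simulates the error instead of calling Tesseract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of tess4j or PDFBox): walks Throwable.getCause()
// so the innermost exception can be logged with its full stack trace.
final class CauseChain {

    // Returns the throwable followed by each of its causes, outermost first.
    static List<Throwable> chain(Throwable t) {
        List<Throwable> links = new ArrayList<>();
        // The contains() check stops the loop if a cause chain ever loops back on itself.
        while (t != null && !links.contains(t)) {
            links.add(t);
            t = t.getCause();
        }
        return links;
    }

    // Returns the innermost cause, or null when given null.
    static Throwable rootCause(Throwable t) {
        List<Throwable> links = chain(t);
        return links.isEmpty() ? null : links.get(links.size() - 1);
    }

    public static void main(String[] args) {
        // Record which runtime is actually executing, since the failure depends on it.
        System.out.println("java.version = " + System.getProperty("java.version"));
        try {
            // Simulated failure with the same shape as the one reported in this thread.
            throw new ExceptionInInitializerError(
                    new IllegalStateException("zip file closed"));
        } catch (Throwable t) {
            for (Throwable link : chain(t)) {
                System.out.println(link);
            }
            // Full stack trace of the innermost exception.
            rootCause(t).printStackTrace();
        }
    }
}
```

In the real program you would call `CauseChain.rootCause(t).printStackTrace()` in the catch block that currently prints "throwable while reading PDF". That should show what was calling `ZipFile.getEntry` when the zip file was already closed.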
On Wednesday, September 28, 2022 at 10:52:15 PM UTC-5 Quan Nguyen wrote:

> The source of tess4j is available; you can trace through the code to see
> what threw the exception.
>
> Nevertheless, "throwable while reading PDF" seems to point to the part of
> code that reads in PDF file. Was that something you wrote, or from tess4j
> itself?
>
> On Sunday, September 25, 2022 at 11:02:35 AM UTC-5 rcja...@gmail.com
> wrote:
>
>> I'm using Tess4j in a Java program to access Tesseract and read PDFs
>> read with PDFBox. I've been using Java 8, and things are running. The
>> program is not commercial; I provide it to non-profits doing pro bono legal
>> work in my state. In java 8 using the command line and eclipse, the program
>> runs fine; running from the command line in either Java 11 or Java 17
>> causes an error at the point where the program calls Tesseract.doOCR().
>>
>> I've dumped class loading information and see that last class loaded
>> before the fatal exception is com.sun.jna.Platform; it would be used, for
>> instance, to determine the platform on which the program is running. I
>> haven't been able to find the source for the 5.2 version I downloaded from
>> UB Mannheim, that would be useful since the stack trace has line numbers.
>>
>> The following is a snippet showing log messages, System.out.println
>> messages, stacktraces, and class loading messages near the point of failure:
>>
>> pdfRenderer created buffered Image
>> set a couple of tesseract vars
>> [14.960s][info][class,load] net.sourceforge.tess4j.util.ImageIOHelper source: rsrc:tess4j-5.4.0.jar
>> [14.961s][info][class,load] javax.imageio.IIOParam source: jrt:/java.desktop
>> [14.961s][info][class,load] javax.imageio.ImageWriteParam source: jrt:/java.desktop
>> [14.962s][info][class,load] com.github.jaiimageio.plugins.tiff.TIFFImageWriteParam source: rsrc:jai-imageio-core-1.4.0.jar
>> [14.963s][info][class,load] javax.imageio.IIOImage source: jrt:/java.desktop
>> [14.964s][info][class,load] com.sun.jna.Library source: rsrc:jna-5.12.1.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.ITessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.TessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.966s][info][class,load] net.sourceforge.tess4j.util.LoadLibs source: rsrc:tess4j-5.4.0.jar
>> [14.969s][info][class,load] com.sun.jna.Platform source: rsrc:jna-5.12.1.jar
>> [14.973s][info][class,load] java.lang.ExceptionInInitializerError source: jrt:/java.base
>> throwable while reading PDF
>> [14.973s][info][class,load] java.lang.Throwable$PrintStreamOrWriter source: jrt:/java.base
>> [14.974s][info][class,load] java.lang.Throwable$WrappedPrintStream source: jrt:/java.base
>> java.lang.ExceptionInInitializerError
>>     at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:442)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:326)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:309)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:290)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:274)
>>     at drivingrecordtool.file.DrivingRecordPDFTextReader.getOCRText(DrivingRecordPDFTextReader.java:152)
>>     at drivingrecordtool.file.DrivingRecordPDFTextReader.getText(DrivingRecordPDFTextReader.java:46)
>>     at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:78)
>>     at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:1)
>>     at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>     at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>     at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.lang.IllegalStateException: zip file closed
>>     at java.base/java.util.zip.ZipFile.ensureOpen(ZipFile.java:913)
>>     at java.base/java.util.zip.ZipFile.getEntry(ZipFile.java:348)
>>
>> If I uninstall Java and install Java 8, the program works fine.
>>
>> If I uninstall Java and install Java 11 or Java 17, it fails in this
>> fashion.
>>
>> Can anyone help me understand what the difference might be between the
>> versions of Java so I can fix this?