Tilman Hausherr created TIKA-4274:
-------------------------------------

             Summary: Improve ExtractReaderException
                 Key: TIKA-4274
                 URL: https://issues.apache.org/jira/browse/TIKA-4274
             Project: Tika
          Issue Type: Improvement
          Components: tika-eval
    Affects Versions: 2.9.2
            Reporter: Tilman Hausherr
            Assignee: Tilman Hausherr
             Fix For: 3.0.0, 2.9.3


I saw this stack trace in the eval log, and it isn't really helpful:
{noformat}
org.apache.tika.eval.app.io.ExtractReaderException
        at org.apache.tika.eval.app.io.ExtractReader.loadExtract(ExtractReader.java:125)
        at org.apache.tika.eval.app.ExtractComparer.compareFiles(ExtractComparer.java:198)
        at org.apache.tika.eval.app.ExtractComparer.processFileResource(ExtractComparer.java:180)
        at org.apache.tika.batch.FileResourceConsumer._processFileResource(FileResourceConsumer.java:152)
        at org.apache.tika.batch.FileResourceConsumer.call(FileResourceConsumer.java:87)
        at org.apache.tika.batch.FileResourceConsumer.call(FileResourceConsumer.java:50)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
{noformat}
So I'm adding the type and the cause to the exception message, and also some
logging for EXTRACT_FILE_TOO_SHORT / EXTRACT_FILE_TOO_LONG, so that we can tell
what the failure is about and then do something (or not) about it.
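
To illustrate the idea (this is only a rough sketch, not the actual change): an exception that puts its type into the message and chains the cause prints a first stack trace line that already says what went wrong. The class name TypedExtractException, the checkLength helper and its thresholds are hypothetical and not part of tika-eval; only the two EXTRACT_FILE_TOO_SHORT / EXTRACT_FILE_TOO_LONG names come from this issue.

```java
import java.io.IOException;

// Hypothetical stand-in for illustration only; NOT the real
// org.apache.tika.eval.app.io.ExtractReaderException.
class TypedExtractException extends IOException {

    // The two length-related names are taken from the issue text;
    // IO_EXCEPTION is an assumed extra value for the example.
    enum Type { EXTRACT_FILE_TOO_SHORT, EXTRACT_FILE_TOO_LONG, IO_EXCEPTION }

    private final Type type;

    TypedExtractException(Type type, String detail) {
        this(type, detail, null);
    }

    TypedExtractException(Type type, String detail, Throwable cause) {
        // Putting the type into the message makes it show up in the
        // first line of the stack trace; the cause is chained as usual.
        super(type + ": " + detail, cause);
        this.type = type;
    }

    Type getType() {
        return type;
    }

    // Hypothetical helper: a negative bound means "no limit".
    static void checkLength(long length, long min, long max)
            throws TypedExtractException {
        if (min >= 0 && length < min) {
            throw new TypedExtractException(Type.EXTRACT_FILE_TOO_SHORT,
                    "length=" + length + ", min=" + min);
        }
        if (max >= 0 && length > max) {
            throw new TypedExtractException(Type.EXTRACT_FILE_TOO_LONG,
                    "length=" + length + ", max=" + max);
        }
    }
}
```

With something like this, the log would start with the exception class followed by the type and the offending length, instead of the bare class name seen in the trace above.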



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
