mbiso created TIKA-4367:
---------------------------

             Summary: Problem with the: 
org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT 
and is shutting down
                 Key: TIKA-4367
                 URL: https://issues.apache.org/jira/browse/TIKA-4367
             Project: Tika
          Issue Type: Bug
          Components: tika-server
    Affects Versions: 3.0.0
            Reporter: mbiso


Hi,
I have a problem with my tika-server running in a Docker container.

When parsing large files, a request times out and the Tika process shuts down.
This is the log output:

2025-01-16T01:29:19.096206347Z INFO  [qtp274100821-133] 02:29:19,096 
org.apache.tika.server.core.resource.MetadataResource /meta (application/pdf)
2025-01-16T01:29:19.120130385Z INFO  [qtp274100821-270] 02:29:19,120 
org.apache.tika.server.core.resource.TikaResource /tika (application/pdf)
2025-01-16T01:29:19.213411527Z INFO  [qtp274100821-133] 02:29:19,213 
org.apache.tika.server.core.resource.MetadataResource /meta (application/pdf)
2025-01-16T01:29:19.230454549Z INFO  [qtp274100821-270] 02:29:19,230 
org.apache.tika.server.core.resource.TikaResource /tika (application/pdf)
2025-01-16T01:56:18.370380628Z INFO  [qtp274100821-284] 02:56:18,370 
org.apache.tika.server.core.resource.MetadataResource /meta (application/pdf)
2025-01-16T02:01:18.430280014Z ERROR [Thread-11] 03:01:18,428 
org.apache.tika.server.core.ServerStatusWatcher Timeout task PARSE, millis 
elapsed 300055; consider increasing the allowable time with the 
<taskTimeoutMillis/> parameter or the X-Tika-Timeout-Millis header
2025-01-16T02:01:18.437740057Z WARN  [Thread-11] 03:01:18,437 
org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT 
and is shutting down.
2025-01-16T02:01:18.439693546Z INFO  [Thread-11] 03:01:18,439 
org.apache.tika.server.core.ServerStatusWatcher Shutting down forked process 
with status: TIMEOUT
2025-01-16T02:01:19.851234798Z INFO  [pool-2-thread-1] 03:01:19,817 
org.apache.tika.server.core.TikaServerWatchDog forked process exited with exit 
value 3
2025-01-16T02:01:20.644728948Z INFO  [main] 03:01:20,643 
org.apache.tika.server.core.TikaServerProcess Starting Apache Tika 3.0.0 server
2025-01-16T02:01:20.773526359Z INFO  [main] 03:01:20,772 
org.apache.tika.server.core.TikaServerProcess Using custom config: 
/tika-config.xml
2025-01-16T02:01:21.358160073Z INFO  [main] 03:01:21,357 
org.apache.tika.server.core.TikaServerProcess loading resource from SPI: class 
org.apache.tika.server.standard.resource.XMPMetadataResource
2025-01-16T02:01:21.527210481Z Jan 16, 2025 3:01:21 AM 
org.apache.cxf.endpoint.ServerImpl initDestination
2025-01-16T02:01:21.527237406Z INFO: Setting the server's publish address to be 
http://0.0.0.0:9998/
2025-01-16T02:01:21.627014872Z INFO  [main] 03:01:21,626 
org.eclipse.jetty.server.Server jetty-11.0.24; built: 2024-08-26T18:11:22.448Z; 
git: 5dfc59a691b748796f922208956bd1f2794bcd16; jvm 
21.0.5+11-Ubuntu-1ubuntu124.04
2025-01-16T02:01:21.685264827Z INFO  [main] 03:01:21,684 
org.eclipse.jetty.server.AbstractConnector Started 
ServerConnector@50b1f030{HTTP/1.1, (http/1.1)}{0.0.0.0:9998}
2025-01-16T02:01:21.687671013Z INFO  [main] 03:01:21,687 
org.eclipse.jetty.server.Server Started 
Server@6034e75d{STARTING}[11.0.24,sto=0] @1755ms
2025-01-16T02:01:21.711747262Z INFO  [main] 03:01:21,711 
org.eclipse.jetty.server.handler.ContextHandler Started 
o.a.c.t.h.JettyContextHandler@56febdc{/,null,AVAILABLE}
2025-01-16T02:01:21.716535893Z INFO  [main] 03:01:21,716 
org.apache.tika.server.core.TikaServerProcess Started Apache Tika server 
5598029c-6de7-4b53-8284-0f18814c049f at [http://0.0.0.0:9998/]
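
The ERROR line in the log above names two ways to allow more parse time: the <taskTimeoutMillis/> parameter and the X-Tika-Timeout-Millis request header. As a rough sketch of the header option only, a client could build its request as shown below. The URL, the helper name build_tika_request, and the 900000 ms value are assumptions for illustration; they do not come from this report, and ManifoldCF may not expose a way to set this header.

```python
import urllib.request

# Assumed address; the log only shows the server listening on 0.0.0.0:9998.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data: bytes, timeout_millis: int,
                       url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks tika-server for a longer task timeout.

    Hypothetical helper: it only constructs the request, it does not send it.
    """
    req = urllib.request.Request(url, data=data, method="PUT")
    req.add_header("Content-Type", "application/pdf")
    # Header name taken from the ServerStatusWatcher ERROR message.
    req.add_header("X-Tika-Timeout-Millis", str(timeout_millis))
    return req


if __name__ == "__main__":
    # Example: allow 15 minutes instead of the roughly 5 minutes seen in the log.
    request = build_tika_request(b"%PDF-1.4 ...", 900_000)
    print(request.get_method(), request.full_url)
    print(request.header_items())
```

To actually send it, something like urllib.request.urlopen(request, timeout=1000) could be used, with the client-side socket timeout set longer than the value sent in the header. This only raises the time allowed for one request; it is not expected to stop the forked process from shutting down if that longer limit is also exceeded.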

My issue is that ManifoldCF uses Tika to parse the files, so the ManifoldCF 
job ends with: "Error: Repeated service interruptions - failure processing 
document: The target server failed to respond"



Is there a way to prevent the Tika process from shutting down on a timeout?

 

Thanks a lot

Mario



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
