Sam Williams created NIFI-9463:
----------------------------------

             Summary: Large file downloads timeout
                 Key: NIFI-9463
                 URL: https://issues.apache.org/jira/browse/NIFI-9463
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 1.15.0, 1.12.1
         Environment: CentOS 7, Docker, 3-node cluster, SSL, certificate 
authentication, JVM heap 4GB
            Reporter: Sam Williams


When attempting to download large files (greater than 500MB) from a queue or 
from provenance, the request times out and the file does not download. The 
HTTP response from NiFi is:

 
{code:java}
HTTP ERROR 503: Service Unavailable
URI: /nifi-api/flowfile-queues/<queue-id>/flowfiles/<flowfile-id>/content
STATUS: 503
MESSAGE: Service Unavailable
SERVLET: jerseySpring
{code}
 

 

 
{code:java}
nifi-app.log:
<DTG> WARN [Replicate Request Thread-1337] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
java.net.SocketTimeoutException: timeout
<...>
{code}
 
{code:java}
nifi.properties:
nifi.cluster.node.connection.timeout=120 secs
nifi.cluster.node.read.timeout=120 secs
nifi.web.request.timeout=120 secs{code}
 

As I have increased the timeout values and the JVM heap size, I have been 
able to download progressively larger files, but download time does not seem 
to scale linearly with file size (e.g. 500MB might take ~30sec, while 600MB 
takes ~90sec to download).

This has been happening since at least 1.12.0, and I believe it relates to 
the implementation of the Jersey client (NIFI-5112, "Inefficiency in 
replicating requests across cluster").

My guess is that the flowfile content is streamed back to the node serving 
the UI, which buffers the whole content in memory and only then streams it to 
the client.
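
To make that guess concrete, here is a minimal Java sketch of the two relay 
styles. It is purely illustrative and is not NiFi's replication code: the 
class RelaySketch and the methods relayBuffered and relayStreamed are 
hypothetical names, and the in-memory streams stand in for the node-to-node 
and node-to-browser connections.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; names are not taken from the NiFi code base.
class RelaySketch {

    // Reads the entire upstream payload onto the heap before forwarding it.
    // Peak memory grows with the content size, and the client receives
    // nothing until the last upstream byte has arrived.
    static long relayBuffered(InputStream upstream, OutputStream client) throws IOException {
        byte[] all = upstream.readAllBytes(); // requires Java 9 or later
        client.write(all);
        return all.length;
    }

    // Forwards the payload through a fixed 8 KiB buffer. Memory use stays
    // constant and bytes reach the client as soon as they are read.
    static long relayStreamed(InputStream upstream, OutputStream client) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = upstream.read(chunk)) != -1) {
            client.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] content = new byte[1 << 20]; // 1 MiB stand-in for flowfile content
        long buffered = relayBuffered(new ByteArrayInputStream(content), new ByteArrayOutputStream());
        long streamed = relayStreamed(new ByteArrayInputStream(content), new ByteArrayOutputStream());
        System.out.println(buffered + " " + streamed);
    }
}
```

If the node serving the UI behaves like relayBuffered, heap use would grow 
with file size, which would be consistent with a larger heap allowing larger 
downloads. I have not confirmed this against the source.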



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
