HeartSaVioR opened a new pull request, #50460:
URL: https://github.com/apache/spark/pull/50460

   ### What changes were proposed in this pull request?
   
   This PR disables Nagle's algorithm (sets TCP_NODELAY = true) on the 
connection between the Python worker and the state server, in TWS 
(transformWithState) + PySpark.
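
   For illustration only, a minimal sketch of what disabling Nagle's algorithm looks like on a client socket in Python. The helper name `connect_with_nodelay` is hypothetical and is not taken from the Spark code base; only the standard `socket` module calls are real.

```python
import socket


def connect_with_nodelay(host: str, port: int) -> socket.socket:
    """Open a TCP connection and disable Nagle's algorithm on it.

    Hypothetical helper for illustration; not the code changed in this PR.
    """
    sock = socket.create_connection((host, port))
    # TCP_NODELAY = 1 makes the kernel send small writes immediately
    # instead of holding them until earlier data has been ACKed.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

   The option is set once, right after the connection is established, and applies to every later write on that socket.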
   
   ### Why are the changes needed?
   
   We have observed a consistent latency of slightly more than 40ms in 
specific state interactions, e.g. ListState.put() / ListState.get() 
/ ListState.appendList().
   
   The root cause turned out to be the combination of Nagle's algorithm and 
delayed ACK. The sequence is as follows:
   
   1. Python worker sends the proto message to JVM, and flushes the socket.
   2. Additionally, Python worker sends the follow-up data to JVM, and flushes 
the socket.
   3. JVM reads the proto message, and realizes there is follow-up data.
   4. JVM reads the follow-up data.
   5. JVM processes the request, and sends the response back to Python worker.
   
   Due to delayed ACK, the JVM does not send an ACK back to the Python worker 
even after step 3. It waits for outgoing data to piggyback the ACK on, or for 
more ACKs to batch together, but the JVM is not going to send any data during 
that phase.
   
   Due to Nagle's algorithm, the message from step 2 is not sent to the JVM, 
since the message from step 1 has not been ACKed yet.
   
   This deadlock-like stall is resolved only when the delayed ACK timer 
expires, which takes 40ms (the minimum duration) on Linux. After the timeout, 
the ACK is sent back from the JVM to the Python worker, and Nagle's algorithm 
finally allows the message from step 2 to be sent to the JVM.
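
   The write-write-read pattern described above can be sketched over a loopback connection as follows. This is a rough reproduction aid under stated assumptions, not the Spark implementation: the 4-byte length header, the `b"ok"` reply, and all function names are hypothetical. Whether the stall actually shows up depends on the operating system and its TCP settings, so no timing is claimed here.

```python
import socket
import threading
import time


def _recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf


def _serve(listener: socket.socket, rounds: int) -> None:
    """Stand-in for the state server: read header, read body, reply."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(rounds):
            size = int.from_bytes(_recv_exact(conn, 4), "big")
            _recv_exact(conn, size)  # the follow-up data
            conn.sendall(b"ok")      # the response


def mean_round_trip_ms(nodelay: bool, rounds: int = 20) -> float:
    """Average time of one request made of two small flushed writes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=_serve, args=(listener, rounds), daemon=True)
    server.start()

    client = socket.create_connection(listener.getsockname())
    if nodelay:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    payload = b"x" * 64
    start = time.perf_counter()
    for _ in range(rounds):
        client.sendall(len(payload).to_bytes(4, "big"))  # first small write
        client.sendall(payload)                          # second small write
        _recv_exact(client, 2)                           # wait for the reply
    elapsed = time.perf_counter() - start

    client.close()
    server.join()
    listener.close()
    return elapsed / rounds * 1000.0
```

   Comparing `mean_round_trip_ms(False)` with `mean_round_trip_ms(True)` on the affected platform is one way to check whether the second write is being held back.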
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Manually tested (by adding debug logs to measure the time spent in the 
state interaction).
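
   As a sketch of that kind of measurement (the actual debug logging used for this PR is not shown in the description), a small timing helper could look like the following. The logger name `state_timing` and the helper `log_elapsed` are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("state_timing")  # hypothetical logger name


@contextmanager
def log_elapsed(label: str):
    """Log, at DEBUG level, how long the wrapped block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.debug("%s took %.2f ms", label, elapsed_ms)
```

   Wrapping a single call, e.g. `with log_elapsed("ListState.put"): ...`, is enough to see whether an interaction sits near the 40ms mark.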
   
   Beyond that, this should pass the existing tests, which will be verified by 
CI.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

