Jungtaek Lim created SPARK-51667:
------------------------------------

             Summary: [TWS + Python] Disable Nagle's algorithm between Python worker and State Server
                 Key: SPARK-51667
                 URL: https://issues.apache.org/jira/browse/SPARK-51667
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0, 4.1.0
            Reporter: Jungtaek Lim

While testing TWS + Python, we found cases where the socket communication for state interaction was delayed by more than 40ms for certain types of state operations, e.g. ListState.put(), ListState.get(), ListState.appendList(), etc.

The root cause turned out to be the combination of Nagle's algorithm and delayed ACK. The sequence is as follows:

# Python worker sends the proto message to JVM, and flushes the socket.
# Additionally, Python worker sends the follow-up data to JVM, and flushes the socket.
# JVM reads the proto message, and realizes there is follow-up data.
# JVM reads the follow-up data.
# JVM processes the request, and sends the response back to Python worker.

Due to delayed ACK, the JVM does not send an ACK back to the Python worker even after step 3. It holds the ACK until there is outgoing data to carry it or multiple ACKs to send, but the JVM is not going to send any data during that phase. Due to Nagle's algorithm, the message from step 2 is not sent to the JVM, since the message from step 1 has not been ACKed yet.

This deadlock is only resolved when the delayed ACK timeout expires, which is 40ms (minimum duration) in Linux. After the timeout, the ACK is sent back from the JVM to the Python worker, and Nagle's algorithm finally allows the message from step 2 to be sent to the JVM.

See the articles below for a more general explanation:

* [https://engineering.avast.io/40-millisecond-bug/]
** Start reading from the Nagle's algorithm section
* [https://brooker.co.za/blog/2024/05/09/nagle.html]

Nagle's algorithm reduces the number of small packets on the wire, which, as the article above notes, helps keep routers from being overloaded. We connect to "localhost" here, so that benefit does not apply, and disabling Nagle's algorithm for this connection should be safe.
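
For illustration, here is a minimal sketch of what disabling Nagle's algorithm looks like on a plain Python TCP socket. It is not the actual Spark change: the helper names (connect_no_delay, nagle_disabled, demo) and the local stand-in server are hypothetical. The only real API used is the standard socket module with the TCP_NODELAY option.

```python
import socket


def connect_no_delay(host: str, port: int) -> socket.socket:
    """Open a TCP connection with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.create_connection((host, port))
    # TCP_NODELAY=1 asks the kernel to send small writes immediately
    # instead of holding them until earlier data has been ACKed.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


def demo() -> bool:
    """Send two small back-to-back writes, as in steps 1 and 2 of the sequence."""
    first, second = b"proto-message", b"follow-up-data"
    # Stand-in for the JVM state server: a listener on an ephemeral localhost port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]
        with connect_no_delay("127.0.0.1", port) as client:
            conn, _ = server.accept()
            with conn:
                client.sendall(first)   # step 1: proto message
                client.sendall(second)  # step 2: follow-up data
                received = b""
                while len(received) < len(first) + len(second):
                    chunk = conn.recv(1024)
                    if not chunk:
                        break
                    received += chunk
            return nagle_disabled(client) and received == first + second


if __name__ == "__main__":
    print(demo())
```

With TCP_NODELAY set on the sending side, the second write no longer waits for the ACK of the first, so the delayed ACK timer on the receiver stops gating the request. On the JVM side, the equivalent setting for a java.net.Socket is setTcpNoDelay(true).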