changgyoopark-db commented on code in PR #49370:
URL: https://github.com/apache/spark/pull/49370#discussion_r1913157644


##########
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/execution/ExecuteGrpcResponseSender.scala:
##########
@@ -319,7 +325,14 @@ private[connect] class ExecuteGrpcResponseSender[T <: Message](
           log"waitingForResults=${MDC(WAIT_RESULT_TIME, consumeSleep / NANOS_PER_MILLIS.toDouble)} ms " +
           log"waitingForSend=${MDC(WAIT_SEND_TIME, sendSleep / NANOS_PER_MILLIS.toDouble)} ms")
         // scalastyle:on line.size.limit
-        grpcObserver.onCompleted()
+        if (executionCompleted && !sizeLimitReached) {

Review Comment:
   Nope, it's not about concurrency concerns; it's for cases where the result is too big for a single RPC.
   -> Reaching the size limit means the client can still make progress simply by consuming the accumulated data, whether or not the execute thread runner is present.
   -> If execution completes quickly and the result is enormous, we must not return an error; instead, we should let the client consume the result in the next reattach call.
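   The reasoning above can be sketched as a simple predicate (a hedged illustration only; `shouldCompleteStream` is a hypothetical helper name, not Spark's actual API — the real check lives inline in `ExecuteGrpcResponseSender`):

   ```scala
   // Sketch of the completion decision discussed above.
   // The gRPC stream should only be completed when execution has finished AND
   // the response-size limit was not hit; if the limit was hit, the stream is
   // left open-ended so the client can reattach and consume the remaining data.
   object CompletionSketch {
     def shouldCompleteStream(
         executionCompleted: Boolean,
         sizeLimitReached: Boolean): Boolean =
       executionCompleted && !sizeLimitReached
   }
   ```

   Under this sketch, a quickly-finished execution with an enormous result (`executionCompleted = true`, `sizeLimitReached = true`) does not complete the stream, so the next reattach call can continue delivering the result instead of surfacing an error.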



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

