ahshahid commented on PR #50757: URL: https://github.com/apache/spark/pull/50757#issuecomment-2845342978
IMHO, the indeterminacy of an expression's value should be considered only in terms of whether a ShuffleStage can lose or add rows because of it. Anything beyond that amounts to imposing an order, or an expected outcome, on the value of an indeterminate expression, which contradicts the meaning of indeterminacy. If round-robin partitioning is causing trouble, that is not an indeterminacy issue: if it were, we would be expecting specific behaviour from something indeterminate, which contradicts the definition.

Also, deciding whether a stage is indeterminate should not require walking the entire QueryPlan tree within the stage. It should be based on the stage's output expressions (whether any of them uses an indeterminate component), and restricted to whether the stage uses that indeterminate component in its partitioning.
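To make the proposed rule concrete, here is a minimal sketch in Python. It is a toy model, not Spark code: `Expr`, `Stage`, `partition_keys`, and `stage_is_indeterminate` are hypothetical names invented for illustration, and the check reflects only the position argued in this comment, not how Spark currently decides stage determinism.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Expr:
    """Hypothetical expression node: a name, a determinism flag, and children."""
    name: str
    deterministic: bool = True
    children: Tuple["Expr", ...] = ()

    def uses_indeterminate_component(self) -> bool:
        # Indeterminate if this node, or anything it is built from, is.
        return (not self.deterministic) or any(
            child.uses_indeterminate_component() for child in self.children
        )


@dataclass
class Stage:
    """Hypothetical stage: its output expressions and the outputs it partitions by."""
    output_exprs: List[Expr]
    partition_keys: List[str]  # names of output expressions used for partitioning


def stage_is_indeterminate(stage: Stage) -> bool:
    # Step 1: look only at the stage's output expressions, not the whole plan.
    indeterminate_outputs = {
        e.name for e in stage.output_exprs if e.uses_indeterminate_component()
    }
    # Step 2: the stage counts as indeterminate only if one of those outputs
    # is actually used to decide which partition a row goes to.
    return any(key in indeterminate_outputs for key in stage.partition_keys)
```

Under this sketch, a stage that outputs an indeterminate expression but partitions on a deterministic column is not flagged, and neither is a stage with no partition keys at all (the assumption here being that round-robin partitioning is modelled as an empty `partition_keys` list).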