Re: [PR] [SPARK-51016][SQL] Non-deterministic SQL expressions should set indeterminate map stage output level [spark]

via GitHub Wed, 30 Apr 2025 13:24:35 -0700


peter-toth commented on code in PR #50757:
URL: https://github.com/apache/spark/pull/50757#discussion_r2069390537



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -103,13 +103,21 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     AttributeSet(expressions) -- producedAttributes)
 
   /**
-   * Returns true when the all the expressions in the current node as well as all of its children
+   * Returns true when the current node (all the expressions in it) is deterministic.
+   */
+  def deterministicNode: Boolean = _deterministicNode()
+
+  private val _deterministicNode =
+    new BestEffortLazyVal[JBoolean](() => expressions.forall(_.deterministic))
+
+  /**
+   * Returns true when all the expressions in the current node as well as all of its children
    * are deterministic
    */
   def deterministic: Boolean = _deterministic()
 
-  private val _deterministic = new BestEffortLazyVal[JBoolean](() =>
-    expressions.forall(_.deterministic) && children.forall(_.deterministic))
+  private val _deterministic =
+    new BestEffortLazyVal[JBoolean](() => deterministicNode && children.forall(_.deterministic))

Review Comment:
   This PR doesn't change the `deterministic()` calculation of plan nodes; it
just extracts part of that logic into a new `deterministicNode()` method that
checks whether a node itself introduces indeterminism.
   That method is then used in `ShuffleExchangeExec.isDeterministicStage()` to
check whether any of the descendant nodes below a `ShuffleExchangeExec`, down
to the next exchange (stage boundary), introduces indeterminism.
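
   To illustrate the difference between the node-local check and the
whole-subtree check, here is a minimal toy sketch. It is not the PR's code:
`Node`, `Op`, `Exchange`, `StageCheck` and `stageLocalDeterministic` are
hypothetical stand-ins, and the real `isDeterministicStage()` may be
implemented differently. It only assumes what is described above: the walk
starts below an exchange and stops at the next exchange.

```scala
// Toy model only: hypothetical stand-ins, not Spark's QueryPlan or ShuffleExchangeExec.
sealed trait Node {
  // One flag per expression in this node: true means the expression is deterministic.
  def exprsDeterministic: Seq[Boolean]
  def children: Seq[Node]

  // Node-local check: looks only at this node's own expressions.
  def deterministicNode: Boolean = exprsDeterministic.forall(identity)

  // Whole-subtree check: same result as before, now expressed via deterministicNode.
  def deterministic: Boolean = deterministicNode && children.forall(_.deterministic)
}

final case class Op(exprsDeterministic: Seq[Boolean], children: Seq[Node]) extends Node

// An exchange marks a stage boundary and carries no expressions in this toy model.
final case class Exchange(child: Node) extends Node {
  val exprsDeterministic: Seq[Boolean] = Seq.empty
  val children: Seq[Node] = Seq(child)
}

object StageCheck {
  // Walks down from `node` and stops at the next Exchange (stage boundary),
  // so indeterminism in earlier stages is not attributed to this one.
  def stageLocalDeterministic(node: Node): Boolean = node match {
    case _: Exchange => true
    case other => other.deterministicNode && other.children.forall(stageLocalDeterministic)
  }

  // Hypothetical analogue of the stage-level check described above.
  def isDeterministicStage(exchange: Exchange): Boolean =
    stageLocalDeterministic(exchange.child)
}
```

   With `Exchange(Op(Seq(true), Seq(Exchange(Op(Seq(false), Nil)))))`, the
outer exchange's `deterministic` is false because the subtree contains a
non-deterministic expression, while `StageCheck.isDeterministicStage` is true
because that expression sits below the next exchange, i.e. in an earlier stage.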



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


