Jihoon Son created SPARK-51772:
----------------------------------
Summary: Use of `copy()` instead of `withNewChildren` to create a
copy of `TreeNode` with new children
Key: SPARK-51772
URL: https://issues.apache.org/jira/browse/SPARK-51772
Project: Spark
Issue Type: Bug
Components: Optimizer
Affects Versions: 4.1.0
Reporter: Jihoon Son
The `TreeNode` class provides a `withNewChildren()` function to create a copy of
the node with new children. `withNewChildren()` not only copies the node, but
also updates the origin and copies the tags. However, I am seeing direct uses
of `copy()` instead of `withNewChildren()`, such as
[here|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala#L1850].
I actually see quite a few cases of this pattern. Searching for the pattern
`.copy(child = ` returns 115 matches, and `.copy(children = ` adds another 18.
I'm not sure whether this is intended or not.
The reason I started looking into this is that I am developing something that
relies on the tags in the `LogicalPlan`, and I realized that the tags are
sometimes missing after some optimizations. I wonder whether tagging is not
a stable API for `LogicalPlan`.
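To illustrate the concern, here is a minimal, self-contained sketch of how tags can get lost. It is a toy model, not Spark's actual `TreeNode`: the `Node`, `Leaf`, `Filter`, and `rebuild` names and the `"myTag"` key are made up for illustration, and the only assumption is that tags live in mutable per-instance state that the case-class `copy()` does not carry over.

```scala
import scala.collection.mutable

// Toy stand-in for a tree node. This is NOT Spark's TreeNode; it only models
// the idea that tags are per-instance state outside the constructor arguments.
sealed trait Node {
  // A fresh, empty map is created for every new instance, including copies.
  val tags: mutable.Map[String, Any] = mutable.Map.empty

  def children: Seq[Node]

  // Hypothetical hook: build a node of the same kind with different children.
  protected def rebuild(newChildren: Seq[Node]): Node

  // Rebuilds the node and then carries the tags over to the result.
  final def withNewChildren(newChildren: Seq[Node]): Node = {
    val res = rebuild(newChildren)
    if (res ne this) res.tags ++= tags
    res
  }
}

final case class Leaf(name: String) extends Node {
  def children: Seq[Node] = Nil
  protected def rebuild(newChildren: Seq[Node]): Node = this
}

final case class Filter(condition: String, child: Node) extends Node {
  def children: Seq[Node] = Seq(child)
  protected def rebuild(newChildren: Seq[Node]): Node =
    copy(child = newChildren.head)
}

object TagDemo {
  def main(args: Array[String]): Unit = {
    val original = Filter("a > 1", Leaf("t1"))
    original.tags("myTag") = "important"

    // Direct copy(): new instance, fresh empty tag map.
    val viaCopy = original.copy(child = Leaf("t2"))
    // withNewChildren(): new instance, tags carried over.
    val viaWithNewChildren = original.withNewChildren(Seq(Leaf("t2")))

    println(viaCopy.tags.contains("myTag"))            // false
    println(viaWithNewChildren.tags.contains("myTag")) // true
  }
}
```

In this sketch both calls produce structurally equal nodes, so the difference is invisible to equality checks and only shows up when something later reads the tags.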