[PR] [SPARK-48922][SQL][3.5] Avoid redundant array transform of identical expression for map type [spark]

via GitHub Wed, 12 Mar 2025 20:17:40 -0700


wForget opened a new pull request, #50265:
URL: https://github.com/apache/spark/pull/50265


   ### What changes were proposed in this pull request?
   
   Similar to #47843, this patch skips the `ArrayTransform` in the `resolveMapType` function when the resolved expression is identical to the input parameter.
   
   ### Why are the changes needed?
   
   My previous PR #47381 was not merged, but I still think this is a worthwhile optimization, so I reopened it.
   
   During the upgrade from Spark 3.1.1 to 3.5.0, I found a performance 
regression in map type inserts.
   
   There are some extra conversion expressions in the project before the insert, which do not always seem to be necessary.
   
   ```
   map_from_arrays(
     transform(map_keys(map#516), lambdafunction(lambda key#652, lambda key#652, false)),
     transform(map_values(map#516), lambdafunction(lambda value#654, lambda value#654, false))
   ) AS map#656
   ```
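
The idea can be illustrated with a small, self-contained toy model. This is only a sketch and not Spark's actual code: the classes `Attr`, `LambdaVar`, `Cast`, `MapKeys`, `MapValues`, `Transform`, `MapFromArrays` and the function `resolve_map` are hypothetical stand-ins for Catalyst expressions and for `resolveMapType`. It assumes that "no conversion needed" can be detected by checking whether the resolved key/value expression is equal to the lambda variable itself.

```python
from dataclasses import dataclass
from typing import Callable


class Expr:
    """Base class for the toy expression tree (hypothetical, not Catalyst)."""


@dataclass(frozen=True)
class Attr(Expr):
    name: str


@dataclass(frozen=True)
class LambdaVar(Expr):
    name: str


@dataclass(frozen=True)
class Cast(Expr):
    child: Expr
    to: str


@dataclass(frozen=True)
class MapKeys(Expr):
    child: Expr


@dataclass(frozen=True)
class MapValues(Expr):
    child: Expr


@dataclass(frozen=True)
class Transform(Expr):
    array: Expr
    var: LambdaVar
    body: Expr


@dataclass(frozen=True)
class MapFromArrays(Expr):
    keys: Expr
    values: Expr


def resolve_map(
    input_expr: Expr,
    resolve_key: Callable[[Expr], Expr],
    resolve_value: Callable[[Expr], Expr],
) -> Expr:
    """Rebuild the map only when the key or value actually needs converting."""
    key_var = LambdaVar("key")
    value_var = LambdaVar("value")
    new_key = resolve_key(key_var)
    new_value = resolve_value(value_var)

    # Both sides are identity lambdas: keep the original map expression.
    if new_key == key_var and new_value == value_var:
        return input_expr

    # Wrap only the side whose resolved expression differs from its variable.
    keys: Expr = MapKeys(input_expr)
    if new_key != key_var:
        keys = Transform(keys, key_var, new_key)
    values: Expr = MapValues(input_expr)
    if new_value != value_var:
        values = Transform(values, value_var, new_value)
    return MapFromArrays(keys, values)


source_map = Attr("map")
# No conversion on either side: the input is returned untouched.
unchanged = resolve_map(source_map, lambda k: k, lambda v: v)
# Only the values need a cast: the keys are not wrapped in a transform.
values_cast = resolve_map(source_map, lambda k: k, lambda v: Cast(v, "bigint"))
```

In this sketch, the identity case returns the input expression as-is instead of producing a `map_from_arrays(transform(...), transform(...))` wrapper like the one shown above.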
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added a unit test.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
   
   Closes #50245 from wForget/SPARK-48922.
   
   Authored-by: wforget <643348...@qq.com>
   Signed-off-by: beliefer <belie...@163.com>
   
   (cherry picked from commit 1be108eedb832a3684fcc55ec15581a1347475f4)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


