mihailom-db opened a new pull request, #50967:
URL: https://github.com/apache/spark/pull/50967

   ### What changes were proposed in this pull request?
   Fix the `map_zip_with` expression so that it handles floating-point keys correctly.
   
   
   ### Why are the changes needed?
   Previously, `map_zip_with` ran `getKeysWithIndexesFast`, which uses a
   `LinkedHashMap`. That map does not apply proper equality to floating-point
   keys, so each NaN key was treated as a distinct key instead of being matched
   with the NaN key in the other map. This PR fixes that behaviour.
   
   Example:
   ```
   select map_zip_with(map(float('NaN'), 1), map(float('NaN'), 2), (k, v1, v2) -> (v1, v2))
   ```
   
   Output before:
   ```
   {"NaN":{"v1":1,"v2":null},"NaN":{"v1":null,"v2":2}}
   ```
   
   Output after:
   ```
   {"NaN":{"v1":1,"v2":2}}
   ```
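
The effect can be reproduced outside Spark. The following is a minimal Python sketch of the same failure mode, not the Scala code touched by this PR: a hash map keyed on raw floats never matches two NaN keys, because NaN does not compare equal to itself. The names `zip_maps`, `_normalize` and `_NAN_KEY` are hypothetical, and normalizing NaN to one sentinel key is only one possible fix, assumed here for illustration.

```python
import math

# Hypothetical sentinel standing in for "the single NaN key".
_NAN_KEY = object()


def _normalize(key):
    """Map every float NaN to one sentinel so that all NaN keys match."""
    if isinstance(key, float) and math.isnan(key):
        return _NAN_KEY
    return key


def zip_maps(m1, m2, normalize_nan):
    """Zip two maps given as lists of (key, value) pairs.

    Returns a list of (key, (v1, v2)) in first-seen key order, with None
    for a side that has no entry for that key.
    """
    table = {}  # dicts keep insertion order in Python 3.7+
    for side, entries in enumerate((m1, m2)):
        for key, value in entries:
            lookup = _normalize(key) if normalize_nan else key
            # row = [original key, value from m1, value from m2]
            row = table.setdefault(lookup, [key, None, None])
            row[side + 1] = value
    return [(k, (v1, v2)) for k, v1, v2 in table.values()]


left = [(float("nan"), 1)]
right = [(float("nan"), 2)]

before = zip_maps(left, right, normalize_nan=False)  # two separate NaN entries
after = zip_maps(left, right, normalize_nan=True)    # one merged NaN entry
```

With raw float keys the two NaNs land in separate entries, mirroring the "Output before" above. With normalized keys they merge into one entry holding both values, mirroring "Output after".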
   
   
   ### Does this PR introduce _any_ user-facing change?
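   Yes. `map_zip_with` now matches NaN keys across the two input maps and returns a single merged entry for them instead of two separate entries.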
   
   
   ### How was this patch tested?
   Added tests to golden files.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

