mihailom-db opened a new pull request, #50967: URL: https://github.com/apache/spark/pull/50967
### What changes were proposed in this pull request?
Fix the `map_zip_with` expression so that it handles floating-point keys correctly.

### Why are the changes needed?
Previously, `map_zip_with` ran `getKeysWithIndexesFast`, which uses a `LinkedHashMap` that does not apply proper equality to floating-point keys. As a result, each NaN key was treated as a distinct key instead of being matched across the two maps. This PR fixes that behaviour.

Example:
```
select map_zip_with(map(float('NaN'), 1), map(float('NaN'), 2), (k, v1, v2) -> (v1, v2))
```
Output before:
```
{"NaN":{"v1":1,"v2":null},"NaN":{"v1":null,"v2":2}}
```
Output after:
```
{"NaN":{"v1":1,"v2":2}}
```

### Does this PR introduce _any_ user-facing change?
Yes. `map_zip_with` now merges entries whose keys are NaN into a single entry, as in the example above.

### How was this patch tested?
Added tests to golden files.

### Was this patch authored or co-authored using generative AI tooling?
No.
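
For readers less familiar with why NaN keys split apart, here is a minimal, language-neutral sketch in Python. It is not Spark's implementation: the function `map_zip_with`, the `normalize` parameter, and the `_NAN` sentinel below are hypothetical names used only for illustration, and maps are modelled as lists of `(key, value)` pairs. It assumes only that the key index compares floats with IEEE-style equality (where `NaN != NaN`) unless keys are normalized first.

```python
import math

# Hypothetical sentinel standing in for every NaN key (illustration only).
_NAN = object()


def _normalize(key):
    """Map every float NaN to one shared sentinel; leave other keys as-is."""
    if isinstance(key, float) and math.isnan(key):
        return _NAN
    return key


def map_zip_with(m1, m2, f, normalize=_normalize):
    """Zip two maps (lists of (key, value) pairs) by key.

    Keys are matched after applying `normalize`. A key missing from one
    side contributes None for that side, as in the SQL example above.
    """
    # normalized key -> [original key, value from m1, value from m2]
    # (a dict keeps insertion order, so output order follows first appearance)
    slots = {}
    for side, entries in enumerate((m1, m2)):
        for key, value in entries:
            slot = slots.setdefault(normalize(key), [key, None, None])
            slot[1 + side] = value
    return [(key, f(key, v1, v2)) for key, v1, v2 in slots.values()]


left = [(float("nan"), 1)]
right = [(float("nan"), 2)]

# Without normalization the two NaN objects never compare equal,
# so they occupy two separate slots (the "before" behaviour).
before = map_zip_with(left, right, lambda k, a, b: (a, b), normalize=lambda k: k)

# With normalization both NaNs land in the same slot (the "after" behaviour).
after = map_zip_with(left, right, lambda k, a, b: (a, b))

print([v for _, v in before])
print([v for _, v in after])
```

The design choice shown is to normalize keys before indexing rather than to change the comparison inside the hash map; the actual patch may resolve this differently.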