Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/23730 )
Change subject: IMPALA-14566: Add euclidean_distance and cosine_similarity functions for ARRAY<FLOAT> ...................................................................... Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/23730/7/be/src/exprs/vector-functions-ir.cc File be/src/exprs/vector-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/23730/7/be/src/exprs/vector-functions-ir.cc@65 PS7, Line 65: memcpy(&value, tuple_ptr + slot_offset, sizeof(float)); I wonder if you could avoid this memcpy by returning a float&, as in return (float) (tuple_ptr + slot_offset); and doing arithmetic with the references. I'm not 100% sure that will work. Also not sure if it'd be faster, but it'd be kind of fun to benchmark. http://gerrit.cloudera.org:8080/#/c/23730/7/be/src/exprs/vector-functions-ir.cc@119 PS7, Line 119: sum_squared_diff += static_cast<double>(diff * diff); This should avoid overflow. Please add some boundary tests to verify that. http://gerrit.cloudera.org:8080/#/c/23730/7/be/src/exprs/vector-functions.h File be/src/exprs/vector-functions.h: http://gerrit.cloudera.org:8080/#/c/23730/7/be/src/exprs/vector-functions.h@46 PS7, Line 46: /// Declaring the Helper functions under private to get a float value nit: this sentence is a bit verbose, declaring helper functions under private is obvious. If you're using significant contributions from generative AI tools, please review https://www.apache.org/legal/generative-tooling.html and add Generated-By: <tool> (<model>) to the commit message. An example would be https://gerrit.cloudera.org/c/22462/20//COMMIT_MSG#23. http://gerrit.cloudera.org:8080/#/c/23730/7/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/23730/7/common/function-registry/impala_functions.py@668 PS7, Line 668: # Vector distance functions for semantic search The description of this format would make more sense at the top of this list, it's not specific to the new functions. -- To view, visit http://gerrit.cloudera.org:8080/23730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id305acc87530d5d0e53613fe8df9a631ea4e1080 Gerrit-Change-Number: 23730 Gerrit-PatchSet: 7 Gerrit-Owner: Raghav Jindal <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Pranav Lodha <[email protected]> Gerrit-Reviewer: Raghav Jindal <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Tue, 09 Dec 2025 23:44:56 +0000 Gerrit-HasComments: Yes
