Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23730 )

Change subject: IMPALA-14566: Add euclidean_distance and cosine_similarity 
functions for ARRAY<FLOAT>
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/23730/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23730/5//COMMIT_MSG@16
PS5, Line 16:
I think we can add some tests. One place maybe be/src/exprs/expr-test.cc


http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc
File be/src/exprs/vector-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc@35
PS5, Line 35: Followed
nit. followed


http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc@116
PS5, Line 116:   // Handle NULL inputs
             :   if (vec1.is_null || vec2.is_null) {
             :     return DoubleVal::null();
             :   }
             :
             :   // Handle empty arrays
             :   if (vec1.num_tuples == 0 || vec2.num_tuples == 0) {
             :     ctx->SetError("Cosine similarity requires non-empty 
vectors");
             :     return DoubleVal::null();
             :   }
             :
             :   // Validate that vectors have the same length
             :   if (vec1.num_tuples != vec2.num_tuples) {
             :     ctx->SetError(
             :         "Vectors must have the same length for similarity 
calculation");
             :     return DoubleVal::null();
             :   }
It seems this logic is quite similar to the checks in EuclideanDistance(). 
Maybe we can use a helper function for these checks to reduce duplication?


http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc@169
PS5, Line 169: void VectorFunctions::VectorDistancePrepare(FunctionContext* ctx,
             :     FunctionContext::FunctionStateScope scope) {
             : }
             :
             : void VectorFunctions::VectorDistanceClose(FunctionContext* ctx,
             :     FunctionContext::FunctionStateScope scope) {
             : }
Maybe we can remove them if these are not currently being used to avoid dead 
code


http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions.h
File be/src/exprs/vector-functions.h:

http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions.h@18
PS5, Line 18: #ifndef IMPALA_EXPRS_VECTOR_FUNCTIONS_H
            : #define IMPALA_EXPRS_VECTOR_FUNCTIONS_H
For new headers, I think we can use "#pragma once" which is cleaner



--
To view, visit http://gerrit.cloudera.org:8080/23730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id305acc87530d5d0e53613fe8df9a631ea4e1080
Gerrit-Change-Number: 23730
Gerrit-PatchSet: 5
Gerrit-Owner: Raghav Jindal <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 03 Dec 2025 10:14:37 +0000
Gerrit-HasComments: Yes

Reply via email to