Raghav Jindal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23730 )

Change subject: IMPALA-14566: Add euclidean_distance and cosine_similarity 
functions for ARRAY<FLOAT>
......................................................................


Patch Set 5:

(2 comments)

> (5 comments)

http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc
File be/src/exprs/vector-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions-ir.cc@116
PS5, Line 116:   // Handle NULL inputs
             :   if (vec1.is_null || vec2.is_null) {
             :     return DoubleVal::null();
             :   }
             :
             :   // Handle empty arrays
             :   if (vec1.num_tuples == 0 || vec2.num_tuples == 0) {
             :     ctx->SetError("Cosine similarity requires non-empty 
vectors");
             :     return DoubleVal::null();
             :   }
             :
             :   // Validate that vectors have the same length
             :   if (vec1.num_tuples != vec2.num_tuples) {
             :     ctx->SetError(
             :         "Vectors must have the same length for similarity 
calculation");
             :     return DoubleVal::null();
             :   }
> It seems this logic is quite similar to the checks in EuclideanDistance().
Yes , u are right a helper function can be added for these checks to avoid 
duplication and for reuse. I will add 1 more function validatingvectors in 
header file which can take the function name as well in addition to both the 
vectors as parameters . That way I can pass func name from both eu_dst and 
cosine functions and print the error from that helper func having the function 
name in error trace.


http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions.h
File be/src/exprs/vector-functions.h:

http://gerrit.cloudera.org:8080/#/c/23730/5/be/src/exprs/vector-functions.h@18
PS5, Line 18: #ifndef IMPALA_EXPRS_VECTOR_FUNCTIONS_H
            : #define IMPALA_EXPRS_VECTOR_FUNCTIONS_H
> For new headers, I think we can use "#pragma once" which is cleaner
ok I will replace with this #pragma once .



--
To view, visit http://gerrit.cloudera.org:8080/23730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id305acc87530d5d0e53613fe8df9a631ea4e1080
Gerrit-Change-Number: 23730
Gerrit-PatchSet: 5
Gerrit-Owner: Raghav Jindal <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Raghav Jindal <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 03 Dec 2025 11:03:13 +0000
Gerrit-HasComments: Yes

Reply via email to