Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23730 )

Change subject: IMPALA-14566: Add euclidean_distance and cosine_similarity 
functions for ARRAY<FLOAT>
......................................................................


Patch Set 10:

(10 comments)

Thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/23730/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23730/10//COMMIT_MSG@9
PS10, Line 9: ations:
nit: commit message body should be no longer than 72 chars


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc
File be/src/exprs/vector-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@49
PS10, Line 49: tuple_size
Is tuple_size equals TUPLE_SIZE_FOR_FLOAT_ARRAY? In that case we could just 
remove the parameter, and use TUPLE_SIZE_FOR_FLOAT_ARRAY.

Same goes for SLOT_OFFSET.


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@51
PS10, Line 51:   // This function should only be called after checking for NULL 
elements.
             :   // If an element is NULL, the calling function should return 
NULL instead.
             :   DCHECK(!IsArrayElementNull(array_ptr, index, tuple_size, 
NULL_INDICATOR_OFFSET));
nit: You could move this to the beginning.


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@61
PS10, Line 61: int null_indicator_offset
Is it possible to call it with any other value than NULL_INDICATOR_OFFSET? If 
no, we could remove this argument.


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@64
PS10, Line 64:   // if Bit 0 = 1 then element is NULL
             :   //If Bit 0 = 0 then the element is NOT NULL
nit: could be just:

 // Bit 0 is the null-indicator bit


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@72
PS10, Line 72:   if (vec1.is_null || vec2.is_null) {
             :     return false;
             :   }
nit: fits single line

Same goes for the following 2 ifs as well


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions-ir.cc@89
PS10, Line 89: s
nit: missing space


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions.h
File be/src/exprs/vector-functions.h:

http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions.h@32
PS10, Line 32: //
nit: should be ///


http://gerrit.cloudera.org:8080/#/c/23730/10/be/src/exprs/vector-functions.h@53
PS10, Line 53: 0.0 if the element is NULL
I don't think it's guarenteed that 0.0 is returned in such case. Anyway, it 
should never happen as GetFloatFromArray() is only called if none of the 
elements are null.


http://gerrit.cloudera.org:8080/#/c/23730/10/testdata/workloads/functional-query/queries/QueryTest/vector-distance-functions.test
File 
testdata/workloads/functional-query/queries/QueryTest/vector-distance-functions.test:

http://gerrit.cloudera.org:8080/#/c/23730/10/testdata/workloads/functional-query/queries/QueryTest/vector-distance-functions.test@228
PS10, Line 228: ====
You should add negative tests for the case when 
euclidean_distance/cosine_similarity is invoked on different types such as 
scalar values, complex types other than ARRAY<FLOAT>. In such cases we just 
expect a useful error message.



--
To view, visit http://gerrit.cloudera.org:8080/23730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id305acc87530d5d0e53613fe8df9a631ea4e1080
Gerrit-Change-Number: 23730
Gerrit-PatchSet: 10
Gerrit-Owner: Raghav Jindal <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Raghav Jindal <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 08 Jan 2026 16:49:31 +0000
Gerrit-HasComments: Yes

Reply via email to