Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23493 )

Change subject: IMPALA-14472: Add create/read support for ARRAY column of Kudu
......................................................................


Patch Set 2: Code-Review+1

(12 comments)

Overall, this looks good to me; just a few questions and nits.

http://gerrit.cloudera.org:8080/#/c/23493/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23493/1//COMMIT_MSG@10
PS1, Line 10: intial
nit: initial


http://gerrit.cloudera.org:8080/#/c/23493/1//COMMIT_MSG@10
PS1, Line 10: support
            : to create and select Kudu table with array column type
nit: maybe, replace with
  support for working with Kudu tables having array type columns


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc
File be/src/exec/kudu/kudu-array-inserter.cc:

http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc@45
PS1, Line 45: const char* KUDU_MASTER_DEFAULT_ADDR = "localhost:7051"; // Same as in tests/conftest.py
            : const char* KUDU_TEST_TABLE_NAME = "impala::functional_kudu.kudu_array";
nit: these might be 'constexpr const char* const'
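
To illustrate the nit, here is a minimal sketch of the two declarations, 
assuming C++17 is available.  The values are copied from the quoted lines; the 
static_assert is my own addition, only to show that such constants can be used 
in constant expressions (a plain non-const 'const char*' global cannot).

```cpp
#include <string_view>

// Both the pointer and the characters it points to are immutable, and the
// values are known at compile time.
constexpr const char* const KUDU_MASTER_DEFAULT_ADDR = "localhost:7051";
constexpr const char* const KUDU_TEST_TABLE_NAME =
    "impala::functional_kudu.kudu_array";

// Illustration only (not part of the patch): the constant is usable in a
// constant expression.
static_assert(std::string_view(KUDU_MASTER_DEFAULT_ADDR) == "localhost:7051",
              "unexpected default master address");
```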


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc@79
PS1, Line 79: KUDU_ASSERT_OK
nit: is this meant to be KUDU_RETURN_NOT_OK or KUDU_CHECK_OK instead?  Since 
this isn't running in a gtest environment, I'd rather use KUDU_RETURN_NOT_OK 
and make the functions return kudu::Status.
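
A rough sketch of the shape I have in mind follows.  To keep it compilable on 
its own, 'Status' and SKETCH_RETURN_NOT_OK below are simplified stand-ins I 
made up for kudu::Status and KUDU_RETURN_NOT_OK, and ConnectToMaster(), 
InsertRows(), RunInserter() and ExitCodeFor() are hypothetical names, not 
functions from the patch.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <utility>

// Simplified stand-in for kudu::Status (assumption: only ok()/ToString()
// are needed for this illustration).
class Status {
 public:
  static Status OK() { return Status(true, "OK"); }
  static Status Error(std::string msg) {
    assert(!msg.empty());  // an error should always carry a message
    return Status(false, std::move(msg));
  }
  bool ok() const { return ok_; }
  const std::string& ToString() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Plays the role of KUDU_RETURN_NOT_OK: return from the enclosing function
// on the first failure instead of asserting.
#define SKETCH_RETURN_NOT_OK(expr) \
  do {                             \
    Status s_ = (expr);            \
    if (!s_.ok()) return s_;       \
  } while (0)

// Hypothetical steps of the tool; each one reports failure via Status.
Status ConnectToMaster(bool reachable) {
  return reachable ? Status::OK() : Status::Error("master unreachable");
}

Status InsertRows(int num_rows) {
  return num_rows >= 0 ? Status::OK() : Status::Error("bad row count");
}

// The first failing step short-circuits the remaining ones.
Status RunInserter(bool reachable, int num_rows) {
  SKETCH_RETURN_NOT_OK(ConnectToMaster(reachable));
  SKETCH_RETURN_NOT_OK(InsertRows(num_rows));
  return Status::OK();
}

// What main() could return: non-zero if anything went wrong.
int ExitCodeFor(const Status& s) {
  if (!s.ok()) {
    std::cerr << "inserter failed: " << s.ToString() << std::endl;
    return 1;
  }
  return 0;
}
```

With this shape, main() reduces to something like 
'return ExitCodeFor(RunInserter(...));'.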


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc@124
PS1, Line 124:   vector<KuduError*> errors;
             :   bool overflowed;
             :   session->GetPendingErrors(&errors, &overflowed);
Since this interface was designed to be C++98-compatible (i.e. no 
std::unique_ptr is available), the errors are handed out as raw pointers: when 
getting pending errors like this, it's necessary to free the memory of every 
KuduError that is returned.  This code snippet might serve as a reference 
(AFAIK, Impala's code also has ElementDeleter in gutil/stl_util.h): 
https://github.com/apache/kudu/blob/16689973a72e03649898c568d7ab423bc4bb8a35/src/kudu/client/client-test.cc#L2096-L2099
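
For illustration, a self-contained sketch of the idea.  FakeError and the free 
function GetPendingErrors() are stand-ins I made up for KuduError and 
KuduSession::GetPendingErrors(), and ScopedElementDeleter is a hypothetical 
helper with the same intent as ElementDeleter; in the real code I'd just use 
the existing ElementDeleter.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for kudu::client::KuduError; counts live instances so that
// leaks are easy to spot.
struct FakeError {
  explicit FakeError(std::string m) : message(std::move(m)) { ++live_count; }
  ~FakeError() { --live_count; }
  std::string message;
  static int live_count;
};
int FakeError::live_count = 0;

// Stand-in for KuduSession::GetPendingErrors(): fills the vector with raw
// pointers which the caller is responsible for deleting.
void GetPendingErrors(std::vector<FakeError*>* errors, bool* overflowed) {
  errors->push_back(new FakeError("row 1: already present"));
  errors->push_back(new FakeError("row 2: invalid argument"));
  *overflowed = false;
}

// Hypothetical helper: deletes every element of the vector when it goes
// out of scope.
template <typename T>
class ScopedElementDeleter {
 public:
  explicit ScopedElementDeleter(std::vector<T*>* v) : v_(v) {
    assert(v != nullptr);
  }
  ~ScopedElementDeleter() {
    for (T* p : *v_) delete p;
    v_->clear();
  }
  ScopedElementDeleter(const ScopedElementDeleter&) = delete;
  ScopedElementDeleter& operator=(const ScopedElementDeleter&) = delete;

 private:
  std::vector<T*>* v_;
};

// Prints the pending errors and returns how many there were; the errors
// are freed on every return path.
std::size_t ReportPendingErrors() {
  std::vector<FakeError*> errors;
  // Declared after 'errors', so it is destroyed first and cleans it up.
  ScopedElementDeleter<FakeError> deleter(&errors);
  bool overflowed = false;
  GetPendingErrors(&errors, &overflowed);
  for (const FakeError* e : errors) {
    std::cerr << "pending error: " << e->message << std::endl;
  }
  return errors.size();
}
```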


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc@129
PS1, Line 129:     KUDU_EXPECT_OK(error->status());
This doesn't make much sense: if there were errors, none of the statuses would 
be Status::OK.  Or is this just to print out the information on the errors?


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-array-inserter.cc@137
PS1, Line 137:   return 0;
Would it make sense to return a non-zero status if anything went wrong?


http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-scanner.cc
File be/src/exec/kudu/kudu-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/23493/1/be/src/exec/kudu/kudu-scanner.cc@402
PS1, Line 402:  else {
readability nit: 'else' isn't necessary since 'if' above contains 'return'
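
A tiny sketch of the pattern; PayloadSize() is a hypothetical function, not 
the code at this line:

```cpp
// Hypothetical helper, only to show the control flow.
constexpr int PayloadSize(bool is_null, int element_size, int num_elements) {
  if (is_null) {
    return 0;
  }
  // No 'else' needed: this point is reached only when the 'if' above
  // did not return.
  return element_size * num_elements;
}
```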


http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@496
PS1, Line 496:             "Cannot create table '%s': Type %s is not supported in Kudu",
             :             getTbl(), col.getType().toSql()));
It seems a 3rd parameter is missing (given the format string).


http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/main/java/org/apache/impala/util/KuduUtil.java
File fe/src/main/java/org/apache/impala/util/KuduUtil.java:

http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/main/java/org/apache/impala/util/KuduUtil.java@441
PS1, Line 441: } else {
readability nit: it's possible to omit the 'else' part of the clause because 
there is a 'return' in the 'if' part above


http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/23493/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@394
PS1, Line 394:     AnalyzesOk("create table tab (x ARRAY<INT> primary key) " +
             :         "partition by hash(x) partitions 3 stored as kudu",
             :         isExternalPurgeTbl
As of now, I'm not sure Kudu actually works as expected with array columns 
being part of the primary key.  We will probably need to add a guardrail on the 
Kudu side to explicitly state that it's not an option.  I'll clarify that and 
keep you posted.

I guess it's OK to keep this for a while: maybe it will be quite easy to add 
the missing functionality, and instead of reporting an explicit error on such a 
DDL statement, the underlying Kudu table will indeed be able to work as 
expected when using an array column as a part of the primary key :)


http://gerrit.cloudera.org:8080/#/c/23493/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/23493/1/testdata/datasets/functional/functional_schema_template.sql@4820
PS1, Line 4820:   array_DECIMAL ARRAY<DECIMAL(18,18)>
I'm curious: what drives the selection of array element types for this test 
scenario?  Is that about special cases in GetKuduArrayElementSize() 
implementation?

This looks good to me as-is, but I'd consider adding more columns to cover all 
the supported types, at least the following:
 * columns of floating point arrays (FLOAT, DOUBLE)
 * STRING (or BINARY) arrays
 * BOOL arrays: there might be some special handling required to work with raw 
data returned by ScanBatch::RowPtr::direct_data() in case of BOOL types -- the 
elements come as bytes (not a single bit per element)
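
To make the BOOL point concrete, here is a minimal sketch.  It assumes only 
what is stated above: the element values arrive as consecutive bytes, one byte 
per element, with non-zero meaning true.  DecodeBoolElements() is a 
hypothetical helper, and the sketch deliberately ignores everything else about 
the cell layout (e.g. how NULL elements are represented).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper.  Assumption: 'data' points at 'num_elements'
// consecutive bytes, one byte per BOOL element (not a packed bitmap).
std::vector<bool> DecodeBoolElements(const uint8_t* data,
                                     std::size_t num_elements) {
  assert(data != nullptr || num_elements == 0);
  std::vector<bool> out;
  out.reserve(num_elements);
  for (std::size_t i = 0; i < num_elements; ++i) {
    // Advance by one byte per element; treating the buffer as a bitmap
    // here would read the wrong values.
    out.push_back(data[i] != 0);
  }
  return out;
}
```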



--
To view, visit http://gerrit.cloudera.org:8080/23493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9282aac821bd30668189f84b2ed8fff7047e7310
Gerrit-Change-Number: 23493
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Xuebin Su <[email protected]>
Gerrit-Comment-Date: Fri, 03 Oct 2025 23:30:24 +0000
Gerrit-HasComments: Yes
