Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: allow converting timestamps to UTC when writing 
Kudu
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21492/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21492/2//COMMIT_MSG@16
PS2, Line 16: To be able to read back Kudu tables written by Impala correctly
            : convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
            : have the same value.
> +1
The reason for adding a new flag (and not changing the name of the old one) is 
backward compatibility. See discussion at IMPALA-12322


http://gerrit.cloudera.org:8080/#/c/21492/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21492/2/common/thrift/ImpalaService.thrift@882
PS2, Line 882: When true, UNIXTIME_MICRO columns read from Kudu will be 
interpreted as UTC an
> To be complete:
TIMESTAMP vs UNIXTIME_MICROS: Impala uses TIMESTAMP as SQL type (and does not 
support DATETIME). Impala TIMESTAMPs have no timezone associated with 
individual values (or individual columns), only a "global" (per query) timezone 
that affects some conversions.

Modified the comment, I hope that is is more precise now.


http://gerrit.cloudera.org:8080/#/c/21492/2/common/thrift/ImpalaService.thrift@944
PS2, Line 944:
> Ditto: please consider updating this similar to the comment for CONVERT_KUD
Modified the comment.


http://gerrit.cloudera.org:8080/#/c/21492/2/common/thrift/ImpalaService.thrift@946
PS2, Line 946: // written to Kudu as UNI
> Make both the query names similar. For read, it starts with CONVERT_ and fo
I agree that the current names are not too good, but 
"CONVERT_KUDU_UTC_TIMESTAMPS" is already released, so changing the name or the 
semantics could break existing workloads.


http://gerrit.cloudera.org:8080/#/c/21492/2/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/2/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@a3
PS2, Line 3:
> what's 'kutu'?
Done


http://gerrit.cloudera.org:8080/#/c/21492/2/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

http://gerrit.cloudera.org:8080/#/c/21492/2/tests/common/test_result_verifier.py@427
PS2, Line 427:     if file_format == 'avro' and 'TIMESTAMP' in expected_types:
> I don't know much about the context here, but Kudu still doesn't support TI
This is about verifying the results in our .test files. TIMESTAMP refers to the 
Impala data type returned by the query and file_format is about our test vector 
(we try to run the same queries with different file/table formats). e.g. this 
would also affect select cast(string_col as timestamp) from kudu_tbl; which 
doesn't even use UNIXTIME_MICRO.

So this looks like an old hack to me to skip some tests Kudu/Avro in .test 
files while running the others. It was a total surprise when I realized that 
the results of the tests I wrote is actually not checked.

This could still make sense for tests that use nanosecond timestamps as those 
are not supported in Kudu, but there are better tools to skip those tests.



--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Mon, 10 Jun 2024 14:12:54 +0000
Gerrit-HasComments: Yes

Reply via email to