Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
......................................................................


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG@16
PS3, Line 16: All DMLs supported with Kudu tables are affected affected:
            : INSERT, UPSERT, UPDATE, DELETE
            :
> Added planner tests.
Ack.
Is it worth to at least print WARNING in Planner.getExplainString() if those 
two option values are not equal?

If yes, please add test to show such WARNING is displayed in query profile.


http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift@883
PS6, Line 883: conversiyon
nit: conversion


http://gerrit.cloudera.org:8080/#/c/21492/3/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/21492/3/fe/src/main/java/org/apache/impala/util/ExprUtil.java@126
PS3, Line 126:   /**
             :    * Wraps 'timestampExpr' in to_utc_timestamp() that converts
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java@104
PS6, Line 104: Preconditions.checkArgument(timestampExpr.isConstant());
Is this Precondition and another one in L86 correct? I see some examples where 
utc_to_unix_micros() is not wrapping constant expression.

workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test:435:|
  runtime filters: RF000[bloom] <- c.timestamp_col, RF001[bloom] <- 
utc_to_unix_micros(c.timestamp_col)
workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test:506:|
  runtime filters: RF000[bloom] <- c.timestamp_col, RF001[bloom] <- 
utc_to_unix_micros(c.timestamp_col)
workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test:72:row_regex:
 .*RF00.\[bloom\] <- 
utc_to_unix_micros\(to_utc_timestamp\(from_utc_timestamp\(t2.ts, 
'Asia/Shanghai'\), 'Asia/Shanghai'\)\).*


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@753
PS6, Line 753: kudu-dml-with-utc-conversion
Do you mind adding 1 SELECT test where CONVERT_KUDU_UTC_TIMESTAMPS affect plan? 
I don't see 'utc_to_unix_micros' anywhere in test file.


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # The table is used to test DST changes in timestamps, the 
timestamps in the table near
             : # DST changes in the 'America/Los_Angeles' time zone.
Please add in comment:

"In 2011 Los Angeles, DST starts at 2 a.m. on Sunday, March 13 (clocks were 
turned forward 1 hour) and fall back to standard time again at 2 a.m. on 
Sunday, November 6 (clocks were turned backward 1 hour)."

https://www.timeanddate.com/time/change/usa?year=2011.


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test@75
PS6, Line 75:
Is this whitespace a defect from getExplainString() ?


http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test@20
PS5, Line 20: INT
question: This was BIGINT before and it does not break test when 
timestamp_at_dst_changes.id type is INT. Why is that?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@31
PS6, Line 31: tkey timestamp primary key, t timestamp,
nit: maybe following name is better for clarity:

ts_pk_col timestamp primary key, ts_col timestamp,


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@66
PS6, Line 66: select * from utc_kudu where id > 10;
Add "order by id" for readability?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@79
PS6, Line 79: select * from utc_kudu where id > 10;
Add "order by id" for readability?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@129
PS6, Line 129: select count(*) from utc_kudu where tkey = t;
Do you mind changing this from select count star to select star for clarity?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@138
PS6, Line 138: select * from utc_kudu where id > 10;
Add "order by id" for readability?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@149
PS6, Line 149: select * from utc_kudu where id > 10;
Add "order by id" for readability?


http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py@108
PS6, Line 108: 
cls.ImpalaTestMatrix.add_mandatory_exec_option('write_kudu_utc_timestamps', 
'true')
There is another query option that control timestamp conversion: 
use_local_tz_for_unix_timestamp_conversions. It will affect result of 
from_unixtime().
Can you explicitly set that to false (default value) for this test?



--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Fri, 14 Jun 2024 17:47:43 +0000
Gerrit-HasComments: Yes

Reply via email to