Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21492 )
Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu ...................................................................... Patch Set 6: (15 comments) http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG@16 PS3, Line 16: All DMLs supported with Kudu tables are affected affected: : INSERT, UPSERT, UPDATE, DELETE : > Added planner tests. Ack. Is it worth to at least print WARNING in Planner.getExplainString() if those two option values are not equal? If yes, please add test to show such WARNING is displayed in query profile. http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift@883 PS6, Line 883: conversiyon nit: conversion http://gerrit.cloudera.org:8080/#/c/21492/3/fe/src/main/java/org/apache/impala/util/ExprUtil.java File fe/src/main/java/org/apache/impala/util/ExprUtil.java: http://gerrit.cloudera.org:8080/#/c/21492/3/fe/src/main/java/org/apache/impala/util/ExprUtil.java@126 PS3, Line 126: /** : * Wraps 'timestampExpr' in to_utc_timestamp() that converts > Done Done http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java File fe/src/main/java/org/apache/impala/util/ExprUtil.java: http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java@104 PS6, Line 104: Preconditions.checkArgument(timestampExpr.isConstant()); Is this Precondition and another one in L86 correct? I see some examples where utc_to_unix_micros() is not wrapping constant expression. workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test:435:| runtime filters: RF000[bloom] <- c.timestamp_col, RF001[bloom] <- utc_to_unix_micros(c.timestamp_col) workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test:506:| runtime filters: RF000[bloom] <- c.timestamp_col, RF001[bloom] <- utc_to_unix_micros(c.timestamp_col) workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test:72:row_regex: .*RF00.\[bloom\] <- utc_to_unix_micros\(to_utc_timestamp\(from_utc_timestamp\(t2.ts, 'Asia/Shanghai'\), 'Asia/Shanghai'\)\).* http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@753 PS6, Line 753: kudu-dml-with-utc-conversion Do you mind adding 1 SELECT test where CONVERT_KUDU_UTC_TIMESTAMPS affect plan? I don't see 'utc_to_unix_micros' anywhere in test file. http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv File testdata/datasets/functional/schema_constraints.csv: http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425 PS6, Line 425: # The table is used to test DST changes in timestamps, the timestamps in the table near : # DST changes in the 'America/Los_Angeles' time zone. Please add in comment: "In 2011 Los Angeles, DST starts at 2 a.m. on Sunday, March 13 (clocks were turned forward 1 hour) and fall back to standard time again at 2 a.m. on Sunday, November 6 (clocks were turned backward 1 hour)." https://www.timeanddate.com/time/change/usa?year=2011. http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test: http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test@75 PS6, Line 75: Is this whitespace a defect from getExplainString() ? http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test File testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test: http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test@20 PS5, Line 20: INT question: This was BIGINT before and it does not break test when timestamp_at_dst_changes.id type is INT. Why is that? http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test File testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test: http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@31 PS6, Line 31: tkey timestamp primary key, t timestamp, nit: maybe following name is better for clarity: ts_pk_col timestamp primary key, ts_col timestamp, http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@66 PS6, Line 66: select * from utc_kudu where id > 10; Add "order by id" for readability? http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@79 PS6, Line 79: select * from utc_kudu where id > 10; Add "order by id" for readability? http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@129 PS6, Line 129: select count(*) from utc_kudu where tkey = t; Do you mind changing this from select count star to select star for clarity? http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@138 PS6, Line 138: select * from utc_kudu where id > 10; Add "order by id" for readability? http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@149 PS6, Line 149: select * from utc_kudu where id > 10; Add "order by id" for readability? http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py@108 PS6, Line 108: cls.ImpalaTestMatrix.add_mandatory_exec_option('write_kudu_utc_timestamps', 'true') There is another query option that control timestamp conversion: use_local_tz_for_unix_timestamp_conversions. It will affect result of from_unixtime(). Can you explicitly set that to false (default value) for this test? -- To view, visit http://gerrit.cloudera.org:8080/21492 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1 Gerrit-Change-Number: 21492 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Ashwani Raina <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Fri, 14 Jun 2024 17:47:43 +0000 Gerrit-HasComments: Yes
