Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20131 )

Change subject: IMPALA-12152: Add query options to wait for HMS events sync up
......................................................................


Patch Set 31:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/20131/30//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20131/30//COMMIT_MSG@37
PS30, Line 37: The wait time can
             : be set to the largest lag of event processing that has been 
observed in
             : the cluster.
> Done. Usually users have some charts to track the lag. It's shown in the ca
Done


http://gerrit.cloudera.org:8080/#/c/20131/30//COMMIT_MSG@51
PS30, Line 51:  Compilation: 1s321ms
> Maybe "Continuing without syncing Metastore events" is better. We don't kno
Ack


http://gerrit.cloudera.org:8080/#/c/20131/31//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20131/31//COMMIT_MSG@39
PS31, Line 39: catalogd WebUI and metrics
Is it events-processor.lag-time? Can you spell out the metrics name in commit 
message please?


http://gerrit.cloudera.org:8080/#/c/20131/31//COMMIT_MSG@84
PS31, Line 84: Currently, UPDATE_TBL_COL_STAT_EVENT, UPDATE_PART_COL_STAT_EVENT,
             : OPEN_TXN events are ignored by the event processor. If the 
latest event
             : happens to be in these types and there are no more other events, 
the
             : last synced event id can never reach the latest event id.
Question: Does this mean SYNC_HMS_EVENTS_WAIT_TIME_S will be spent in full 
waiting and fail?


http://gerrit.cloudera.org:8080/#/c/20131/30/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/20131/30/be/src/catalog/catalog-server.cc@465
PS30, Line 465:       status = catalog_server_->catalog()->WaitForHmsEvent(req, 
&resp);
> > 1. Not wait indefinitely?
OK. Can you put comment in ImpalaService.thrift or somewhere else that is 
relevant to not change SYNC_HMS_EVENTS_WAIT_TIME_S and 
SYNC_HMS_EVENTS_STRICT_MODE to non-default value cluster wide (ie., though 
--default_query_options)?

To me, this is a class of query option that might be detrimental to performance 
(and might be cascading) if set to arbitrary high default value, cluster wide. 
We should warn to be careful about it, even only through docs, until further 
code improvement is implemented.


http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/analysis/StatementBase.java
File fe/src/main/java/org/apache/impala/analysis/StatementBase.java:

http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@73
PS31, Line 73: public String getParsedDb()
Can we add @Nullable annotation here?
Some IDE can do static analysis and warn if access does not check for null 
possibility.


http://gerrit.cloudera.org:8080/#/c/20131/30/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20131/30/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@4232
PS30, Line 4232: req.timeout_s * 1000L
> That's a fair point. The network latency might be trivial and can be ignore
Ack. Please file a follow up JIRA for that.


http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@46
PS31, Line 46: long getLastSyncedEventId();
Maybe add default value for new interface method here?


http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1523
PS31, Line 1523: long timeoutMs, Reference<Boolean> success
nit: I think you should create a class two the returned String with success 
state.

I see we use apache.impala.common.Reference a lot in CatalogD code. The pattern 
make sense in C++, but feels convoluted in Java where we can easily create a 
container class and maybe reuse it somewhere else relevant.

For example, occurrence of "Reference<Boolean> tblWasRemoved, 
Reference<Boolean> dbWasAdded" deserve to have its own container class to 
emphasize why both are significant and exist together.


http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1549
PS31, Line 1549: hmsEventSyncSleepIntervalMs_
Should this be min(timeoutMs, hmsEventSyncSleepIntervalMs_) ?


http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/20131/31/fe/src/main/java/org/apache/impala/service/Frontend.java@2294
PS31, Line 2294: queryCtx.setWarnings(errorMsg);
Why we need to setWarnings(errorMsg) if errorMsg is already printed in Query 
timeline?

Why not set any warning during planning as an InfoString under FrontendProfile 
and let backend to grab it from ClientRequestState? (see also: 
ClientRequestState::SetFrontendProfile)


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/20131/31/tests/common/impala_test_suite.py@383
PS31, Line 383: get_client
Rename to 'default_impala_client'.

It clarify the intent of using previously initialized client targeting Impala 
node vs creating a new client.

FYI, we also have hive_client inside ImpalaTestSuite, so 
'default_impala_client' will not be ambiguous.
It will be nice to converge the naming with this patch of mine: 
https://gerrit.cloudera.org/c/22336/


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/common/impala_test_suite.py@1378
PS31, Line 1378: {"sync_hms_events_wait_time_s": timeout_s, 
"sync_hms_events_strict_mode": True}
Can define a dictionary constant and share this with wait_for_db_to_appear.


http://gerrit.cloudera.org:8080/#/c/20131/30/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/20131/30/tests/custom_cluster/test_events_custom_configs.py@1530
PS30, Line 1530:
> Done. I think this is a custom cluster test that runs serially so it's ok t
self.get_client(vector.get_value('protocol')) is good. Thanks.


http://gerrit.cloudera.org:8080/#/c/20131/30/tests/custom_cluster/test_events_custom_configs.py@1541
PS30, Line 1541:
> The query will fail immediately regardless of the wait time since catalogd
Ack


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/20131/31/tests/custom_cluster/test_events_custom_configs.py@1427
PS31, Line 1427:     handle = client.execute_async(query)
Close this handle before you run another execute_async at L1444.
Do note that HS2 client currently can not run 2 concurrent queries together at 
the same time.
https://gerrit.cloudera.org/c/22362/


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/20131/31/tests/metadata/test_event_processing.py@214
PS31, Line 214: class TestEventSyncWaiting(ImpalaTestSuite):
Override add_test_dimensions so you can define query option that you must have 
for this test.

  def add_test_dimensions(cls):
    super(TestEventSyncWaiting, cls).add_test_dimensions()
    cls.ImpalaTestMatrix.add_dimension(create_single_exec_option_dimension())
    add_mandatory_exec_option(cls, 'sync_hms_events_wait_time_s', 10)
    add_mandatory_exec_option(cls, 'sync_hms_events_strict_mode', True)

Thus, you dont need EVENT_SYNC_QUERY_OPTIONS anymore and simply use 
vector.get_exec_option_dict()


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/metadata/test_event_processing_base.py
File tests/metadata/test_event_processing_base.py:

http://gerrit.cloudera.org:8080/#/c/20131/31/tests/metadata/test_event_processing_base.py@30
PS31, Line 30:   def _run_test_insert_events_impl(cls, hive_client, 
impala_client, impala_cluster,
             :       unique_database, is_transactional=False):
This method modify and clear existing configuration of impala_client.
To ensure correctness, please make sure impala_client is a new client and not 
one of default impala clients inside ImpalaTestSuite.


http://gerrit.cloudera.org:8080/#/c/20131/31/tests/metadata/test_event_processing_base.py@144
PS31, Line 144:   def _run_event_based_replication_tests_impl(cls, hive_client, 
impala_client,
              :       impala_cluster, filesystem_client, transactional=True):
This method modify and clear existing configuration of impala_client.
To ensure correctness, please make sure impala_client is a new client and not 
one of default impala clients inside ImpalaTestSuite.



--
To view, visit http://gerrit.cloudera.org:8080/20131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
Gerrit-Change-Number: 20131
Gerrit-PatchSet: 31
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Comment-Date: Tue, 21 Jan 2025 18:59:29 +0000
Gerrit-HasComments: Yes

Reply via email to