[email protected] has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
......................................................................


Patch Set 27:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21031/26//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/26//COMMIT_MSG@30
PS26, Line 30:    only after the alter database event is processed.
> Consider the scenario: a database event, and then there are a bunch of tabl
we cannot rename a db


http://gerrit.cloudera.org:8080/#/c/21031/26/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21031/26/be/src/catalog/catalog-server.cc@217
PS26, Line 217: DEFINE_int32(hms_event_polling_interval_ms, 0,
> Can you add some description about why we need a ms interval when we alread
This flag is to set the polling interval of catalogd in milliseconds precision. 
It is not specific to enable_hierarchical_event_processing flag. But when 
enable_hierarchical_event_processing is true, we can set 
hms_event_polling_interval_ms value to < 1000(ms) because the event fetching 
thread just dispatches the events to appropriate event executors for 
processing(i.e., actual event processing do not happen on the event fetching 
thread). This allows the catalogd to catchup faster with HMS.


http://gerrit.cloudera.org:8080/#/c/21031/26/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/21031/26/be/src/common/global-flags.cc@294
PS26, Line 294:     "hms_event_polling_interval_ms is used to specify the 
polling interval in "
> Should we say here that 'hms_event_polling_interval_ms' is used by hierarch
hms_event_polling_interval_ms can be used to configure polling intercal in ms 
precision. It is not specific to enable_hierarchical_event_processing flag.


http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@637
PS26, Line 637:       
processEventProcessorDebugAction(ddlRequest.getQuery_options().getDebug_action());
> Do we need a debug action when we have this: https://github.com/apache/impa
It is used in /tests/custom_cluster/test_event_processing_perf.py


http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/main/java/org/apache/impala/util/DebugUtils.java
File fe/src/main/java/org/apache/impala/util/DebugUtils.java:

http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/main/java/org/apache/impala/util/DebugUtils.java@99
PS26, Line 99:   public static final String EVENT_PROCESSOR = "event_processor";
> nit: EVENT_PROCESSOR_STATE sounds more intuitive to me.
We can set the event processor debug action like below:
event_processor:state@resume
event_processor:state@hold

So have not used state again in the action label as it becomes redundant. In 
future if we want to add any other actions to event_processor, we can extend 
with event_processor:new_action@action_value format.


http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
File 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java:

http://gerrit.cloudera.org:8080/#/c/21031/26/fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java@31
PS26, Line 31: pollingFrequencyInMilliSec
> Why change polling frequency to millisec when hierarchical event processor
JniCatalog.getEventsProcessor() reads the event polling interval 
configuration(hms_event_polling_interval_ms, hms_event_polling_interval_s) and 
pass the configured interval to MetastoreEventsProcessor.getInstance() static 
method. A new instance of MetastoreEventsProcessor is created with that 
interval. So have passed time in millisec directly. Since 
SynchronousHMSEventProcessorForTests extends  MetastoreEventsProcessor, have 
made the corresponding changes here.


http://gerrit.cloudera.org:8080/#/c/21031/26/tests/custom_cluster/test_event_processing_perf.py
File tests/custom_cluster/test_event_processing_perf.py:

http://gerrit.cloudera.org:8080/#/c/21031/26/tests/custom_cluster/test_event_processing_perf.py@310
PS26, Line 310:       start_event_id = self.__hold_event_processing()
> Why is this commented?
Had commented since tests take long time to run. I have uncommented it now.


http://gerrit.cloudera.org:8080/#/c/21031/26/tests/custom_cluster/test_event_processing_perf.py@352
PS26, Line 352:   def __drop_databases(self, exec_fn):
> Should compare the performance with and without a hierarchical event proces
These are time consuming tests. Have disabled the tests from running by 
default. Have added separate tests to run with and without hierarchical event 
processing enabled for different cases(external, transaction, non-partitioned, 
partitioned etc). At the end of the each test, have logged the execution time 
for each operation. Need to capture the time logs manually for each test.


http://gerrit.cloudera.org:8080/#/c/21031/26/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21031/26/tests/custom_cluster/test_events_custom_configs.py@1345
PS26, Line 1345:
> Why do we need to set the log level to trace?
Removed it



--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 27
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Tue, 26 Nov 2024 12:31:49 +0000
Gerrit-HasComments: Yes

Reply via email to