Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23159 )

Change subject: IMPALA-14082: Support batch processing of RELOAD events on same 
table
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/23159/6/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/23159/6/tests/custom_cluster/test_events_custom_configs.py@578
PS6, Line 578:
> > reload events don't rely on selfevent() (instead they rely on isOlderEven
This config was introduced to fix the test failure caused by this query:
check_self_events(
        "refresh {}.{} partition(year=2022) partition(year=2023) partition(year=2023)"
        .format(unique_database, test_reload_table), 2)

Without this patch: self-events from the above query are detected in
MetastoreEvents#ReloadEvent#processTableEvent(), which checks whether the
current event is older than lastRefreshEventId. If so, we increment the
self-events-skipped counter.

With this patch: since the 2 reload events are batched together, we end up in
MetastoreEvents#BatchPartitionEvent#processTableEvent(), which first checks
for a self-event (reload events don't have self-event evaluation) and then
calls olderEvent(). That method checks whether the config
'enable_skipping_older_events' is enabled. If it is not enabled, the event is
processed, so the above check_self_events() fails because no self-events are
detected.
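
To make the difference between the two paths concrete, here is a toy Python model of the control flow described above. It is only a sketch, not the catalog code: the function names, the Counters class, the `<=` comparison, and counting one skip per event in a batch are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    # Hypothetical stand-ins for the event processor metrics.
    self_events_skipped: int = 0
    events_processed: int = 0


def process_reload_event(event_id, last_refresh_event_id, counters):
    """Unbatched path: the older-event check always runs."""
    # Assumption: "older" means event_id <= last_refresh_event_id.
    if event_id <= last_refresh_event_id:
        counters.self_events_skipped += 1
        return
    counters.events_processed += 1


def process_batch_event(event_ids, last_refresh_event_id, counters,
                        enable_skipping_older_events):
    """Batched path: the older-event check is gated by the config flag."""
    for event_id in event_ids:
        # Reload events have no self-event evaluation, so only the
        # older-event check can skip them, and only if the flag is set.
        is_older = (enable_skipping_older_events
                    and event_id <= last_refresh_event_id)
        if is_older:
            counters.self_events_skipped += 1
        else:
            counters.events_processed += 1


if __name__ == "__main__":
    without_flag = Counters()
    process_batch_event([5, 6], 10, without_flag,
                        enable_skipping_older_events=False)
    with_flag = Counters()
    process_batch_event([5, 6], 10, with_flag,
                        enable_skipping_older_events=True)
    print(without_flag, with_flag)
```

In this model the batched path with the flag disabled processes both events and skips none, which mirrors why check_self_events() would see no skipped self-events without the config.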


http://gerrit.cloudera.org:8080/#/c/23159/6/tests/custom_cluster/test_events_custom_configs.py@647
PS6, Line 647:       req.partitionVals = ["2022"]
> So the purpose of this test is not matched anymore since no events are skip
Ack


http://gerrit.cloudera.org:8080/#/c/23159/7/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/23159/7/tests/custom_cluster/test_events_custom_configs.py@631
PS7, Line 631:       # Test to verify if older events are being skipped in event processor
> Let's add a comment to explain how the test works. IIUC, we fire 10 consecu
This variable name is misleading
(https://github.com/apache/impala/commit/32b29ff36fb3e05fd620a6714de88805052d0117#diff-aa3676424d9768430ba864ec080fe180c63c0a48406057c1568cc5691ef015dd);
let me rename it to something more intuitive.
This is essentially a scenario that fires reload events from Hive.


http://gerrit.cloudera.org:8080/#/c/23159/7/tests/custom_cluster/test_events_custom_configs.py@808
PS7, Line 808:     self.client.execute("create table {} (i int) partitioned by(p int)".format(tbl))
> Let's create the table as text format so we can read the new files correctl
Ack


http://gerrit.cloudera.org:8080/#/c/23159/7/tests/custom_cluster/test_events_custom_configs.py@818
PS7, Line 818: 000
> Let's use a different value other than 0 so we can verify the content as we
Ack



--
To view, visit http://gerrit.cloudera.org:8080/23159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3e9a99b666a1c928ac2a136bded1e5420f77dab
Gerrit-Change-Number: 23159
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Wed, 10 Sep 2025 04:20:39 +0000
Gerrit-HasComments: Yes
