Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23487 )

Change subject: IMPALA-14131: Add flag to configure the default value of 
'impala.disableHmsSync'
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/23487/5/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/23487/5/be/src/catalog/catalog-server.cc@323
PS5, Line 323: all objects
CREATE/DROP/ALTER_DATABASE events are still processed, right? Otherwise how 
could catalogd detect that the DB level property was changed.


http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py@1848
PS5, Line 1848:     self.client.execute("invalidate metadata {}".format(tbl1))
This is not a new behavior, but what happens with loaded tables in a database 
when 'impala.disableHmsSync' is set to 'false' for the database? Do we 
invalidate them (as they may have skipped events) or this is something that the 
user has to do manually?


http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py@1849
PS5, Line 1849:     self.client.execute("invalidate metadata {}".format(tbl2))
              :     self.run_stmt_in_hive(
              :       """insert into {tb1} partition(year=2025) values(1),(2);
              :          insert into {tb2} values(1),(2),(3);"""
              :       .format(tb1=tbl1, tb2=tbl2))
              :     EventProcessorUtils.wait_for_event_processing(self)
              :     tb1_data = self.client.execute("select * from 
{}".format(tbl1))
              :     assert len(tb1_data.data) == 2
              :     # disable hms sync at database leve
This doesn't seem to validate that Impala actually detects the events, as the 
tables are invalidated first so the select will lead to a fresh metadata load.

Another issue is that both the partitioned and the unpartitioned tables are 
written, but only the partitioned is checked.

These kind of tests are both verbose and error prone - maybe some 
functions/classes could be used to avoid these issues. E.g. 
check_insert_event(tlb_name, expected_rows, values= "(1)",part=None) where part 
could be set for partitioned tables. Or same functions with list arguments that 
does this for multiple tables at the same time to avoid waiting for event 
processing multiple times.



--
To view, visit http://gerrit.cloudera.org:8080/23487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ee617aed48575502d9cf5cf2cbea6ec897d6839
Gerrit-Change-Number: 23487
Gerrit-PatchSet: 5
Gerrit-Owner: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Tue, 18 Nov 2025 11:39:24 +0000
Gerrit-HasComments: Yes

Reply via email to