Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/23487 )
Change subject: IMPALA-14131: Add flag to configure the default value of 'impala.disableHmsSync' ...................................................................... Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/23487/5/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/23487/5/be/src/catalog/catalog-server.cc@323 PS5, Line 323: all objects CREATE/DROP/ALTER_DATABASE events are still processed, right? Otherwise how could catalogd detect that the DB level property was changed. http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py@1848 PS5, Line 1848: self.client.execute("invalidate metadata {}".format(tbl1)) This is not a new behavior, but what happens with loaded tables in a database when 'impala.disableHmsSync' is set to 'false' for the database? Do we invalidate them (as they may have skipped events) or this is something that the user has to do manually? http://gerrit.cloudera.org:8080/#/c/23487/5/tests/custom_cluster/test_events_custom_configs.py@1849 PS5, Line 1849: self.client.execute("invalidate metadata {}".format(tbl2)) : self.run_stmt_in_hive( : """insert into {tb1} partition(year=2025) values(1),(2); : insert into {tb2} values(1),(2),(3);""" : .format(tb1=tbl1, tb2=tbl2)) : EventProcessorUtils.wait_for_event_processing(self) : tb1_data = self.client.execute("select * from {}".format(tbl1)) : assert len(tb1_data.data) == 2 : # disable hms sync at database leve This doesn't seem to validate that Impala actually detects the events, as the tables are invalidated first so the select will lead to a fresh metadata load. Another issue is that both the partitioned and the unpartitioned tables are written, but only the partitioned is checked. These kind of tests are both verbose and error prone - maybe some functions/classes could be used to avoid these issues. E.g. check_insert_event(tlb_name, expected_rows, values= "(1)",part=None) where part could be set for partitioned tables. Or same functions with list arguments that does this for multiple tables at the same time to avoid waiting for event processing multiple times. -- To view, visit http://gerrit.cloudera.org:8080/23487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4ee617aed48575502d9cf5cf2cbea6ec897d6839 Gerrit-Change-Number: 23487 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Tue, 18 Nov 2025 11:39:24 +0000 Gerrit-HasComments: Yes
