Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/23804 )
Change subject: IMPALA-13863: Add metric to track number of loaded tables in catalogd ...................................................................... Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/23804/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/23804/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2823 PS1, Line 2823: // Update loaded tables counter based on table state transitions. : if (existingTbl != null) { : boolean wasLoaded = !(existingTbl instanceof IncompleteTable); : boolean isLoaded = !(table instanceof IncompleteTable); : : if (wasLoaded && !isLoaded) { : numLoadedTables_.decrementAndGet(); : LOG.trace("Replaced loaded table {} with IncompleteTable, " + : "loaded tables count: {}", : table.getFullName(), numLoadedTables_.get()); : } else if (!wasLoaded && isLoaded) { : numLoadedTables_.incrementAndGet(); : LOG.trace("Replaced IncompleteTable with loaded table {}, " + : "loaded tables count: {}", : table.getFullName(), numLoadedTables_.get()); : } : } else if (!(table instanceof IncompleteTable)) { : // New table being added and it's loaded : numLoadedTables_.incrementAndGet(); : LOG.trace("Added new loaded table {}, loaded tables count: {}", : table.getFullName(), numLoadedTables_.get()); : } I see we are adding codes like this before all the call sites of Db#addTable(). It has a high risk of omission when other developers adding new codes that use Db#addTable(). It'd be better to move this code into Db#addTable(). The right place might be in CatalogObjectCache#add() which is a synchronized method. We can extend this method to add an optional argument for an update function which checks the existing and new items and update the counter. public synchronized boolean add(T catalogObject, BiConsumer<T, T> updateFunc) { CatalogObjectCache#remove() also needs an update function. http://gerrit.cloudera.org:8080/#/c/23804/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/23804/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@854 PS1, Line 854: removeTable I think we should update the counter here if tblToBeRemoved is a loaded table. One scenario is loading a table in Impala and then drop it in Hive. Processing the DROP_TABLE event uses this method. http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py File tests/custom_cluster/test_automatic_invalidation.py: http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@105 PS1, Line 105: def _get_metric_value(self, metric_name): : """Helper to get a metric value from catalogd metrics endpoint.""" : import json : import requests : url = "http://localhost:25020/metrics?json" : response = requests.get(url) : metrics_json = json.loads(response.text) : for metric in metrics_json["metric_group"]["metrics"]: : if metric["name"] == metric_name: : return int(metric["value"]) : return None Can we use the existing get_metric_value() method? https://github.com/apache/impala/blob/85d77b908b12ae3d3f48ed5d49f38fb3832edc4e/tests/common/impala_service.py#L101-L114 e.g. self.cluster.catalogd.service.get_metric_value(metric_name) like https://github.com/apache/impala/blob/85d77b908b12ae3d3f48ed5d49f38fb3832edc4e/tests/custom_cluster/test_custom_statestore.py#L230-L231 http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@122 PS1, Line 122: and removal.""" There are some scenarios that worth testing: * Reload a loaded table, e.g. REFRESH, ALTER statements. * Rename a loaded table. http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@125 PS1, Line 125: time.sleep(2) Why do we need this sleep? http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@156 PS1, Line 156: assert count_after_global_invalidate < count_with_table, ( Shouldn't we expect count_after_global_invalidate == 0 and count_with_table == 1? http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@164 PS1, Line 164: count_after_create = self._get_metric_value("catalog.num-loaded-tables") Can we assert count_after_create == 0? http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@207 PS1, Line 207: assert count_before_timeout >= 1, ( Can we assert count_before_timeout == 1, or what other tables could be loaded at this point? http://gerrit.cloudera.org:8080/#/c/23804/1/tests/custom_cluster/test_automatic_invalidation.py@220 PS1, Line 220: assert count_after_timeout < count_before_timeout, ( Can we assert count_after_timeout == 0? -- To view, visit http://gerrit.cloudera.org:8080/23804 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5aa54f9f7507709b654df22e24592799811e8b6c Gerrit-Change-Number: 23804 Gerrit-PatchSet: 1 Gerrit-Owner: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Pranav Lodha <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Thu, 25 Dec 2025 13:29:12 +0000 Gerrit-HasComments: Yes
