Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21663 )
Change subject: IMPALA-13126: Obtain table read lock in EP to process partitioned event ...................................................................... Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/21663/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21663/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1300 PS6, Line 1300: partitionEventObj Do we need to take the lock if partitionEventObj is null? http://gerrit.cloudera.org:8080/#/c/21663/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1811 PS6, Line 1811: if (isOlderEvent(null)) { : infoLog("Not processing the alter table event {} as it is an older event", : getEventId()); : return; : } : : // Determine whether this is an event which we have already seen or if it is a new : // event : if (isSelfEvent()) { : infoLog("Not processing the event as it is a self-event"); : return; : } : // Ignore the event if this is a trivial event. See javadoc for : // canBeSkipped() for examples. : if (canBeSkipped()) { The change could make the event processor "more blocking" by waiting for the table lock in cases where it was not necessary before. The first "skip check" is isOlderEvent(), which was very cheap before the change, but now can block if there is a parallel operation on the table. Meanwhile canBeSkipped() does not need the lock + isSelfEvent() also doesn't need the lock if the event is table level. It would be better to reorder the checks to do the cheapest ones first. http://gerrit.cloudera.org:8080/#/c/21663/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@3141 PS6, Line 3141: reloadPartition_ Do we need to take the lock if reloadPartition_ is null? http://gerrit.cloudera.org:8080/#/c/21663/6/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21663/6/tests/custom_cluster/test_events_custom_configs.py@1334 PS6, Line 1334: @pytest.mark.execute_serially could be added as this modifies a shared table http://gerrit.cloudera.org:8080/#/c/21663/6/tests/custom_cluster/test_events_custom_configs.py@1348 PS6, Line 1348: if not exists Is it possible for the partition to exist at this point? The test could be made stricter by removing "if not exists" http://gerrit.cloudera.org:8080/#/c/21663/6/tests/custom_cluster/test_events_custom_configs.py@1347 PS6, Line 1347: self.client.execute( : "alter table {} add if not exists partition(j=-1)".format(tbl)) : self.client.execute( : "alter table {} drop partition(j=-1)".format(tbl)) Can't these statements lead to throwing an exception? This would lead to not running line 1357 and leaving the extra partition in the shared table -- To view, visit http://gerrit.cloudera.org:8080/21663 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26933f98556736f66df986f9440ebb64be395bc1 Gerrit-Change-Number: 21663 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Mon, 06 Jan 2025 15:09:32 +0000 Gerrit-HasComments: Yes
