Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21031 )
Change subject: IMPALA-12709: Add support for hierarchical metastore event processing ...................................................................... Patch Set 36: (12 comments) http://gerrit.cloudera.org:8080/#/c/21031/27//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21031/27//COMMIT_MSG@72 PS27, Line 72: synchronize the source and target TableProcessors to > Have fixed the FE tests changes. Cool! http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@996 PS36, Line 996: public static class PseudoCommitTxnEvent extends MetastoreTableEvent Just a thought: our code can be cleaner if we can resolve IMPALA-13285 first and move CommitTxnEvent back to MetastoreEvents.java. Then this can be there as well. http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@1095 PS36, Line 1095: event.getCatalogOpExecutor().getCatalog() : .getTable(entry.getKey().getDb(), entry.getKey().getTbl()) : .getMetaStoreTable() The msTbl object in catalog might be stale. Shouldn't we get it from the HMS event like L1069? http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java: http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@54 PS36, Line 54: DBBarrierEvent nit: we tend to use "Db" in names, i.e. "DbBarrierEvent". E.g. isBlacklistedDb, eventDb, HdfsOrcScanner (for ORC), etc. http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@149 PS36, Line 149: dbEvents_ nit: maybe "barrierEvents_" is clearer. At the beginning, I thought it's all the db events. http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@226 PS36, Line 226: /* nit: we prefer only using C-style comments /* */ in class or method comments, not in codes inside the methods. http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@248 PS36, Line 248: if (!tableProcessor.isEmpty()) : tableProcessor.enqueue(barrierEvent); nit: put them in one line or add {} http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@374 PS36, Line 374: dispatchTableEvent(event); It seems we should also skip non-rename events if the CREATE_DATABASE event is skipped at L371. E.g. for the following events: create database tmp; create table tmp.tbl; insert tmp.tbl; alter table tmp.tbl rename to prod.tbl; drop database tmp; If we skip the first CREATE_DATABASE event, will processing the CREATE_TABLE event fail due to db not exists? http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@705 PS36, Line 705: checkState nit: checkArgument() might be better since it throws IllegalArgumentException in case users set an invalid value. Preconditions.checkArgument(numDbEventExecutor > 0, "num_db_event_executors should be positive: %s", numDbEventExecutor); http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1273 PS36, Line 1273: And lastSyncedEventTimeSecs_ accounts only for event dispatched time in : hierarchical mode. Ideally we can maintain lastSyncedEventId as min(event_id of all unprocessed events) - 1, and lastSyncedEventTimeSecs_ as the event_time of it. Please file a follow-up JIRA for this. http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@78 PS36, Line 78: nit: just needs one space indention http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@158 PS36, Line 158: } else if (event instanceof DropTableEvent) { Should we also consider the DropTable kind of RenameTableBarrierEvent? -- To view, visit http://gerrit.cloudera.org:8080/21031 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6 Gerrit-Change-Number: 21031 Gerrit-PatchSet: 36 Gerrit-Owner: Anonymous Coward <k.venureddy2...@gmail.com> Gerrit-Reviewer: Anonymous Coward <cclive1...@gmail.com> Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com> Gerrit-Comment-Date: Wed, 12 Feb 2025 00:36:58 +0000 Gerrit-HasComments: Yes