Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
......................................................................


Patch Set 36:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/21031/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/27//COMMIT_MSG@72
PS27, Line 72:    synchronize the source and target TableProcessors to
> Have fixed the FE tests changes.
Cool!


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@996
PS36, Line 996:   public static class PseudoCommitTxnEvent extends 
MetastoreTableEvent
Just a thought: our code can be cleaner if we can resolve IMPALA-13285 first 
and move CommitTxnEvent back to MetastoreEvents.java. Then this can be there as 
well.


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@1095
PS36, Line 1095:             event.getCatalogOpExecutor().getCatalog()
               :                 .getTable(entry.getKey().getDb(), 
entry.getKey().getTbl())
               :                 .getMetaStoreTable()
The msTbl object in catalog might be stale. Shouldn't we get it from the HMS 
event like L1069?


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java
File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java:

http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@54
PS36, Line 54: DBBarrierEvent
nit: we tend to use "Db" in names, i.e. "DbBarrierEvent". E.g. isBlacklistedDb, 
eventDb, HdfsOrcScanner (for ORC), etc.


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@149
PS36, Line 149: dbEvents_
nit: maybe "barrierEvents_" is clearer. At the beginning, I thought it's all 
the db events.


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@226
PS36, Line 226: /*
nit: we prefer only using C-style comments /* */ in class or method comments, 
not in codes inside the methods.


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@248
PS36, Line 248:           if (!tableProcessor.isEmpty())
              :             tableProcessor.enqueue(barrierEvent);
nit: put them in one line or add {}


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@374
PS36, Line 374:           dispatchTableEvent(event);
It seems we should also skip non-rename events if the CREATE_DATABASE event is 
skipped at L371. E.g. for the following events:

create database tmp;
create table tmp.tbl;
insert tmp.tbl;
alter table tmp.tbl rename to prod.tbl;
drop database tmp;

If we skip the first CREATE_DATABASE event, will processing the CREATE_TABLE 
event fail due to db not exists?


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@705
PS36, Line 705: checkState
nit: checkArgument() might be better since it throws IllegalArgumentException 
in case users set an invalid value.

      Preconditions.checkArgument(numDbEventExecutor > 0,
          "num_db_event_executors should be positive: %s", numDbEventExecutor);


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1273
PS36, Line 1273:            And lastSyncedEventTimeSecs_ accounts only for 
event dispatched time in
               :            hierarchical mode.
Ideally we can maintain lastSyncedEventId as min(event_id of all unprocessed 
events) - 1, and lastSyncedEventTimeSecs_ as the event_time of it. Please file 
a follow-up JIRA for this.


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@78
PS36, Line 78:
nit: just needs one space indention


http://gerrit.cloudera.org:8080/#/c/21031/36/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@158
PS36, Line 158:       } else if (event instanceof DropTableEvent) {
Should we also consider the DropTable kind of RenameTableBarrierEvent?



--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 36
Gerrit-Owner: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <cclive1...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Wed, 12 Feb 2025 00:36:58 +0000
Gerrit-HasComments: Yes

Reply via email to