Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20533 )

Change subject: IMPALA-12463: Batch non-consecutive table events in the event 
processor
......................................................................


Patch Set 7:

(4 comments)

I looped the test that was failing for a few hours (a few hundred iterations) 
and I wasn't able to reproduce the test failure. The logs show that there is no 
batching involved for that test. For that test, all of the events are on a 
single table. If there is no batching, then the sorting means that everything 
stays in the same order in came in.

Rebased and adapted the new test cases to the changes in IMPALA-10949.

http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@400
PS7, Line 400:         } else if (next instanceof AlterTableEvent) {
> Shouldn't we consider create/drop table events as batch-breaking events?
There is already batch-breaking logic via the canBeBatched() method. Only 
InsertEvent and AlterPartitionEvent implement it in a way that can return true. 
As long as events for an individual table hash to the same place, the existing 
batch-breaking logic will apply on the sequence of events for that table. So, 
create/drop table behaves the same way it does today, breaking any existing 
batch for the table involved.

This section is meant to handle additional batch breaking that the regular 
batch breaking may not handle, e.g. events that apply across multiple tables or 
across a table that would not be the one we hash to.


http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@405
PS7, Line 405:             flushBatchForTable(pendingTableEventsMap, 
sortedFinalBatches, beforeTable);
> IMO, we should also consider the scenario where the rename table happens ac
beforeTable and afterTable are org.apache.hadoop.hive.metastore.api.Table 
objects with the both database and table included, so they are treated as fully 
qualified table names.

Was there additional flushing that you had in mind?


http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@408
PS7, Line 408:             // an invalid scenario, because the destination must 
not exist at the time
> This is a possible scenario, In the current queue, There are table events f
There definitely needs to be batch breaking, but the combination of the 
preexisting batch-breaking logic and the fact that events are sorted by end 
Event ID makes it hard to construct a valid sequence that isn't already handled 
by other code.

To take your example, the drop event on t1 breaks the t1 batch. The alter table 
rename hashes to the start table t2 and breaks the t2 batch.

I still think it is useful to make the handling explicit to be safe.

I will reword the comment to make this clearer.


http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@426
PS7, Line 426:           dbMap = pendingTableEventsMap.get(dbName);
> Shouldn't we just assign a new HashMap<>() directly to the dbMap variable?
Good point, fixed



--
To view, visit http://gerrit.cloudera.org:8080/20533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905
Gerrit-Change-Number: 20533
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 18 Dec 2023 07:34:14 +0000
Gerrit-HasComments: Yes

Reply via email to