Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when partition list is stale
......................................................................


Patch Set 17:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/21437/17//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21437/17//COMMIT_MSG@18
PS17, Line 18: the partitions
nit: we are not loading all the partitions. To be specifit, "the target 
partitions" might be better.


http://gerrit.cloudera.org:8080/#/c/21437/17//COMMIT_MSG@19
PS17, Line 19: ofevent
nit: need a space


http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1243
PS17, Line 1243: .setIsPreLoadForInsert(false)
nit: let's keep one method call in one line since they are more like an 
statement


http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/catalog/HdfsTableLoadParams.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTableLoadParams.java:

http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/catalog/HdfsTableLoadParams.java@76
PS17, Line 76: getFileMetadata
nit: can we use a better name? getFileMetadata() sounds like getting the file 
descriptors. Something like 'isLoadPartitionFileMetadata' or 
'isSetLoadPartitionFileMetadata' might be better.


http://gerrit.cloudera.org:8080/#/c/21437/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21437/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7279
PS13, Line 7279: he createEventId for the partitions.
> Is the new name any better?
Done


http://gerrit.cloudera.org:8080/#/c/21437/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21437/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7260
PS16, Line 7260: ta(true).setL
> I wanted to pass by the value 'partsToCreate' so that the original value of 
> this is intact for future reference down in the method.

I don't see this HdfsTableLoadParams object is used anymore after this preload. 
The set is not added into the HdfsTable so it can't be used anymore. Or did I 
miss any usages down in this method?


http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21437/17/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7260
PS17, Line 7260: .reuseMetadata(true).setLoadPartitionFileMetadata(false)
               :               
.setLoadTableSchema(true).setRefreshUpdatedPartitions(true)
               :               .setPartitionsToUpdate(new 
HashSet<>(partsToCreate))
               :               
.setDebugAction(update.getDebug_action()).setReason("Preload for INSERT")
               :               
.setIsPreLoadForInsert(true).setCatalogTimeline(catalogTimeline).build()
nit: it'd be more clear to put one method at a line.

          ((HdfsTable) table).load(
              new HdfsTableLoadParamsBuilder(msClient.getHiveClient(), msTbl)
                  .reuseMetadata(true)
                  .setLoadPartitionFileMetadata(false)
                  .setLoadTableSchema(true)
                  .setRefreshUpdatedPartitions(true)
                  .setPartitionsToUpdate(new HashSet<>(partsToCreate))
                  .setDebugAction(update.getDebug_action())
                  .setReason("Preload for INSERT")
                  .setIsPreLoadForInsert(true)
                  .setCatalogTimeline(catalogTimeline)
                  .build());


http://gerrit.cloudera.org:8080/#/c/21437/17/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21437/17/tests/custom_cluster/test_events_custom_configs.py@1295
PS17, Line 1295:           .format(unique_database, test_table))
Can we add another new and non-empty partition in Hive? Just want to see if 
it's visible in Impala after the INSERT.



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 17
Gerrit-Owner: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Fri, 12 Jul 2024 08:39:45 +0000
Gerrit-HasComments: Yes

Reply via email to