[GitHub] [flink] empcl commented on a diff in pull request #19573: [FLINK-27384] solve the problem that the latest data cannot be read under the creat…

GitBox Thu, 28 Apr 2022 09:57:52 -0700


empcl commented on code in PR #19573:
URL: https://github.com/apache/flink/pull/19573#discussion_r861119613



##########
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/read/HivePartitionFetcherContextBase.java:
##########
@@ -139,16 +138,9 @@ public List<ComparablePartitionValue> getComparablePartitionValueList() throws E
                                 tablePath.getDatabaseName(),
                                 tablePath.getObjectName(),
                                 Short.MAX_VALUE);
-                List<String> newNames =
-                        partitionNames.stream()
-                                .filter(
-                                        n ->
-                                                !partValuesToCreateTime.containsKey(
-                                                        extractPartitionValues(n)))
-                                .collect(Collectors.toList());
                 List<Partition> newPartitions =
                         metaStoreClient.getPartitionsByNames(
-                                tablePath.getDatabaseName(), tablePath.getObjectName(), newNames);
+                                tablePath.getDatabaseName(), tablePath.getObjectName(), partitionNames);

Review Comment:
   @luoyuxia Hi, this is a great idea. However, 
getComparablePartitionValueList() is meant to return all partitions in the 
current directory. Comparing them and selecting the required partitions is 
done by the caller, in the outer layer of this method. I would also not 
recommend having this method return only the required partitions directly, 
because in the bounded-stream case all partition information is obtained 
through this method.
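
To illustrate the split I mean, here is a minimal sketch. It is not the actual Flink code: the class `PartitionSelectionSketch`, the methods `fetchAllPartitions` and `selectFrom`, the in-memory map standing in for the metastore, and the "create time at or after an offset" rule are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSelectionSketch {

    // Hypothetical stand-in for getComparablePartitionValueList():
    // returns every partition with its create time and does no filtering.
    public static Map<String, Long> fetchAllPartitions(Map<String, Long> metastore) {
        return new LinkedHashMap<>(metastore);
    }

    // Hypothetical outer layer for the streaming case (assumed rule):
    // keep only partitions whose create time is at or after the offset.
    public static List<String> selectFrom(Map<String, Long> all, long offset) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> e : all.entrySet()) {
            if (e.getValue() >= offset) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up partitions and create times, for illustration only.
        Map<String, Long> metastore = new LinkedHashMap<>();
        metastore.put("dt=2022-04-26", 100L);
        metastore.put("dt=2022-04-27", 200L);

        // Bounded case: the caller uses the full list as-is.
        Map<String, Long> all = fetchAllPartitions(metastore);
        System.out.println(all.keySet());

        // Streaming case: the caller narrows the same list itself.
        System.out.println(selectFrom(all, 150L));
    }
}
```

With this shape the bounded read and the streaming read share one fetch method, and only the caller decides which partitions it needs.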



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
