vingov commented on code in PR #5294: URL: https://github.com/apache/hudi/pull/5294#discussion_r848111267
########## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java: ########## @@ -477,14 +479,19 @@ private Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchFromSourc } boolean shouldCombine = cfg.filterDupes || cfg.operation.equals(WriteOperationType.UPSERT); + boolean isDropPartitionColumns = props.getBoolean(DROP_PARTITION_COLUMNS.key()); + String partitionColumns = HoodieSparkUtils.getPartitionColumns(keyGenerator, props); + List<String> listOfPartitionColumns = Arrays.asList(partitionColumns.split(",")); JavaRDD<GenericRecord> avroRDD = avroRDDOptional.get(); + HoodieKey key = keyGenerator.getKey(avroRDD.first()); JavaRDD<HoodieRecord> records = avroRDD.map(gr -> { + gr = isDropPartitionColumns ? HoodieAvroUtils.removeFields(gr, listOfPartitionColumns) : gr; Review Comment: There are no tests for the whole `readFromSource` or `fetchFromSource`, it will take more time to add UT for the entire method, can I take it up as a separate task, since this PR is a blocker for this release. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org