mynameborat commented on a change in pull request #951: SAMZA-2127: Upgrade to
Kafka 2.0
URL: https://github.com/apache/samza/pull/951#discussion_r268893914
##########
File path:
samza-kafka/src/main/java/org/apache/samza/system/kafka/KafkaSystemAdmin.java
##########
@@ -628,10 +589,12 @@ public void validateStream(StreamSpec streamSpec) throws StreamValidationExcepti
@Override
public void deleteMessages(Map<SystemStreamPartition, String> offsets) {
if (deleteCommittedMessages) {
- if (adminClientForDelete == null) {
- adminClientForDelete = kafka.admin.AdminClient.create(createAdminClientProperties());
- }
- KafkaSystemAdminUtilsScala.deleteMessages(adminClientForDelete, offsets);
+ Map<TopicPartition, RecordsToDelete> recordsToDelete = offsets.entrySet()
+ .stream()
+ .collect(Collectors.toMap(entry ->
+ new TopicPartition(entry.getKey().getStream(), entry.getKey().getPartition().getPartitionId()),
+ entry -> RecordsToDelete.beforeOffset(Long.parseLong(entry.getValue()) + 1)));
+ adminClient.deleteRecords(recordsToDelete);
Review comment:
Good point. I was torn on this, since the existing behavior wasn't blocking
either and only makes a best-effort attempt at the delete. I debated between
retaining the existing behavior and blocking. Blocking here would mean
blocking the commit thread, which seems like a significant cost, especially
given that this is only an optimization to reduce the data footprint of our
intermediate topics, so I'd prefer to keep it non-blocking.
Let me log an error in case of failures on this future.
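A rough sketch of what I have in mind, not the final patch. To keep it
dependency-free, a plain CompletableFuture stands in for the future returned
by the admin client, and the class name, method names and the ERRORS list
(a stand-in for the logger) are all hypothetical. It also spells out the
"+ 1" from the diff: the delete covers everything before the given offset,
so the committed offset itself is included.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class BestEffortDelete {
    // Stand-in for a logger so the sketch has no dependencies (hypothetical).
    static final List<String> ERRORS = new CopyOnWriteArrayList<>();

    // Mirrors the "+ 1" in the diff: records before the returned offset are
    // deleted, so the committed offset itself is part of the delete.
    static long deleteBeforeOffset(String committedOffset) {
        return Long.parseLong(committedOffset) + 1;
    }

    // Attaches a callback and returns immediately; the caller (the commit
    // thread in our case) never waits on the future.
    static void logOnFailure(String partition, CompletableFuture<Void> deleteFuture) {
        deleteFuture.whenComplete((ignored, error) -> {
            if (error != null) {
                ERRORS.add("Failed to delete records for " + partition + ": "
                    + error.getMessage());
            }
        });
    }
}
```

The callback runs on whichever thread completes the future, so a failed
delete only shows up as a log entry and the commit path is unaffected.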