Re: [PR] KAFKA-19820: remove the unnecessary copy from AbstractFetch#fetchablePartitions [kafka]

via GitHub Mon, 27 Oct 2025 21:49:01 -0700


TaiJuWu commented on PR #20745:
URL: https://github.com/apache/kafka/pull/20745#issuecomment-3454546799


   > Thanks for the PR.
   > 
   > Is there any reason not to make this change at 
`SubscriptionState.fetchablePartitions()`? That is...
   > 
   > ```diff
   > @@ -482,9 +482,9 @@ public class SubscriptionState {
   >      }
   >  
   >      // Visible for testing
   > -    public synchronized List<TopicPartition> 
fetchablePartitions(Predicate<TopicPartition> isAvailable) {
   > +    public synchronized Set<TopicPartition> 
fetchablePartitions(Predicate<TopicPartition> isAvailable) {
   >          // Since this is in the hot-path for fetching, we do this instead 
of using java.util.stream API
   > -        List<TopicPartition> result = new ArrayList<>();
   > +        Set<TopicPartition> result = new HashSet<>();
   >          assignment.forEach((topicPartition, topicPartitionState) -> {
   >              // Cheap check is first to avoid evaluating the predicate if 
possible
   >              if 
((subscriptionType.equals(SubscriptionType.AUTO_TOPICS_SHARE) || 
isFetchableAndSubscribed(topicPartition, topicPartitionState))
   > ```
   > 
   > As @chia7712 mentioned 
[here](https://github.com/apache/kafka/pull/14359#issuecomment-3372367774), 
it's already a distinct set, so we could reduce ambiguity by addressing the 
problem at "the source."
   > 
   > That would still require the change to remove the duplicate set in 
`AbstractFetch` as well as a minor change to `ShareConsumeRequestManager` and 
`SubscriptionStateTest`.
   > 
   > Thoughts?
   
   If we do that change it will impact performance, in current version
   ```
   Benchmark                                                   (partitionCount) 
 (topicCount)  Mode  Cnt  Score   Error  Units
   SubscriptionStateBenchmark.testFetchablePartitions                        50 
         5000  avgt   15  8.994 ± 0.149  ms/op
   SubscriptionStateBenchmark.testHasAllFetchPositions                       50 
         5000  avgt   15  5.593 ± 0.475  ms/op
   SubscriptionStateBenchmark.testPartitionsNeedingValidation                50 
         5000  avgt   15  5.580 ± 0.074  ms/op
   JMH benchmarks done
   ```
   
   commit 1ba40720e4
   ```
   Benchmark                                                   (partitionCount) 
 (topicCount)  Mode  Cnt   Score   Error  Units
   SubscriptionStateBenchmark.testFetchablePartitions                        50 
         5000  avgt   15  36.920 ± 1.396  ms/op
   SubscriptionStateBenchmark.testHasAllFetchPositions                       50 
         5000  avgt   15   4.441 ± 0.110  ms/op
   SubscriptionStateBenchmark.testPartitionsNeedingValidation                50 
         5000  avgt   15   5.349 ± 0.052  ms/op
   JMH benchmarks done
   ```
   
   We can see the hot path performance is becoming poor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] KAFKA-19820: remove the unnecessary copy from AbstractFetch#fetchablePartitions [kafka]

Reply via email to