TaiJuWu commented on PR #20745:
URL: https://github.com/apache/kafka/pull/20745#issuecomment-3454546799
> Thanks for the PR.
>
> Is there any reason not to make this change at
`SubscriptionState.fetchablePartitions()`? That is...
>
> ```diff
> @@ -482,9 +482,9 @@ public class SubscriptionState {
> }
>
> // Visible for testing
> - public synchronized List<TopicPartition>
fetchablePartitions(Predicate<TopicPartition> isAvailable) {
> + public synchronized Set<TopicPartition>
fetchablePartitions(Predicate<TopicPartition> isAvailable) {
> // Since this is in the hot-path for fetching, we do this instead
of using java.util.stream API
> - List<TopicPartition> result = new ArrayList<>();
> + Set<TopicPartition> result = new HashSet<>();
> assignment.forEach((topicPartition, topicPartitionState) -> {
> // Cheap check is first to avoid evaluating the predicate if
possible
> if
((subscriptionType.equals(SubscriptionType.AUTO_TOPICS_SHARE) ||
isFetchableAndSubscribed(topicPartition, topicPartitionState))
> ```
>
> As @chia7712 mentioned
[here](https://github.com/apache/kafka/pull/14359#issuecomment-3372367774),
it's already a distinct set, so we could reduce ambiguity by addressing the
problem at "the source."
>
> That would still require the change to remove the duplicate set in
`AbstractFetch` as well as a minor change to `ShareConsumeRequestManager` and
`SubscriptionStateTest`.
>
> Thoughts?
If we do that change it will impact performance, in current version
```
Benchmark (partitionCount)
(topicCount) Mode Cnt Score Error Units
SubscriptionStateBenchmark.testFetchablePartitions 50
5000 avgt 15 8.994 ± 0.149 ms/op
SubscriptionStateBenchmark.testHasAllFetchPositions 50
5000 avgt 15 5.593 ± 0.475 ms/op
SubscriptionStateBenchmark.testPartitionsNeedingValidation 50
5000 avgt 15 5.580 ± 0.074 ms/op
JMH benchmarks done
```
commit 1ba40720e4
```
Benchmark (partitionCount)
(topicCount) Mode Cnt Score Error Units
SubscriptionStateBenchmark.testFetchablePartitions 50
5000 avgt 15 36.920 ± 1.396 ms/op
SubscriptionStateBenchmark.testHasAllFetchPositions 50
5000 avgt 15 4.441 ± 0.110 ms/op
SubscriptionStateBenchmark.testPartitionsNeedingValidation 50
5000 avgt 15 5.349 ± 0.052 ms/op
JMH benchmarks done
```
We can see the hot path performance is becoming poor.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]