pdolif commented on issue #24694: URL: https://github.com/apache/pulsar/issues/24694#issuecomment-3263950936
This seems to be a general issue whenever the PulsarClient memory limit is disabled. According to the ClientBuilder docs, the memory limit can be disabled by setting it to 0.

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ClientBuilder.java#L479-L491

This is what happens in the PerformanceConsumer unless a limit is explicitly provided with the --memory-limit argument. While the ClientConfigurationData has a default memory limit of 64M, the PerformanceBaseArguments have no default (i.e., the value is 0).

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ClientConfigurationData.java#L363-L367

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-testclient/src/main/java/org/apache/pulsar/testclient/PerformanceBaseArguments.java#L109-L111

When the PulsarClient used by the PerformanceConsumer is created in PerfClientUtils, the memory limit from the arguments overwrites the default of 64M:

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-testclient/src/main/java/org/apache/pulsar/testclient/PerfClientUtils.java#L73-L74

If no limit is given through --memory-limit, the value 0 from the arguments is applied, and the memory limit is disabled.

Now comes the actual bug. To decide whether to scale up its receiver queue, the consumer gets the memory usage percent from the MemoryLimitController.

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerBase.java#L230-L235

Since the memory limit is 0 (disabled), the MemoryLimitController performs a division by 0:

https://github.com/apache/pulsar/blob/0a949de4bfa3734194be87a8655763a4411be1b6/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MemoryLimitController.java#L141-L143

As a result, a usage of NaN is returned. Because a value is present, the consumer does not fall back to the `orElse(0d)`, and since any comparison with NaN evaluates to false, the condition `usage < MEMORY_THRESHOLD_FOR_RECEIVER_QUEUE_SIZE_EXPANSION` can never be fulfilled.

I have two proposals for how to fix this:

1. https://github.com/pdolif/pulsar/pull/14
2. https://github.com/pdolif/pulsar/pull/15

Please see the PR descriptions for the details. A test reproducing the issue can be found in both PRs. I would be happy to submit one of the PRs.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
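The failure mode described in the comment (a limit of 0, a division by zero, and a NaN that defeats the `<` check) can be reproduced in isolation. The following is a minimal sketch, not the Pulsar code itself: the class `DisabledMemoryLimitSketch`, its method names, and the threshold value of 0.75 are assumptions made for illustration, and `guardedUsagePercent` is only one conceivable guard, not necessarily what either linked PR does.

```java
import java.util.Optional;

public class DisabledMemoryLimitSketch {

    // Assumed value; stands in for MEMORY_THRESHOLD_FOR_RECEIVER_QUEUE_SIZE_EXPANSION.
    static final double EXPANSION_THRESHOLD = 0.75;

    // Simplified stand-in for the usage calculation: current usage divided by the limit.
    // With a limit of 0 this is a floating-point division by zero.
    static double usagePercent(long currentUsage, long memoryLimit) {
        return 1.0 * currentUsage / memoryLimit;
    }

    // Same shape as the consumer-side check: orElse(0d) only applies when
    // no value is present, so a NaN that is present passes straight through.
    static boolean mayExpand(Optional<Double> usagePercent) {
        double usage = usagePercent.orElse(0d);
        return usage < EXPANSION_THRESHOLD;
    }

    // Hypothetical guard: treat a disabled limit (<= 0) as zero usage.
    static double guardedUsagePercent(long currentUsage, long memoryLimit) {
        return memoryLimit <= 0 ? 0d : 1.0 * currentUsage / memoryLimit;
    }

    public static void main(String[] args) {
        double usage = usagePercent(0, 0);
        System.out.println(usage);                                              // NaN
        System.out.println(mayExpand(Optional.of(usage)));                      // false
        System.out.println(mayExpand(Optional.empty()));                        // true
        System.out.println(mayExpand(Optional.of(guardedUsagePercent(0, 0))));  // true
    }
}
```

In Java, `0.0 / 0.0` yields NaN and a positive value divided by `0.0` yields positive infinity; neither is ever less than a finite threshold, so the expansion check fails in both cases.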
