Jason-Whitmore commented on issue #24789:
URL: https://github.com/apache/pulsar/issues/24789#issuecomment-3363788357

   I've looked into this issue (which seems related to #24508) and I think it's 
caused by a race condition when testing the `producer_exception` 
RetentionPolicy.
   
   The sequence of events is:
   1. When the `producer_exception` RetentionPolicy is in use and the 
backlog is at capacity, the broker sends a 
`ServerError.ProducerBlockedQuotaExceededException` error to the producer 
(ServerCnx:1788).
   
   2. The producer receives this error and fails its pending messages 
(ProducerImpl:2023).
   
   3. In the test, an assert statement (ReplicatorTest:1049) tries to retrieve 
the time delay on the first message in the producer's pending messages.
   
   If step 2 completes before step 3, the delay lookup returns 0.0 (since 
there are no pending messages left), and the assert statement fails.
   If step 3 completes before step 2, the test passes.
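   
   To illustrate the two orderings, here is a minimal, self-contained sketch. 
It does not use Pulsar's real classes: `PendingQueue`, `failAll`, and 
`firstMessageDelayMillis` are hypothetical stand-ins for the producer's 
pending-message queue, step 2, and step 3, and the delay is modeled as whole 
milliseconds for simplicity. Each ordering is run deterministically rather than 
with real threads.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PendingDelayRace {

    /** Hypothetical stand-in for a producer's pending-message queue. */
    static final class PendingQueue {
        // Creation timestamps (ms) of messages that are still pending.
        private final Deque<Long> createdAtMillis = new ArrayDeque<>();

        synchronized void add(long createdAt) {
            createdAtMillis.addLast(createdAt);
        }

        // Models step 2: the producer fails and drops every pending message.
        synchronized void failAll() {
            createdAtMillis.clear();
        }

        // Models step 3: the delay of the first pending message, or 0 if none.
        synchronized long firstMessageDelayMillis(long nowMillis) {
            Long first = createdAtMillis.peekFirst();
            return first == null ? 0L : nowMillis - first;
        }
    }

    // Ordering "step 3 before step 2": the assertion sees a positive delay.
    static long delayWhenReadFirst() {
        PendingQueue queue = new PendingQueue();
        queue.add(1_000L);
        long delay = queue.firstMessageDelayMillis(1_500L);
        queue.failAll();
        return delay;
    }

    // Ordering "step 2 before step 3": the queue is already empty, so 0.
    static long delayWhenFailedFirst() {
        PendingQueue queue = new PendingQueue();
        queue.add(1_000L);
        queue.failAll();
        return queue.firstMessageDelayMillis(1_500L);
    }

    public static void main(String[] args) {
        System.out.println("step 3 before step 2: " + delayWhenReadFirst());
        System.out.println("step 2 before step 3: " + delayWhenFailedFirst());
    }
}
```

   In the real test the two steps run on different threads, so which ordering 
occurs depends on scheduling; the sketch only shows why the observed value 
differs between them.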
   
   
   I see two possible solutions:
   1. Remove the `producer_exception` RetentionPolicy from the test since its 
behavior (removing everything in the pending messages) seems inappropriate for 
the test. Perhaps a separate test could be written for this RetentionPolicy.
   
   2. Change the behavior of the RetentionPolicy such that the pending messages 
are not removed when the exception occurs.
   
   This is my first time working with Pulsar, so I'm not sure what the proper 
solution might be.
   
   


