[ https://issues.apache.org/jira/browse/KAFKA-16651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Pedersen updated KAFKA-16651:
----------------------------------
Description:

The JavaDoc for {{KafkaProducer#send(ProducerRecord, Callback)}} claims that it will throw a {{TimeoutException}} if blocking on fetching metadata or allocating memory surpasses {{max.block.ms}}:

{quote}Throws:
{{TimeoutException}} - If the time taken for fetching metadata or allocating memory for the record has surpassed max.block.ms.{quote}

([link|https://kafka.apache.org/36/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html#send(org.apache.kafka.clients.producer.ProducerRecord,org.apache.kafka.clients.producer.Callback)])

But this is not the case. Because {{TimeoutException}} is an {{ApiException}}, it hits [this catch|https://github.com/a0x8o/kafka/blob/54eff6af115ee647f60129f2ce6a044cb17215d0/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1073-L1084], which results in a failed future being returned instead of the exception being thrown.

The "allocating memory" part likely changed as part of [KAFKA-3720|https://github.com/apache/kafka/pull/8399/files#diff-43491ffa1e0f8d28db071d8c23f1a76b54f1f20ea98cf6921bfd1c77a90446abR29], which changed the base exception for buffer exhaustion exceptions to {{TimeoutException}}. Timing out while waiting on metadata suffers the same issue, but it is not clear whether this has always been the case.

This is a discrepancy between documentation and behavior, so the question is which of the two should be adjusted.

On that point, being able to differentiate between synchronous timeouts (caused by waiting on metadata or allocating memory) and asynchronous timeouts (e.g. timing out while waiting for acks) is useful. In the former case we _know_ that the broker has not received the record, but in the latter the broker _may_ have received it and only the ack failed to be delivered, and our actions might vary because of this.
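A minimal sketch of the observed behavior. It uses {{CompletableFuture}} as a stand-in for the future returned by the producer and a local {{TimeoutException}} class in place of {{org.apache.kafka.common.errors.TimeoutException}}, so it compiles without the Kafka client on the classpath; the {{send}} method and its message text are simulated, not the real producer:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Stand-in for org.apache.kafka.common.errors.TimeoutException.
class TimeoutException extends RuntimeException {
    TimeoutException(String msg) { super(msg); }
}

public class SendTimeoutSketch {
    // Mimics KafkaProducer#send under the current behavior: the
    // TimeoutException raised while blocking (metadata fetch or buffer
    // allocation) is caught internally and returned as a failed future
    // rather than thrown to the caller.
    static CompletableFuture<String> send() {
        CompletableFuture<String> f = new CompletableFuture<>();
        f.completeExceptionally(new TimeoutException("simulated max.block.ms timeout"));
        return f;
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            CompletableFuture<String> future = send(); // does NOT throw, despite the javadoc
            future.get();                              // the timeout surfaces here instead
        } catch (TimeoutException e) {
            // Never reached: the documented synchronous throw does not happen.
            System.out.println("thrown synchronously: " + e.getMessage());
        } catch (ExecutionException e) {
            System.out.println("delivered via future: " + e.getCause().getMessage());
        }
    }
}
```

Running this prints "delivered via future: …", illustrating that a caller wrapping {{send()}} in a try/catch per the javadoc never sees the exception.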
The current behavior makes these hard to differentiate, since both result in a {{TimeoutException}} being delivered via the callback. Currently we rely on the exception message, but that is an implementation detail that may change at any time. Therefore I would suggest either:
* Reverting to the documented behavior of throwing in case of synchronous timeouts
* Correcting the javadoc and introducing an exception base class/interface for synchronous timeouts
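As a sketch of what the second option could enable on the caller side: with a marker interface for timeouts that occur before the record ever leaves the client, callers could replace message-string matching with an {{instanceof}} check. All names here ({{SynchronousTimeout}}, {{MetadataWaitTimeoutException}}) are hypothetical and invented for illustration; nothing with these names exists in the client today:

```java
// Hypothetical marker for timeouts raised before the record left the client,
// i.e. cases where the broker definitely did not receive the record.
interface SynchronousTimeout {}

// Stand-in for org.apache.kafka.common.errors.TimeoutException.
class TimeoutException extends RuntimeException {
    TimeoutException(String msg) { super(msg); }
}

// Hypothetical: metadata-wait and buffer-exhaustion timeouts would carry the marker.
class MetadataWaitTimeoutException extends TimeoutException implements SynchronousTimeout {
    MetadataWaitTimeoutException(String msg) { super(msg); }
}

public class TimeoutClassification {
    // Caller-side handling: the marker, not the message text, decides the action.
    static String classify(Throwable t) {
        if (t instanceof SynchronousTimeout) {
            return "synchronous: broker did not receive the record, safe to resend";
        }
        return "asynchronous: broker may have received the record";
    }

    public static void main(String[] args) {
        System.out.println(classify(new MetadataWaitTimeoutException("metadata wait timed out")));
        System.out.println(classify(new TimeoutException("timed out waiting for acks")));
    }
}
```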
> KafkaProducer.send does not throw TimeoutException as documented
> ----------------------------------------------------------------
>
>                 Key: KAFKA-16651
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16651
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 3.6.2
>            Reporter: Mike Pedersen
>            Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)