[ 
https://issues.apache.org/jira/browse/KAFKA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041251#comment-17041251
 ] 

Guozhang Wang commented on KAFKA-8726:
--------------------------------------

OUT_OF_ORDER_SEQUENCE_NUMBER is a fatal error, and it seems that in 
[~mbarbon]'s case it was caused by `MESSAGE_TOO_LARGE`. With a transactional 
producer, a record that fails with `MESSAGE_TOO_LARGE` cannot be retried, and 
hence the transaction it sits in has to be aborted.

But I think this `error-state` is not a fatal one: note that inside the 
Producer we have both a fatal error-state and an abortable error-state, and 
normally `OUT_OF_ORDER_SEQUENCE_NUMBER` would only cause us to transition to 
the abortable error-state, in which case the producer does not need to be 
closed; we can still call `abortTxn` and then start a new transaction with 
that producer instead of closing it and creating a new one.
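
To make the distinction concrete, here is a rough toy sketch of the two 
error-states. It is not the actual producer internals: the class 
`ToyTransactionalProducer`, its `State` enum, and the `failSend` helper are 
hypothetical names made up for illustration, and whether a failure is fatal is 
simply passed in by the caller.

```java
/** Toy model of the fatal vs. abortable error-states; NOT Kafka's real implementation. */
class ToyTransactionalProducer {
    // Hypothetical state names, chosen for this sketch only.
    enum State { READY, IN_TRANSACTION, ABORTABLE_ERROR, FATAL_ERROR }

    private State state = State.READY;

    State state() { return state; }

    void beginTransaction() {
        if (state != State.READY)
            throw new IllegalStateException("cannot begin a transaction in state " + state);
        state = State.IN_TRANSACTION;
    }

    /** Simulates a failed send; the caller decides whether the failure is fatal. */
    void failSend(boolean fatal) {
        if (state != State.IN_TRANSACTION)
            throw new IllegalStateException("no open transaction in state " + state);
        state = fatal ? State.FATAL_ERROR : State.ABORTABLE_ERROR;
    }

    void commitTransaction() {
        // Committing is only allowed when no error has been recorded.
        if (state != State.IN_TRANSACTION)
            throw new IllegalStateException("cannot commit in state " + state);
        state = State.READY;
    }

    void abortTransaction() {
        // In the fatal error-state the only option left is to close the producer.
        if (state == State.FATAL_ERROR)
            throw new IllegalStateException("fatal error-state: close the producer");
        if (state == State.READY)
            throw new IllegalStateException("no transaction to abort");
        // From IN_TRANSACTION or ABORTABLE_ERROR the same producer becomes usable again.
        state = State.READY;
    }

    public static void main(String[] args) {
        ToyTransactionalProducer producer = new ToyTransactionalProducer();
        producer.beginTransaction();
        producer.failSend(false);      // abortable failure
        producer.abortTransaction();   // allowed: producer is reusable
        producer.beginTransaction();   // new transaction on the same producer
        producer.commitTransaction();
        System.out.println("state after recovery: " + producer.state());
    }
}
```

In this sketch an abortable failure leaves the producer reusable after 
`abortTransaction()`, while a fatal failure makes `abortTransaction()` itself 
throw, which is the behavior the reporter ran into.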

> Producer can't abort a transaction after some send errors
> ---------------------------------------------------------
>
>                 Key: KAFKA-8726
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8726
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer 
>    Affects Versions: 2.3.0
>            Reporter: Mattia Barbon
>            Priority: Major
>
> I am following the producer with transactions example in 
> [https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html],
>  and on KafkaException, I use abortTransaction and retry.
>  
> In some cases, abortTransaction fails, with:
> ```
> org.apache.kafka.common.KafkaException: Cannot execute transactional method 
> because we are in an error state
> ```
> As far as I can tell, this is caused by
> ```
> org.apache.kafka.common.KafkaException: The client hasn't received 
> acknowledgment for some previously sent messages and can no longer retry 
> them. It isn't safe to continue.
> ```
>  
> Since both are KafkaException, the example seems to imply they are retriable, 
> but they seem not to be. Ideally, I would expect abortTransaction to succeed 
> in this case (the broker will abort the transaction anyway because it can't 
> be committed), but at the very least, I would expect to have a way to 
> determine that the producer is unusable and it can't recover.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
