[ https://issues.apache.org/jira/browse/KAFKA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheng Tan updated KAFKA-9800:
-----------------------------
    Description:
Design:

The main idea is to keep track of failed attempts. Currently, the retry backoff has two main usage patterns:
 # Synchronous retries in a blocking loop. The thread will sleep in each iteration for
 # Async retries. In each poll, retries that have not yet met their backoff are filtered out. The data class often maintains a 1:1 mapping to a set of logically associated requests (i.e. it contains only one initial request and its retries).

For type 1, we can use a local failure counter of a generic Java data type.

For case 2, we can make the classes containing retriable data inherit from an abstract class {{Retriable}}. {{Retriable}} will implement interfaces that record the number of failed attempts. I already wrapped the exponential backoff/timeout util class in my KIP-601 [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28], which takes the number of failures and returns the backoff/timeout value at the corresponding level.

Currently, the retry backoff has two main usage patterns.
 # For async retries, the data often stays in a queue. We will make the class inherit from {{Retriable}} and record a failure when a {{RetriableException}} happens.
 # For synchronous retries, the backoff is often implemented in a blocking poll/loop. We won't need the inheritance and will just record the failed attempts using a local variable of a generic data type (Long).

Producer side:
 # Produce request (API_KEY.PRODUCE). Currently, the backoff applies to each ProducerBatch in Accumulator, which already has an attribute {{attempts}} recording the number of failed attempts.
 # Transaction request (API_KEY..*TXN). TxnRequestHandler will inherit from {{Retriable}} and record each failed attempt, which will .

  was:
Design:

The main idea is to keep track of the failed attempts.
Classes containing retriable data will inherit from an abstract class {{Retriable}}. This class will record the number of failed attempts. I already wrapped the exponential backoff/timeout util class in my KIP-601 [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28] which takes the and returns the backoff/timeout value at the corresponding level.

There are two main usage patterns.
 # For async retries, the data often stays in a queue. We will make the class inherit from {{Retriable}} and record a failure when a {{RetriableException}} happens.
 # For synchronous retries, the backoff is often implemented in a blocking poll/loop. We won't need the inheritance and will just record the failed attempts using a local variable of a generic data type (Long).

Producer side:
 # Produce request (API_KEY.PRODUCE). Currently, the backoff applies to each ProducerBatch, which already has an attribute {{attempts}} recording the number of failed attempts.


> [KIP-580] Client Exponential Backoff Implementation
> ---------------------------------------------------
>
>                 Key: KAFKA-9800
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9800
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Cheng Tan
>            Assignee: Cheng Tan
>            Priority: Major
>              Labels: KIP-580
>
> Design:
> The main idea is to keep track of failed attempts. Currently, the retry backoff
> has two main usage patterns:
>  # Synchronous retries in a blocking loop. The thread will sleep in each
> iteration for
>  # Async retries. In each poll, retries that have not yet met their backoff
> are filtered out. The data class often maintains a 1:1 mapping to a set of
> logically associated requests (i.e. it contains only one initial request and
> its retries).
> For type 1, we can use a local failure counter of a generic Java data type.
> For case 2, we can make the classes containing retriable data inherit from
> an abstract class {{Retriable}}. {{Retriable}} will implement interfaces that
> record the number of failed attempts. I already wrapped the exponential
> backoff/timeout util class in my KIP-601
> [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28],
> which takes the number of failures and returns the backoff/timeout value at
> the corresponding level.
> Currently, the retry backoff has two main usage patterns.
>  # For async retries, the data often stays in a queue. We will make the
> class inherit from {{Retriable}} and record a failure when a
> {{RetriableException}} happens.
>  # For synchronous retries, the backoff is often implemented in a blocking
> poll/loop. We won't need the inheritance and will just record the failed
> attempts using a local variable of a generic data type (Long).
> Producer side:
>  # Produce request (API_KEY.PRODUCE). Currently, the backoff applies to each
> ProducerBatch in Accumulator, which already has an attribute {{attempts}}
> recording the number of failed attempts.
>  # Transaction request (API_KEY..*TXN). TxnRequestHandler will inherit from
> {{Retriable}} and record each failed attempt, which will .



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
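
For illustration only, the two usage patterns in the description could be sketched roughly as below. This is not the code from the linked pull request: the class names ({{ExponentialBackoffSketch}}, {{RetriableSketch}}, {{QueuedRequestSketch}}, {{SyncRetrySketch}}), the constructor parameters, and the choice that the first failure yields the initial backoff are all assumptions made for this example.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: maps a failure count to a backoff in milliseconds. */
final class ExponentialBackoffSketch {
    private final long initialMs;
    private final int multiplier;
    private final long maxMs;
    private final double jitter; // e.g. 0.2 means +/-20%; 0 disables jitter

    ExponentialBackoffSketch(long initialMs, int multiplier, long maxMs, double jitter) {
        this.initialMs = initialMs;
        this.multiplier = multiplier;
        this.maxMs = maxMs;
        this.jitter = jitter;
    }

    /** Assumption: the first failure (failedAttempts == 1) yields the initial backoff. */
    long backoff(long failedAttempts) {
        long exponent = Math.max(0, failedAttempts - 1);
        // Math.pow saturates to Infinity for huge exponents; Math.min then applies the cap.
        double capped = Math.min(initialMs * Math.pow(multiplier, exponent), (double) maxMs);
        // With jitter enabled, the result may exceed maxMs by at most the jitter fraction.
        double factor = jitter <= 0.0
                ? 1.0
                : ThreadLocalRandom.current().nextDouble(1.0 - jitter, 1.0 + jitter);
        return (long) (capped * factor);
    }
}

/** Case 2 (async): queued data inherits the failure bookkeeping. */
abstract class RetriableSketch {
    private long failedAttempts = 0;
    private long lastFailureMs = 0;

    /** Called when a retriable error is observed for this item. */
    void recordFailure(long nowMs) {
        failedAttempts++;
        lastFailureMs = nowMs;
    }

    long failedAttempts() {
        return failedAttempts;
    }

    /** A poll would skip items for which this returns false. */
    boolean readyToRetry(ExponentialBackoffSketch backoff, long nowMs) {
        return failedAttempts == 0
                || nowMs - lastFailureMs >= backoff.backoff(failedAttempts);
    }
}

/** Hypothetical queued item; a real class would also carry the request data. */
final class QueuedRequestSketch extends RetriableSketch { }

/** Case 1 (sync): the counter is just a local variable in the blocking loop. */
final class SyncRetrySketch {
    static long totalWaitMs(ExponentialBackoffSketch backoff, int failuresBeforeSuccess) {
        long failedAttempts = 0;
        long waitedMs = 0;
        while (failedAttempts < failuresBeforeSuccess) {
            failedAttempts++;
            waitedMs += backoff.backoff(failedAttempts); // a real loop would sleep here
        }
        return waitedMs;
    }
}
```

With an initial backoff of 100 ms, a multiplier of 2, a 1000 ms cap, and jitter disabled, the sketch yields 100, 200, 400, 800, and then 1000 ms for successive failures.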