d8tltanc commented on pull request #8421:
URL: https://github.com/apache/kafka/pull/8421#issuecomment-637300554


   @skaundinya15 @ijuma @abbccdda Thanks for all the feedback and comments. 
This patch was made when I was new to Kafka. It's kind of naive to me at this 
time as I gained more insights into Kafka. Let me talk about two of my major 
concerns and thoughts about implementing the universal client exponential 
backoff.
   
   **AdminClient logic redundant**
   
   The main request flow difference btw AdminClient and normal clients (e.g. 
Producer and Consumer) would be that AdminClient wants to have a per request 
timeout while normal clients is okay with a static default timeout. Thus, 
AdminClient rewrote a quite amount of NetworkClient's functionality.
   
   For example,
   
   1. Wrapping the request builder into a new class `Call`, (the construction 
lambda adds tons of lines into the AdminClient.java, which should probably have 
been living in each AbstractRequest implementation classes files)
   2. Re-writing the request queues for different request status, while normal 
clients are fully using the NetworkClient.
   
   These logics will become redundant after we support exponential backoff in 
NetworkClient for all types of clients. Are we considering refactoring the 
AdminClient further and remove all the redundant logic which should have 
belonged to the networking layer and the AbstractRequest implementation classes?
   
   **Flexible backoff modes**
   
   Let's analyze the request backoff demands of all the types of clients at 
this point. In my opinion, there are simply two:
   
   1. Requests do not need exponential backoff. These requests need to be sent 
ASAP to avoid dataflow performance degradation, such as the `ProduceRequest` 
and its related/preceding metadata requests.
   
   2. Request do need exponential backoff. These requests are “second-class 
citizens” and can be throttled to avoid request storms on the broker side. Such 
as metadata related requests in AdminClient.
   
   Now the question comes. Even when two requests are of the same request type, 
one may have to get sent ASAP while the other one may wait, depending on the 
use case. We need to think deeper about how to make a classification.
   
   But the implementation would be simple. We can utilize the existing builder 
pattern AbstractRequest and build the request flexibly upon a given 
retry_backoff mode. For example, 
   
   1. AbstractRequest.Builder will interact with a new abstract class 
specifying the retry_backoff option, static or exponential. 
   2. AbstractRequest will have some new interfaces controlling the backoff. 
   
   Then, we can control if the request should have a static backoff or an 
exponential backoff when we construct each implementation instance of 
AbstractRequest.Builder. 
   
   
   I'll include more details in the Jira ticket and rewrite this PR. Before we 
talk more about the code details and start the new implementation, please let 
me know what you think about the AdminClient refactor and static/exponential 
retry_backoff classification rule. 
   
   As @abbccdda suggests, let's re-direct our further discussion to 
[Jira](https://issues.apache.org/jira/browse/KAFKA-9800) Thanks.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to