vvcephei commented on pull request #8677:
URL: https://github.com/apache/kafka/pull/8677#issuecomment-635707062


   Hey @abbccdda, I've recently been investigating these timeouts as part of 
https://github.com/apache/kafka/pull/8738, and we're also planning to implement 
KIP-572 as a general solution to all timeouts that can happen in Streams.
   
   Given the complexities that came to light in the discussion above, and all 
the edge cases that can happen, I'm wondering if we should really try to be 
this smart in the assignor.
   
   What do you think about just leaving the current behavior as-is, and then in 
the future, changing it to throw the TimeoutException out of assign() so that 
the KIP-572 logic can catch it and gracefully retry from the outer loop? The 
downside of that approach is that all the instances would be blocked for the 
whole poll interval, and then they would have to repeat their attempt to join 
the group.
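
   To make the idea concrete, here is a rough sketch of the control flow I have in mind. It is not the real Streams code: `Assignor`, `assignWithRetry`, and `maxAttempts` are hypothetical names, and I'm using `java.util.concurrent.TimeoutException` as a stand-in for Kafka's own exception just to keep the snippet self-contained. It also glosses over the rejoin that each retry would actually involve.

```java
import java.util.concurrent.TimeoutException;

public class AssignRetrySketch {

    /** Hypothetical stand-in for the assignor; not the real Streams interface. */
    interface Assignor {
        String assign() throws TimeoutException;
    }

    /**
     * Hypothetical outer loop: assign() no longer tries to cope with the
     * timeout itself, it just throws, and the caller retries the whole
     * attempt up to maxAttempts times before giving up.
     */
    static String assignWithRetry(Assignor assignor, int maxAttempts) throws TimeoutException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Either a complete assignment comes back, or nothing does.
                return assignor.assign();
            } catch (TimeoutException e) {
                // Remember the failure and retry from the top.
                last = e;
            }
        }
        // Every attempt timed out: surface the last failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws TimeoutException {
        int[] calls = {0};
        // Simulated assignor that times out twice and then succeeds.
        Assignor flaky = () -> {
            if (++calls[0] < 3) {
                throw new TimeoutException("simulated timeout in assign()");
            }
            return "full-assignment";
        };
        System.out.println(assignWithRetry(flaky, 5) + " after " + calls[0] + " attempts");
    }
}
```

   The point of the sketch is that the assignor never returns a partial assignment: a timeout either gets retried as a whole or ends up as a visible failure.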
   
   I'm just concerned that, judging from the discussion above, we aren't very 
sure that any specific choice of tasks is going to be the right one. If we leave 
some tasks out of the assignment, it's going to be harder to debug than if we 
just let the thread crash (for now) or recover holistically (after KIP-572).
   
   WDYT?

