Hello Seung-chan, Thanks for the KIP writeup and summary! I made a pass on it and want to share some of my thoughts:
On the very high level, we want to be able to effectively differentiate several cases as follows: 1) There's a network partition / soft failure hence clients cannot reach the broker, temporarily: here we want to give some time to see if the clients can reconnect back, and hence the timeout makes sense. 2) The client is being bounced, i.e. a shutdown followed by a restart: here we do not want to trigger any rebalance, but today we can only hope that the timeout is long enough to cover that bounce window. 3) The client is shutdown and won't be back: here we want to trigger the rebalance immediately, but today we'd have to wait for the timeout value. Today we use the timeout to try to tackle all three cases, but ideally we want the client to submit extra information to help distinguish them. I.e. we just use timeout for case 1) only, while we use separate mechanisms to differentiate 2) and 3) from it. Personally I think we could consider having an augmented leave-group (or maybe in the long run, we can merge that RPC as part of heartbeat) with a flag indicating 2) or 3), while just relying on the timeout for case 1). But to consider a narrower scope for this KIP that does not touch on protocol changes, I think just differentiate 2/3) by not sending leave-group for 2) v.s. sending leave-group for 3) is sufficient. As for the KIP itself, I have a few minor comments: 1. Regarding the API change, I feel just doing that on the streams side is not enough since by the end of the day we still need the consumer to incorporate it (today it's via a static config and hence we cannot just dynamically change the config). 2. Regarding the API itself, I want to see a more concrete proposal that contains the full signature, e.g. does".closeAndLeaveGroup()" include the timeout param as well, etc? My very subjective preference is to not differentiate by the function name in case we may want to augment the close function in the future, which would explode the function names :P Instead maybe we can just overload the `close()` function again but with a control object, that includes 1) timeout, 2) leave-group flag, and hence can also extend to include more variables just in case. Would like to hear others' thoughts as well. Guozhang On Wed, Jan 12, 2022 at 5:51 AM Seung-chan Ahn <dev.issea1...@gmail.com> wrote: > Hi team, > > Here's the new KIP > < > https://cwiki.apache.org/confluence/display/KAFKA/KIP-812%3A+Introduce+another+form+of+the+%60KafkaStreams.close%28%29%60+API+that+forces+the+member+to+leave+the+consumer+group > > > for this issue <https://issues.apache.org/jira/browse/KAFKA-13217>. > > The title says pretty much what this KIP is for. Even though it's my first > draft, as A. Sophie Blee-Goldman has written rich descriptions and already > the solutions in the issue thread, I've enjoyed following up on the idea. > > Please feel free to review on any point! > -- -- Guozhang