Hey Jiangjie, Thanks for the awesome summary, I think that really captures it. I agree, the question is exactly whether there would be a motivational use case we can brainstorm for those additional use cases?
-Jay On Thu, Mar 12, 2015 at 4:24 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote: > Hi folks, > > Since there are actually a long interleaved discussions in KAFKA-1660 and > KAFKA-1659, I summarized them in the rejected alternatives sections - the > section exists for a reason and I shouldn’t have always put None there. . > . It looks we took a while to agree upon the current interface. > > We basically has the following requirements: > > *Assuming request timeout is available* > > 1. Need to close a producer forcefully if a message send failed. > 2. Need to close a producer forcefully even if no message send failed. > 3. Need to close a producer elegantly but with a bounded waiting time. > 4. Need to close a producer forcefully after an arbitrary waiting time > other than request timeout. > > The options we have now are (): > A. close(timeout) - address all four. > B. Close() + abort.on.send.fail - address 1) and 3) > C. abort() + close() - address 1) and 2) and 3) > > So I think we can narrow the discussion to be whether 2) and 4) are valid > use case or not. As soon as we reach conclusion on what requirements need > to be met, we would have a conclusion to the solution. > I think 2) might be a valid because of some application level logic. > I am not sure if 4) is a valid requirement, but form KAFKA-1660, my > understanding is it is. > > Any idea? > > Jiangjie (Becket) Qin > > > On 3/12/15, 11:50 AM, "Jiangjie Qin" <j...@linkedin.com> wrote: > > >Hey Guozhang, > > > >Thanks for the comments. I updated the page as suggested. > >For 3), that’s right, I put this in java doc. Do you think we need to > >reject value other than -1? > >For 4), I think user will notice this easily because the thread will block > >and producer is not going to shutdown. About using close(-1) quietly in > >sender thread when close() is called, my concern is that user might not be > >aware of that. Maybe we can put an error messages if user call close() in > >sender thread? > > > >Thanks, > > > >Jiangjie (Becket) Qin > > > >On 3/12/15, 5:13 AM, "Guozhang Wang" <wangg...@gmail.com> wrote: > > > >>Hi Becket, > >> > >>Some comments on the wiki page: > >> > >>1. There are a few typos, for example "multivations", "wiithin". > >> > >>2. I think the main motivation could just be "current close() needs to > >>block on flushing all buffered data, however there are scenarios in which > >>producers would like to close without blocking on flushing data, or even > >>close immediately and make sure any buffered data are dropped instead of > >>sending out." You can probably give some examples for such scenarios. > >> > >>3. close(-1, TimeUnit.MILLISECONDS) => from the implementation it seems > >>any > >>negative value has the same semantics. > >> > >>4. In sender thread only close(-1, TimeUnit.MILLISECONDS) should be > >>called > >>=> this is not programmatically enforced. Shall we just try to enforce it > >>via, for example, checking if caller thread is the IO thread, and if yes > >>just use close(-1)? > >> > >>5. Proposed Changes => it seems you only talked about the close(-1) case, > >>how about close(>=0)? > >> > >>Guozhang > >> > >>On Thu, Mar 12, 2015 at 1:49 AM, Jiangjie Qin <j...@linkedin.com.invalid > > > >>wrote: > >> > >>> Hey Joe & Jay, > >>> > >>> Thanks for the comments on the voting thread. Since it seems we > >>>probably > >>> will have more discussion on this, I am just replying from the > >>>discussion > >>> thread here. > >>> I’ve updated the KIP page to make it less like half-baked, apologize > >>>for > >>> the rush... > >>> > >>> The contract in current KIP is: > >>> 1. close() - wait until all requests either are sent or reach request > >>> timeout. > >>> 2. close(-1, TimeUnit.MILLISECONDS) - close immediately > >>> 3. close(0, TimeUnit.MILLISECONDS) - equivalent to close(), i.e. Wait > >>> until all requests are sent or reach request timeout > >>> 4. close(5, TimeUnit.MILLISECONDS) - try the best to finish sending > >>>in 5 > >>> milliseconds, if something went wrong, just shutdown the producer > >>>anyway, > >>> my callback will handle the failures. > >>> > >>> About how we define what timeout value stands for, I actually struggled > >>>a > >>> little bit when wrote the patch. Intuitively, close(0) should mean > >>> immediately, however it seems that all the existing java class have > >>>this > >>> convention of timeout=0 means no timeout or never timeout > >>>(Thread.join(0), > >>> Object.wait(0), etc.) So here the dilemma is either we follow the > >>> intuition or we follow the convention. What I chose is to follow the > >>> convention but document the interface to let user be aware of the > >>>usage. > >>> The reason is that I think producer.close() is a public interface so it > >>> might be better to follow java convention. Whereas selector is not a > >>> public interface that used by user, so as long as it makes sense to us, > >>>it > >>> is less a problem to be different from java convention. That said since > >>> consumer.poll(timeout) is also a public interface, I think it also > >>>makes > >>> sense to make producer.close() to have the same definition of > >>> consumer.poll(timeout). > >>> > >>> The main argument for keeping a timeout in close would be separating > >>>the > >>> close timeout from request timeout, which probably makes sense. I would > >>> guess typically the request timeout would be long (e.g. 60 seconds) > >>> because we might want to consider retries with back off time. If we > >>>have > >>> multiple batches in accumulator, in worst case that could take up to > >>> several minutes to complete all the requests. But when we close a > >>> producer, we might not want to wait for that long as it might cause > >>>some > >>> other problem like deployment tool timeout. > >>> > >>> There is also a subtle difference between close(timeout) and > >>> flush(timeout). The only purpose for flush() is to write data to the > >>> broker, so it makes perfect sense to wait until request timeout. I > >>>think > >>> that is why flush(timeout) looks strange. On the other hand, the top > >>> priority for close() is to close the producer rather than flush() data, > >>>so > >>> close(timeout) gives guarantee on bounded waiting for its main job. > >>> > >>> Sorry for the confusion about forceClose flag. It is not a public > >>> interface. I mentioned it in Proposed Changes section which I thought > >>>was > >>> supposed to provide implementation details. > >>> > >>> Thanks again for all the comments and suggestions! > >>> > >>> Jiangjie (Becket) Qin > >>> > >>> On 3/10/15, 8:57 PM, "Jiangjie Qin" <j...@linkedin.com> wrote: > >>> > >>> >The KIP page has been updated per Jay¹s comments. > >>> >I¹d like to initiate the voting process if no further comments are > >>> >received by tomorrow. > >>> > > >>> >Jiangjie (Becket) Qin > >>> > > >>> >On 3/8/15, 9:45 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote: > >>> > > >>> >>Hey Jiangjie, > >>> >> > >>> >>Can you capture the full motivation and use cases for the feature? > >>>This > >>> >>mentions your interest in having a way of aborting from inside the > >>> >>Callback. But it doesn't really explain that usage or why other > >>>people > >>> >>would want to do that. It also doesn't list the primary use case for > >>> >>having > >>> >>close with a bounded timeout which was to avoid blocking too long on > >>> >>shutdown. > >>> >> > >>> >>-Jay > >>> >> > >>> >> > >>> >> > >>> >>On Sat, Mar 7, 2015 at 12:25 PM, Jiangjie Qin > >>><j...@linkedin.com.invalid > >>> > > >>> >>wrote: > >>> >> > >>> >>> Hi, > >>> >>> > >>> >>> I just created a KIP for adding a close(timeout) to new producer. > >>>Most > >>> >>>of > >>> >>> the previous discussions are in KAFKA-1660 where Parth Brahmbhatt > >>>has > >>> >>> already done a lot of work. > >>> >>> Since this is an interface change so we are going through the KIP > >>> >>>process. > >>> >>> Here is the KIP link: > >>> >>> > >>> >>> > >>> > >>> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=5373978 > >>> >>>2 > >>> >>> > >>> >>> Thanks. > >>> >>> > >>> >>> Jiangjie (Becket) Qin > >>> >>> > >>> > > >>> > >>> > >> > >> > >>-- > >>-- Guozhang > > > >