Hi Jay, I have modified the KIP as you suggested. I thinks as long as we have consistent define for timeout across Kafka interface, there would be no problem. And I also agree it is better if we can make producer block when close() is called from sender thread so user will notice something went wrong.
Thanks. Jiangjie (Becket) Qin On 3/14/15, 11:37 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote: >Hey Jiangjie, > >I think this is going to be very confusing that > close(0) waits indefinitely and > close(-1) waits for 0. >I understand this appears in other apis, but it is a constant cause of >bugs. Let's not repeat that mistake. > >Let's make close(0) wait for 0. We don't need a way to wait indefinitely >as >we already have close() so having a magical constant for that is >redundant. > >Calling close() from the I/O thread was already possible and would block >indefinitely. I think trying to silently change the behavior is probably >not right. I.e. if the user calls close() in the callback there is >actually >some misunderstanding and they need to think more, silently making this >not >block will hide the problem from them which is the opposite of what we >want. > >-Jay > >On Thu, Mar 12, 2015 at 1:49 AM, Jiangjie Qin <j...@linkedin.com.invalid> >wrote: > >> Hey Joe & Jay, >> >> Thanks for the comments on the voting thread. Since it seems we probably >> will have more discussion on this, I am just replying from the >>discussion >> thread here. >> I’ve updated the KIP page to make it less like half-baked, apologize for >> the rush... >> >> The contract in current KIP is: >> 1. close() - wait until all requests either are sent or reach request >> timeout. >> 2. close(-1, TimeUnit.MILLISECONDS) - close immediately >> 3. close(0, TimeUnit.MILLISECONDS) - equivalent to close(), i.e. Wait >> until all requests are sent or reach request timeout >> 4. close(5, TimeUnit.MILLISECONDS) - try the best to finish sending >>in 5 >> milliseconds, if something went wrong, just shutdown the producer >>anyway, >> my callback will handle the failures. >> >> About how we define what timeout value stands for, I actually struggled >>a >> little bit when wrote the patch. Intuitively, close(0) should mean >> immediately, however it seems that all the existing java class have this >> convention of timeout=0 means no timeout or never timeout >>(Thread.join(0), >> Object.wait(0), etc.) So here the dilemma is either we follow the >> intuition or we follow the convention. What I chose is to follow the >> convention but document the interface to let user be aware of the usage. >> The reason is that I think producer.close() is a public interface so it >> might be better to follow java convention. Whereas selector is not a >> public interface that used by user, so as long as it makes sense to us, >>it >> is less a problem to be different from java convention. That said since >> consumer.poll(timeout) is also a public interface, I think it also makes >> sense to make producer.close() to have the same definition of >> consumer.poll(timeout). >> >> The main argument for keeping a timeout in close would be separating the >> close timeout from request timeout, which probably makes sense. I would >> guess typically the request timeout would be long (e.g. 60 seconds) >> because we might want to consider retries with back off time. If we have >> multiple batches in accumulator, in worst case that could take up to >> several minutes to complete all the requests. But when we close a >> producer, we might not want to wait for that long as it might cause some >> other problem like deployment tool timeout. >> >> There is also a subtle difference between close(timeout) and >> flush(timeout). The only purpose for flush() is to write data to the >> broker, so it makes perfect sense to wait until request timeout. I think >> that is why flush(timeout) looks strange. On the other hand, the top >> priority for close() is to close the producer rather than flush() data, >>so >> close(timeout) gives guarantee on bounded waiting for its main job. >> >> Sorry for the confusion about forceClose flag. It is not a public >> interface. I mentioned it in Proposed Changes section which I thought >>was >> supposed to provide implementation details. >> >> Thanks again for all the comments and suggestions! >> >> Jiangjie (Becket) Qin >> >> On 3/10/15, 8:57 PM, "Jiangjie Qin" <j...@linkedin.com> wrote: >> >> >The KIP page has been updated per Jay¹s comments. >> >I¹d like to initiate the voting process if no further comments are >> >received by tomorrow. >> > >> >Jiangjie (Becket) Qin >> > >> >On 3/8/15, 9:45 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote: >> > >> >>Hey Jiangjie, >> >> >> >>Can you capture the full motivation and use cases for the feature? >>This >> >>mentions your interest in having a way of aborting from inside the >> >>Callback. But it doesn't really explain that usage or why other people >> >>would want to do that. It also doesn't list the primary use case for >> >>having >> >>close with a bounded timeout which was to avoid blocking too long on >> >>shutdown. >> >> >> >>-Jay >> >> >> >> >> >> >> >>On Sat, Mar 7, 2015 at 12:25 PM, Jiangjie Qin >><j...@linkedin.com.invalid >> > >> >>wrote: >> >> >> >>> Hi, >> >>> >> >>> I just created a KIP for adding a close(timeout) to new producer. >>Most >> >>>of >> >>> the previous discussions are in KAFKA-1660 where Parth Brahmbhatt >>has >> >>> already done a lot of work. >> >>> Since this is an interface change so we are going through the KIP >> >>>process. >> >>> Here is the KIP link: >> >>> >> >>> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=5373978 >> >>>2 >> >>> >> >>> Thanks. >> >>> >> >>> Jiangjie (Becket) Qin >> >>> >> > >> >>