Hi Neha, As Joel said, (1) and (2) provides almost exact same function as far as I can see. I actually have no strong preference. The only difference is that (2) provides the close() semantic meaning of making sure all the resources have gone away, at the cost of adding another abort() interface. The simplicity of (1) is also attractive but at a slight cost of clarity. I am slightly leaning towards (2) but would also be OK if we pursue (1).
Thanks. Jiangjie (Becket) Qin On 3/19/15, 2:46 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote: >(1) should work, but as Jun suggested earlier in the thread it is >slightly misleading. The (intuitive) post-condition of "close" is that >the producer has shutdown - i.e., its sender thread, closed its >metrics, serializer/deserializer, etc. That is not necessarily a >post-condition of "close(0)" although one can contend that if you call >the method in non-blocking mode (zero timeout) then it is reasonable >to not expect that post-condition. > >So I think that although (2) adds one more API it brings "simplicity >by virtue of overall clarity". > >I would be in favor of (2) but not strongly opposed to (1). > >Thanks, > >Joel > >On Thu, Mar 19, 2015 at 10:05:04AM -0700, Neha Narkhede wrote: >> I'm in favor of (1) for the sake of simplicity and as Jay mentions to >> reduce the number of different APIs. Can you explain when (1) does not >>work? >> >> On Wed, Mar 18, 2015 at 2:52 PM, Jay Kreps <jay.kr...@gmail.com> wrote: >> >> > Personally I'm in favor of (1) just to reduce the number of different >>APIs. >> > People will find the difference between abort and close subtle and >> > confusing and the only instance where you want it is this somewhat >>unusual >> > case you guys are pursuing, right? >> > >> > -Jay >> > >> > On Wed, Mar 18, 2015 at 2:13 PM, Jiangjie Qin >><j...@linkedin.com.invalid> >> > wrote: >> > >> > > It looks we have another option and are now deciding between the >> > following >> > > two interfaces: >> > > >> > > 1. Close() + close(timeout) >> > > - timeout could be either positive or zero. >> > > - only close(0) can be called from sender thread >> > > >> > > 2. Close() + abort() + close(timeout) >> > > - timeout can either be positive or zero >> > > - only abort() can be called from sender thread >> > > >> > > - abort() is equivalent to close(0) in 1) but does not join sender >> > > thread and does not close metrics. >> > > - Another thread has to call close() or close(timeout) in order >>to make >> > > sure the resources in producer are gone. >> > > >> > > The tow approach provides the same function we need, the difference >>is >> > > approach 2) follows convention of close() and abort(). On the other >>hand, >> > > approach 1) saves one interface compared with approach 2) but does >>not >> > > follow the convention. >> > > >> > > When the two approaches come to user code, it is probably something >>like >> > > this: >> > > >> > > Try { >> > > While(!finished) >> > > Producer.send(record, callback) >> > > } catch (Exception e) { >> > > Producer.close(5) >> > > } >> > > >> > > Class CallbackImpl implements Callback { >> > > onCompletion(RecordMetadata metadata Exception e) { >> > > If (e != null) >> > > Abort() / close() >> > > } >> > > } >> > > >> > > Because the two approach leads to almost the same user code, >>assuming >> > > users are always calling producer.close() as a clean up step, >>personally >> > I >> > > prefer approach 2) as it follows convention. >> > > >> > > Any thoughts? >> > > >> > > Jiangjie (Becket) Qin >> > > >> > > >> > > On 3/17/15, 10:25 AM, "Jiangjie Qin" <j...@linkedin.com> wrote: >> > > >> > > >Hi Jun, >> > > > >> > > >Yes, as Guozhang said, the main reason we set a flag is because >>close(0) >> > > >is expected to be called by sender thread itself. >> > > >If we want to maintain the semantic meaning of close(), one >>alternative >> > is >> > > >to have an abort() method does the same thing as close(0) except >> > cleanup. >> > > >And in close(timeout), after timeout we call abort() and join the >>sender >> > > >thread. This was one of the previous proposal. We merged abort to >> > close(0) >> > > >because they are almost doing the same thing. But from what you >> > mentioned, >> > > >it might make sense to have two separate methods. >> > > > >> > > >Thanks. >> > > > >> > > >Jiangjie (Becket) Qin >> > > > >> > > >On 3/16/15, 10:31 PM, "Guozhang Wang" <wangg...@gmail.com> wrote: >> > > > >> > > >>Yeah in this sense the sender thread will not exist immediately >>in the >> > > >>close(0) call, but will only terminate after the current response >>batch >> > > >>has >> > > >>been processed, as will the producer instance itself. >> > > >> >> > > >>There is a reason for this though: for a clean shutdown the caller >> > thread >> > > >>has to wait for the sender thread to join before closing the >>producer >> > > >>instance, but this cannot be achieve if close(0) is called by the >> > sender >> > > >>thread itself (for example in KAFKA-1659, there is a proposal from >> > Andrew >> > > >>Stein on using thread.interrupt and thread.stop, but if it is >>called by >> > > >>the >> > > >>ioThread itself the stop call will fail). Hence we came up with >>the >> > flag >> > > >>approach to let the sender thread to close as soon as it is at the >> > > >>barrier >> > > >>of the run loop. >> > > >> >> > > >>Guozhang >> > > >> >> > > >>On Mon, Mar 16, 2015 at 9:41 PM, Jun Rao <j...@confluent.io> wrote: >> > > >> >> > > >>> Hmm, does that mean that after close(0), the sender thread is >>not >> > > >>>necessary >> > > >>> gone? Normally, after closing an entity, we expect all internal >> > threads >> > > >>> associated with the entity are shut down completely. >> > > >>> >> > > >>> Thanks, >> > > >>> >> > > >>> Jun >> > > >>> >> > > >>> On Mon, Mar 16, 2015 at 3:18 PM, Jiangjie Qin >> > > >>><j...@linkedin.com.invalid> >> > > >>> wrote: >> > > >>> >> > > >>> > Hi Jun, >> > > >>> > >> > > >>> > Close(0) will set two flags in sender. Running=false and a >>newly >> > > >>>added >> > > >>> > forceClose=true. It will also set accumulator.closed=true so >>no >> > > >>>further >> > > >>> > producer.send() will succeed. >> > > >>> > The sender thread will finish executing all the callbacks in >> > current >> > > >>> batch >> > > >>> > of responses, then it will see the forceClose flag. It will >>just >> > fail >> > > >>>all >> > > >>> > the incomplete batches in the producer and exit. >> > > >>> > So close(0) is a non-blocking call and sender thread will not >>try >> > to >> > > >>>join >> > > >>> > itself in close(0). >> > > >>> > >> > > >>> > Thanks. >> > > >>> > >> > > >>> > Jiangjie (Becket) Qin >> > > >>> > >> > > >>> > On 3/16/15, 2:50 PM, "Jun Rao" <j...@confluent.io> wrote: >> > > >>> > >> > > >>> > >How does close(0) work if it's called from the sender >>thread? If >> > > >>> close(0) >> > > >>> > >needs to wait for the sender thread to join, wouldn't this >>cause a >> > > >>> > >deadlock? >> > > >>> > > >> > > >>> > >Thanks, >> > > >>> > > >> > > >>> > >Jun >> > > >>> > > >> > > >>> > >On Mon, Mar 16, 2015 at 2:26 PM, Jiangjie Qin >> > > >>><j...@linkedin.com.invalid >> > > >>> > >> > > >>> > >wrote: >> > > >>> > > >> > > >>> > >> Thanks Guozhang. It wouldn’t be as thoroughly considered >>without >> > > >>> > >> discussing with you :) >> > > >>> > >> >> > > >>> > >> Jiangjie (Becket) Qin >> > > >>> > >> >> > > >>> > >> On 3/16/15, 1:07 PM, "Guozhang Wang" <wangg...@gmail.com> >> > wrote: >> > > >>> > >> >> > > >>> > >> >Thanks Jiangjie, >> > > >>> > >> > >> > > >>> > >> >After talking to you offline on this, I have been >>convinced and >> > > >>> > >>changed my >> > > >>> > >> >preference to blocking. The immediate shutdown approach >>does >> > have >> > > >>> some >> > > >>> > >> >unsafeness in some cases. >> > > >>> > >> > >> > > >>> > >> >Guozhang >> > > >>> > >> > >> > > >>> > >> >On Mon, Mar 16, 2015 at 11:50 AM, Jiangjie Qin >> > > >>> > >><j...@linkedin.com.invalid >> > > >>> > >> > >> > > >>> > >> >wrote: >> > > >>> > >> > >> > > >>> > >> >> It looks that the problem we want to solve and the >>purpose we >> > > >>>want >> > > >>> to >> > > >>> > >> >> achieve is: >> > > >>> > >> >> If user uses close() in callback, we want to let user be >> > aware >> > > >>>that >> > > >>> > >>they >> > > >>> > >> >> should use close(0) instead of close() in the callback. >> > > >>> > >> >> >> > > >>> > >> >> We have agreed that we will have an error log to inform >>user >> > > >>>about >> > > >>> > >>this >> > > >>> > >> >> mis-usage. The options differ in the way how we can >>force >> > user >> > > >>>to >> > > >>> > >>take a >> > > >>> > >> >> look at that error log. >> > > >>> > >> >> There are two scenarios: >> > > >>> > >> >> 1. User does not expect the program to exit. >> > > >>> > >> >> 2. User expect the program to exit. >> > > >>> > >> >> >> > > >>> > >> >> For scenario 1), blocking will probably delay the >>discovery >> > of >> > > >>>the >> > > >>> > >> >> problem. Calling close(0) exposes the problem quicker. >>In >> > this >> > > >>> > >>scenario >> > > >>> > >> >> producer just encounter a send failure when running >>normally. >> > > >>> > >> >> For scenario 2), blocking will expose the problem quick. >> > > >>>Calling >> > > >>> > >> >>close(-1) >> > > >>> > >> >> might hide the problem. This scenario might include: a) >>Unit >> > > >>>test >> > > >>> > >>for a >> > > >>> > >> >> send failure. b) Message sending during a close() call >>from a >> > > >>>user >> > > >>> > >> >>thread. >> > > >>> > >> >> >> > > >>> > >> >> So as a summary table: >> > > >>> > >> >> >> > > >>> > >> >> Scenario 1) >> > Scenario >> > > >>>2) >> > > >>> > >> >> >> > > >>> > >> >> Blocking Delay problem discovery Guaranteed >> > > >>>problem >> > > >>> > >> >>discovery >> > > >>> > >> >> >> > > >>> > >> >> Close(-1) Immediate problem discovery Problem >>might >> > be >> > > >>> hidden >> > > >>> > >> >> >> > > >>> > >> >> >> > > >>> > >> >> Personally I prefer blocking because it seems providing >>more >> > > >>> > >>guarantees >> > > >>> > >> >> and safer. >> > > >>> > >> >> >> > > >>> > >> >> Thanks. >> > > >>> > >> >> >> > > >>> > >> >> Jiangjie (Becket) Qin >> > > >>> > >> >> >> > > >>> > >> >> >> > > >>> > >> >> On 3/16/15, 10:11 AM, "Guozhang Wang" >><wangg...@gmail.com> >> > > >>>wrote: >> > > >>> > >> >> >> > > >>> > >> >> >HI Jiangjie, >> > > >>> > >> >> > >> > > >>> > >> >> >As far as I understand calling close() in the ioThread >>is >> > not >> > > >>> > >>common, >> > > >>> > >> >>as >> > > >>> > >> >> >it >> > > >>> > >> >> >may only trigger when we saw some non-retriable error. >>Hence >> > > >>>when >> > > >>> > >>user >> > > >>> > >> >>run >> > > >>> > >> >> >their program it is unlikely that close() will be >>triggered >> > > >>>and >> > > >>> > >>problem >> > > >>> > >> >> >will be detected. So it seems to me that from the error >> > > >>>detection >> > > >>> > >> >>aspect >> > > >>> > >> >> >these two options seems to be the same as people will >> > usually >> > > >>> > >>detect it >> > > >>> > >> >> >from the producer metrics all dropping to 0. >> > > >>> > >> >> > >> > > >>> > >> >> >Guozhang >> > > >>> > >> >> > >> > > >>> > >> >> >On Mon, Mar 16, 2015 at 9:52 AM, Jiangjie Qin >> > > >>> > >> >><j...@linkedin.com.invalid> >> > > >>> > >> >> >wrote: >> > > >>> > >> >> > >> > > >>> > >> >> >> It seems there are two options we can choose from >>when >> > > >>>close() >> > > >>> is >> > > >>> > >> >>called >> > > >>> > >> >> >> from sender thread (callback): >> > > >>> > >> >> >> 1. Log an error and close the producer using >>close(-1) >> > > >>> > >> >> >> 2. Log an error and block. >> > > >>> > >> >> >> (Throwing an exception will not work because we >>catch all >> > > >>>the >> > > >>> > >> >>exception >> > > >>> > >> >> >> thrown from user callback. It will just lead to an >>error >> > > >>>log.) >> > > >>> > >> >> >> >> > > >>> > >> >> >> My concern for the first option is that the producer >>will >> > be >> > > >>> > >>closed >> > > >>> > >> >>even >> > > >>> > >> >> >> if we logged and error. I am wondering if some user >>would >> > > >>>not >> > > >>> even >> > > >>> > >> >>take >> > > >>> > >> >> >>a >> > > >>> > >> >> >> look at the log if producer is closed normally. >>Because >> > from >> > > >>>the >> > > >>> > >> >> >>programs >> > > >>> > >> >> >> behavior, everything looks good. If that is the >>case, the >> > > >>>error >> > > >>> > >> >>message >> > > >>> > >> >> >>we >> > > >>> > >> >> >> logged probably will just be ignored until some day >>when >> > > >>>people >> > > >>> > >>check >> > > >>> > >> >> >>the >> > > >>> > >> >> >> log and see it. >> > > >>> > >> >> >> >> > > >>> > >> >> >> As for the second option, because producer does not >>close >> > > >>>but >> > > >>> > >>blocks. >> > > >>> > >> >> >>User >> > > >>> > >> >> >> will notice this the first time they run the >>program. They >> > > >>> > >>probably >> > > >>> > >> >>will >> > > >>> > >> >> >> look at the log to see why producer could not be >>closed >> > and >> > > >>>they >> > > >>> > >>will >> > > >>> > >> >> >>see >> > > >>> > >> >> >> the error log we put there. So they will get informed >> > about >> > > >>>this >> > > >>> > >> >> >>mis-usage >> > > >>> > >> >> >> of close() in sender thread the first time they run >>the >> > code >> > > >>> > >>instead >> > > >>> > >> >>of >> > > >>> > >> >> >> some time later. >> > > >>> > >> >> >> >> > > >>> > >> >> >> Personally I prefer the second one because it is more >> > > >>>obvious >> > > >>> that >> > > >>> > >> >> >> something was wrong. >> > > >>> > >> >> >> >> > > >>> > >> >> >> Jiangjie (Becket) Qin >> > > >>> > >> >> >> >> > > >>> > >> >> >> On 3/15/15, 4:27 PM, "Guozhang Wang" >><wangg...@gmail.com> >> > > >>> wrote: >> > > >>> > >> >> >> >> > > >>> > >> >> >> >Yeah I agree we should not silently change the >>behavior >> > of >> > > >>>the >> > > >>> > >> >>function >> > > >>> > >> >> >> >with the given parameters; and I would prefer >> > > >>> > >> >> >>error-logging-and-shutdown >> > > >>> > >> >> >> >over blocking when close(>0) is used, since as Neha >> > > >>>suggested >> > > >>> > >> >>blocking >> > > >>> > >> >> >> >would also not proceed with sending any data, bu >>will >> > just >> > > >>>let >> > > >>> > >> >>users to >> > > >>> > >> >> >> >realize the issue later than sooner. >> > > >>> > >> >> >> > >> > > >>> > >> >> >> >On Sun, Mar 15, 2015 at 3:25 PM, Neha Narkhede >> > > >>> > >><n...@confluent.io> >> > > >>> > >> >> >>wrote: >> > > >>> > >> >> >> > >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > And I also agree it is better if we can make >>producer >> > > >>>block >> > > >>> > >>when >> > > >>> > >> >> >> >> > close() is called from sender thread so user >>will >> > > >>>notice >> > > >>> > >> >>something >> > > >>> > >> >> >> >>went >> > > >>> > >> >> >> >> > wrong. >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> This isn't a great experience either. Why can't >>we just >> > > >>>throw >> > > >>> > >>an >> > > >>> > >> >> >> >>exception >> > > >>> > >> >> >> >> for a behavior we know is incorrect and we'd like >>the >> > > >>>user to >> > > >>> > >> >>know. >> > > >>> > >> >> >> >> Blocking as a means of doing that seems wrong and >> > > >>>annoying. >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> On Sun, Mar 15, 2015 at 11:56 AM, Jay Kreps >> > > >>> > >><jay.kr...@gmail.com> >> > > >>> > >> >> >> wrote: >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> > Cool. >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > I think blocking is good or alternately >>throwing an >> > > >>> exception >> > > >>> > >> >> >>directly >> > > >>> > >> >> >> >> from >> > > >>> > >> >> >> >> > close(). Basically I would just worry about >>subtly >> > > >>>doing >> > > >>> > >> >>something >> > > >>> > >> >> >> >> slightly >> > > >>> > >> >> >> >> > different from what the user asked for as it >>will be >> > > >>>hard >> > > >>> to >> > > >>> > >> >>notice >> > > >>> > >> >> >> >>that >> > > >>> > >> >> >> >> > behavior difference. >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > -Jay >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > On Sat, Mar 14, 2015 at 5:48 PM, Jiangjie Qin >> > > >>> > >> >> >> >><j...@linkedin.com.invalid >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > wrote: >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> > > Hi Jay, >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > I have modified the KIP as you suggested. I >>thinks >> > as >> > > >>> long >> > > >>> > >>as >> > > >>> > >> >>we >> > > >>> > >> >> >> >>have >> > > >>> > >> >> >> >> > > consistent define for timeout across Kafka >> > interface, >> > > >>> there >> > > >>> > >> >>would >> > > >>> > >> >> >> >>be no >> > > >>> > >> >> >> >> > > problem. And I also agree it is better if we >>can >> > make >> > > >>> > >>producer >> > > >>> > >> >> >>block >> > > >>> > >> >> >> >> when >> > > >>> > >> >> >> >> > > close() is called from sender thread so user >>will >> > > >>>notice >> > > >>> > >> >> >>something >> > > >>> > >> >> >> >>went >> > > >>> > >> >> >> >> > > wrong. >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > Thanks. >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > Jiangjie (Becket) Qin >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > On 3/14/15, 11:37 AM, "Jay Kreps" >> > > >>><jay.kr...@gmail.com> >> > > >>> > >> wrote: >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > >Hey Jiangjie, >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >I think this is going to be very confusing >>that >> > > >>> > >> >> >> >> > > > close(0) waits indefinitely and >> > > >>> > >> >> >> >> > > > close(-1) waits for 0. >> > > >>> > >> >> >> >> > > >I understand this appears in other apis, but >>it >> > is a >> > > >>> > >>constant >> > > >>> > >> >> >> >>cause of >> > > >>> > >> >> >> >> > > >bugs. Let's not repeat that mistake. >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >Let's make close(0) wait for 0. We don't >>need a >> > way >> > > >>>to >> > > >>> > >>wait >> > > >>> > >> >> >> >> indefinitely >> > > >>> > >> >> >> >> > > >as >> > > >>> > >> >> >> >> > > >we already have close() so having a magical >> > constant >> > > >>>for >> > > >>> > >> >>that is >> > > >>> > >> >> >> >> > > >redundant. >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >Calling close() from the I/O thread was >>already >> > > >>>possible >> > > >>> > >>and >> > > >>> > >> >> >>would >> > > >>> > >> >> >> >> block >> > > >>> > >> >> >> >> > > >indefinitely. I think trying to silently >>change >> > the >> > > >>> > >>behavior >> > > >>> > >> >>is >> > > >>> > >> >> >> >> probably >> > > >>> > >> >> >> >> > > >not right. I.e. if the user calls close() in >>the >> > > >>> callback >> > > >>> > >> >>there >> > > >>> > >> >> >>is >> > > >>> > >> >> >> >> > > >actually >> > > >>> > >> >> >> >> > > >some misunderstanding and they need to think >>more, >> > > >>> > >>silently >> > > >>> > >> >> >>making >> > > >>> > >> >> >> >> this >> > > >>> > >> >> >> >> > > >not >> > > >>> > >> >> >> >> > > >block will hide the problem from them which >>is the >> > > >>> > >>opposite >> > > >>> > >> >>of >> > > >>> > >> >> >> >>what we >> > > >>> > >> >> >> >> > > >want. >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >-Jay >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >On Thu, Mar 12, 2015 at 1:49 AM, Jiangjie Qin >> > > >>> > >> >> >> >> <j...@linkedin.com.invalid >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > >wrote: >> > > >>> > >> >> >> >> > > > >> > > >>> > >> >> >> >> > > >> Hey Joe & Jay, >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> Thanks for the comments on the voting >>thread. >> > > >>>Since it >> > > >>> > >> >>seems >> > > >>> > >> >> >>we >> > > >>> > >> >> >> >> > probably >> > > >>> > >> >> >> >> > > >> will have more discussion on this, I am >>just >> > > >>>replying >> > > >>> > >>from >> > > >>> > >> >>the >> > > >>> > >> >> >> >> > > >>discussion >> > > >>> > >> >> >> >> > > >> thread here. >> > > >>> > >> >> >> >> > > >> I’ve updated the KIP page to make it less >>like >> > > >>> > >>half-baked, >> > > >>> > >> >> >> >>apologize >> > > >>> > >> >> >> >> > for >> > > >>> > >> >> >> >> > > >> the rush... >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> The contract in current KIP is: >> > > >>> > >> >> >> >> > > >> 1. close() - wait until all requests >>either >> > are >> > > >>>sent >> > > >>> > >>or >> > > >>> > >> >> >>reach >> > > >>> > >> >> >> >> > request >> > > >>> > >> >> >> >> > > >> timeout. >> > > >>> > >> >> >> >> > > >> 2. close(-1, TimeUnit.MILLISECONDS) - >>close >> > > >>> > >>immediately >> > > >>> > >> >> >> >> > > >> 3. close(0, TimeUnit.MILLISECONDS) - >> > equivalent >> > > >>>to >> > > >>> > >> >>close(), >> > > >>> > >> >> >> >>i.e. >> > > >>> > >> >> >> >> > Wait >> > > >>> > >> >> >> >> > > >> until all requests are sent or reach >>request >> > > >>>timeout >> > > >>> > >> >> >> >> > > >> 4. close(5, TimeUnit.MILLISECONDS) - try >>the >> > > >>>best to >> > > >>> > >> >>finish >> > > >>> > >> >> >> >> sending >> > > >>> > >> >> >> >> > > >>in 5 >> > > >>> > >> >> >> >> > > >> milliseconds, if something went wrong, just >> > > >>>shutdown >> > > >>> the >> > > >>> > >> >> >>producer >> > > >>> > >> >> >> >> > > >>anyway, >> > > >>> > >> >> >> >> > > >> my callback will handle the failures. >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> About how we define what timeout value >>stands >> > for, >> > > >>>I >> > > >>> > >> >>actually >> > > >>> > >> >> >> >> > struggled >> > > >>> > >> >> >> >> > > >>a >> > > >>> > >> >> >> >> > > >> little bit when wrote the patch. >>Intuitively, >> > > >>>close(0) >> > > >>> > >> >>should >> > > >>> > >> >> >> >>mean >> > > >>> > >> >> >> >> > > >> immediately, however it seems that all the >> > > >>>existing >> > > >>> java >> > > >>> > >> >>class >> > > >>> > >> >> >> >>have >> > > >>> > >> >> >> >> > this >> > > >>> > >> >> >> >> > > >> convention of timeout=0 means no timeout or >> > never >> > > >>> > >>timeout >> > > >>> > >> >> >> >> > > >>(Thread.join(0), >> > > >>> > >> >> >> >> > > >> Object.wait(0), etc.) So here the dilemma >>is >> > > >>>either we >> > > >>> > >> >>follow >> > > >>> > >> >> >>the >> > > >>> > >> >> >> >> > > >> intuition or we follow the convention. >>What I >> > > >>>chose is >> > > >>> > >>to >> > > >>> > >> >> >>follow >> > > >>> > >> >> >> >>the >> > > >>> > >> >> >> >> > > >> convention but document the interface to >>let >> > user >> > > >>>be >> > > >>> > >>aware >> > > >>> > >> >>of >> > > >>> > >> >> >>the >> > > >>> > >> >> >> >> > usage. >> > > >>> > >> >> >> >> > > >> The reason is that I think >>producer.close() is a >> > > >>> public >> > > >>> > >> >> >> >>interface so >> > > >>> > >> >> >> >> > it >> > > >>> > >> >> >> >> > > >> might be better to follow java convention. >> > Whereas >> > > >>> > >> >>selector is >> > > >>> > >> >> >> >>not a >> > > >>> > >> >> >> >> > > >> public interface that used by user, so as >>long >> > as >> > > >>>it >> > > >>> > >>makes >> > > >>> > >> >> >>sense >> > > >>> > >> >> >> >>to >> > > >>> > >> >> >> >> > us, >> > > >>> > >> >> >> >> > > >>it >> > > >>> > >> >> >> >> > > >> is less a problem to be different from java >> > > >>> convention. >> > > >>> > >> >>That >> > > >>> > >> >> >>said >> > > >>> > >> >> >> >> > since >> > > >>> > >> >> >> >> > > >> consumer.poll(timeout) is also a public >> > interface, >> > > >>>I >> > > >>> > >>think >> > > >>> > >> >>it >> > > >>> > >> >> >> >>also >> > > >>> > >> >> >> >> > makes >> > > >>> > >> >> >> >> > > >> sense to make producer.close() to have the >>same >> > > >>> > >>definition >> > > >>> > >> >>of >> > > >>> > >> >> >> >> > > >> consumer.poll(timeout). >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> The main argument for keeping a timeout in >>close >> > > >>>would >> > > >>> > >>be >> > > >>> > >> >> >> >>separating >> > > >>> > >> >> >> >> > the >> > > >>> > >> >> >> >> > > >> close timeout from request timeout, which >> > probably >> > > >>> makes >> > > >>> > >> >> >>sense. I >> > > >>> > >> >> >> >> > would >> > > >>> > >> >> >> >> > > >> guess typically the request timeout would >>be >> > long >> > > >>> (e.g. >> > > >>> > >>60 >> > > >>> > >> >> >> >>seconds) >> > > >>> > >> >> >> >> > > >> because we might want to consider retries >>with >> > > >>>back >> > > >>> off >> > > >>> > >> >>time. >> > > >>> > >> >> >>If >> > > >>> > >> >> >> >>we >> > > >>> > >> >> >> >> > have >> > > >>> > >> >> >> >> > > >> multiple batches in accumulator, in worst >>case >> > > >>>that >> > > >>> > >>could >> > > >>> > >> >>take >> > > >>> > >> >> >> >>up to >> > > >>> > >> >> >> >> > > >> several minutes to complete all the >>requests. >> > But >> > > >>>when >> > > >>> > >>we >> > > >>> > >> >> >>close a >> > > >>> > >> >> >> >> > > >> producer, we might not want to wait for >>that >> > long >> > > >>>as >> > > >>> it >> > > >>> > >> >>might >> > > >>> > >> >> >> >>cause >> > > >>> > >> >> >> >> > some >> > > >>> > >> >> >> >> > > >> other problem like deployment tool timeout. >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> There is also a subtle difference between >> > > >>> close(timeout) >> > > >>> > >> >>and >> > > >>> > >> >> >> >> > > >> flush(timeout). The only purpose for >>flush() is >> > to >> > > >>> write >> > > >>> > >> >>data >> > > >>> > >> >> >>to >> > > >>> > >> >> >> >>the >> > > >>> > >> >> >> >> > > >> broker, so it makes perfect sense to wait >>until >> > > >>> request >> > > >>> > >> >> >>timeout. >> > > >>> > >> >> >> >>I >> > > >>> > >> >> >> >> > think >> > > >>> > >> >> >> >> > > >> that is why flush(timeout) looks strange. >>On the >> > > >>>other >> > > >>> > >> >>hand, >> > > >>> > >> >> >>the >> > > >>> > >> >> >> >>top >> > > >>> > >> >> >> >> > > >> priority for close() is to close the >>producer >> > > >>>rather >> > > >>> > >>than >> > > >>> > >> >> >>flush() >> > > >>> > >> >> >> >> > data, >> > > >>> > >> >> >> >> > > >>so >> > > >>> > >> >> >> >> > > >> close(timeout) gives guarantee on bounded >> > waiting >> > > >>>for >> > > >>> > >>its >> > > >>> > >> >>main >> > > >>> > >> >> >> >>job. >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> Sorry for the confusion about forceClose >>flag. >> > It >> > > >>>is >> > > >>> > >>not a >> > > >>> > >> >> >>public >> > > >>> > >> >> >> >> > > >> interface. I mentioned it in Proposed >>Changes >> > > >>>section >> > > >>> > >> >>which I >> > > >>> > >> >> >> >> thought >> > > >>> > >> >> >> >> > > >>was >> > > >>> > >> >> >> >> > > >> supposed to provide implementation details. >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> Thanks again for all the comments and >> > suggestions! >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> Jiangjie (Becket) Qin >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> On 3/10/15, 8:57 PM, "Jiangjie Qin" < >> > > >>> j...@linkedin.com> >> > > >>> > >> >> wrote: >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> >The KIP page has been updated per Jay¹s >> > comments. >> > > >>> > >> >> >> >> > > >> >I¹d like to initiate the voting process >>if no >> > > >>>further >> > > >>> > >> >> >>comments >> > > >>> > >> >> >> >>are >> > > >>> > >> >> >> >> > > >> >received by tomorrow. >> > > >>> > >> >> >> >> > > >> > >> > > >>> > >> >> >> >> > > >> >Jiangjie (Becket) Qin >> > > >>> > >> >> >> >> > > >> > >> > > >>> > >> >> >> >> > > >> >On 3/8/15, 9:45 AM, "Jay Kreps" >> > > >>><jay.kr...@gmail.com >> > > >>> > >> > > >>> > >> >>wrote: >> > > >>> > >> >> >> >> > > >> > >> > > >>> > >> >> >> >> > > >> >>Hey Jiangjie, >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >>Can you capture the full motivation and >>use >> > > >>>cases >> > > >>> for >> > > >>> > >>the >> > > >>> > >> >> >> >>feature? >> > > >>> > >> >> >> >> > > >>This >> > > >>> > >> >> >> >> > > >> >>mentions your interest in having a way of >> > > >>>aborting >> > > >>> > >>from >> > > >>> > >> >> >>inside >> > > >>> > >> >> >> >>the >> > > >>> > >> >> >> >> > > >> >>Callback. But it doesn't really explain >>that >> > > >>>usage >> > > >>> or >> > > >>> > >>why >> > > >>> > >> >> >>other >> > > >>> > >> >> >> >> > people >> > > >>> > >> >> >> >> > > >> >>would want to do that. It also doesn't >>list >> > the >> > > >>> > >>primary >> > > >>> > >> >>use >> > > >>> > >> >> >> >>case >> > > >>> > >> >> >> >> for >> > > >>> > >> >> >> >> > > >> >>having >> > > >>> > >> >> >> >> > > >> >>close with a bounded timeout which was to >> > avoid >> > > >>> > >>blocking >> > > >>> > >> >>too >> > > >>> > >> >> >> >>long >> > > >>> > >> >> >> >> on >> > > >>> > >> >> >> >> > > >> >>shutdown. >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >>-Jay >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >>On Sat, Mar 7, 2015 at 12:25 PM, >>Jiangjie Qin >> > > >>> > >> >> >> >> > > >><j...@linkedin.com.invalid >> > > >>> > >> >> >> >> > > >> > >> > > >>> > >> >> >> >> > > >> >>wrote: >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >>> Hi, >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> >>> I just created a KIP for adding a >> > > >>>close(timeout) >> > > >>> to >> > > >>> > >>new >> > > >>> > >> >> >> >> producer. >> > > >>> > >> >> >> >> > > >>Most >> > > >>> > >> >> >> >> > > >> >>>of >> > > >>> > >> >> >> >> > > >> >>> the previous discussions are in >>KAFKA-1660 >> > > >>>where >> > > >>> > >>Parth >> > > >>> > >> >> >> >> Brahmbhatt >> > > >>> > >> >> >> >> > > >>has >> > > >>> > >> >> >> >> > > >> >>> already done a lot of work. >> > > >>> > >> >> >> >> > > >> >>> Since this is an interface change so >>we are >> > > >>>going >> > > >>> > >> >>through >> > > >>> > >> >> >>the >> > > >>> > >> >> >> >> KIP >> > > >>> > >> >> >> >> > > >> >>>process. >> > > >>> > >> >> >> >> > > >> >>> Here is the KIP link: >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> >> >> > > >>> > >> >> > > >>> > >> > > >>> >> > > >>https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=5373978 >> > > >>> > >> >> >> >> > > >> >>>2 >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> >>> Thanks. >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> >>> Jiangjie (Becket) Qin >> > > >>> > >> >> >> >> > > >> >>> >> > > >>> > >> >> >> >> > > >> > >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > > >> > > >>> > >> >> >> >> > >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> -- >> > > >>> > >> >> >> >> Thanks, >> > > >>> > >> >> >> >> Neha >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> > >> > > >>> > >> >> >> > >> > > >>> > >> >> >> > >> > > >>> > >> >> >> >-- >> > > >>> > >> >> >> >-- Guozhang >> > > >>> > >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> >> > >> > > >>> > >> >> > >> > > >>> > >> >> >-- >> > > >>> > >> >> >-- Guozhang >> > > >>> > >> >> >> > > >>> > >> >> >> > > >>> > >> > >> > > >>> > >> > >> > > >>> > >> >-- >> > > >>> > >> >-- Guozhang >> > > >>> > >> >> > > >>> > >> >> > > >>> > >> > > >>> > >> > > >>> >> > > >> >> > > >> >> > > >> >> > > >>-- >> > > >>-- Guozhang >> > > > >> > > >> > > >> > >> >> >> >> -- >> Thanks, >> Neha >