OK, we are going to follow (1) then. It looks the KIP does not need further updates. Should we resume the voting process?
On 3/19/15, 3:28 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote: >Yeah guys, I think this is one where the cure is worse than the disease. >Let's just have the two close methods, that is confusing enough. Adding a >third won't help. I understand that having a few microseconds before the >thread shuts down could be unexpected but I think there is nothing >inherently wrong with that. > >-Jay > >On Thu, Mar 19, 2015 at 3:09 PM, Jiangjie Qin <j...@linkedin.com.invalid> >wrote: > >> Hi Neha, >> >> As Joel said, (1) and (2) provides almost exact same function as far as >>I >> can see. I actually have no strong preference. >> The only difference is that (2) provides the close() semantic meaning of >> making sure all the resources have gone away, at the cost of adding >> another abort() interface. >> The simplicity of (1) is also attractive but at a slight cost of >>clarity. >> I am slightly leaning towards (2) but would also be OK if we pursue (1). >> >> Thanks. >> >> Jiangjie (Becket) Qin >> >> >> On 3/19/15, 2:46 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote: >> >> >(1) should work, but as Jun suggested earlier in the thread it is >> >slightly misleading. The (intuitive) post-condition of "close" is that >> >the producer has shutdown - i.e., its sender thread, closed its >> >metrics, serializer/deserializer, etc. That is not necessarily a >> >post-condition of "close(0)" although one can contend that if you call >> >the method in non-blocking mode (zero timeout) then it is reasonable >> >to not expect that post-condition. >> > >> >So I think that although (2) adds one more API it brings "simplicity >> >by virtue of overall clarity". >> > >> >I would be in favor of (2) but not strongly opposed to (1). >> > >> >Thanks, >> > >> >Joel >> > >> >On Thu, Mar 19, 2015 at 10:05:04AM -0700, Neha Narkhede wrote: >> >> I'm in favor of (1) for the sake of simplicity and as Jay mentions to >> >> reduce the number of different APIs. Can you explain when (1) does >>not >> >>work? >> >> >> >> On Wed, Mar 18, 2015 at 2:52 PM, Jay Kreps <jay.kr...@gmail.com> >>wrote: >> >> >> >> > Personally I'm in favor of (1) just to reduce the number of >>different >> >>APIs. >> >> > People will find the difference between abort and close subtle and >> >> > confusing and the only instance where you want it is this somewhat >> >>unusual >> >> > case you guys are pursuing, right? >> >> > >> >> > -Jay >> >> > >> >> > On Wed, Mar 18, 2015 at 2:13 PM, Jiangjie Qin >> >><j...@linkedin.com.invalid> >> >> > wrote: >> >> > >> >> > > It looks we have another option and are now deciding between the >> >> > following >> >> > > two interfaces: >> >> > > >> >> > > 1. Close() + close(timeout) >> >> > > - timeout could be either positive or zero. >> >> > > - only close(0) can be called from sender thread >> >> > > >> >> > > 2. Close() + abort() + close(timeout) >> >> > > - timeout can either be positive or zero >> >> > > - only abort() can be called from sender thread >> >> > > >> >> > > - abort() is equivalent to close(0) in 1) but does not join >>sender >> >> > > thread and does not close metrics. >> >> > > - Another thread has to call close() or close(timeout) in order >> >>to make >> >> > > sure the resources in producer are gone. >> >> > > >> >> > > The tow approach provides the same function we need, the >>difference >> >>is >> >> > > approach 2) follows convention of close() and abort(). On the >>other >> >>hand, >> >> > > approach 1) saves one interface compared with approach 2) but >>does >> >>not >> >> > > follow the convention. >> >> > > >> >> > > When the two approaches come to user code, it is probably >>something >> >>like >> >> > > this: >> >> > > >> >> > > Try { >> >> > > While(!finished) >> >> > > Producer.send(record, callback) >> >> > > } catch (Exception e) { >> >> > > Producer.close(5) >> >> > > } >> >> > > >> >> > > Class CallbackImpl implements Callback { >> >> > > onCompletion(RecordMetadata metadata Exception e) { >> >> > > If (e != null) >> >> > > Abort() / close() >> >> > > } >> >> > > } >> >> > > >> >> > > Because the two approach leads to almost the same user code, >> >>assuming >> >> > > users are always calling producer.close() as a clean up step, >> >>personally >> >> > I >> >> > > prefer approach 2) as it follows convention. >> >> > > >> >> > > Any thoughts? >> >> > > >> >> > > Jiangjie (Becket) Qin >> >> > > >> >> > > >> >> > > On 3/17/15, 10:25 AM, "Jiangjie Qin" <j...@linkedin.com> wrote: >> >> > > >> >> > > >Hi Jun, >> >> > > > >> >> > > >Yes, as Guozhang said, the main reason we set a flag is because >> >>close(0) >> >> > > >is expected to be called by sender thread itself. >> >> > > >If we want to maintain the semantic meaning of close(), one >> >>alternative >> >> > is >> >> > > >to have an abort() method does the same thing as close(0) except >> >> > cleanup. >> >> > > >And in close(timeout), after timeout we call abort() and join >>the >> >>sender >> >> > > >thread. This was one of the previous proposal. We merged abort >>to >> >> > close(0) >> >> > > >because they are almost doing the same thing. But from what you >> >> > mentioned, >> >> > > >it might make sense to have two separate methods. >> >> > > > >> >> > > >Thanks. >> >> > > > >> >> > > >Jiangjie (Becket) Qin >> >> > > > >> >> > > >On 3/16/15, 10:31 PM, "Guozhang Wang" <wangg...@gmail.com> >>wrote: >> >> > > > >> >> > > >>Yeah in this sense the sender thread will not exist immediately >> >>in the >> >> > > >>close(0) call, but will only terminate after the current >>response >> >>batch >> >> > > >>has >> >> > > >>been processed, as will the producer instance itself. >> >> > > >> >> >> > > >>There is a reason for this though: for a clean shutdown the >>caller >> >> > thread >> >> > > >>has to wait for the sender thread to join before closing the >> >>producer >> >> > > >>instance, but this cannot be achieve if close(0) is called by >>the >> >> > sender >> >> > > >>thread itself (for example in KAFKA-1659, there is a proposal >>from >> >> > Andrew >> >> > > >>Stein on using thread.interrupt and thread.stop, but if it is >> >>called by >> >> > > >>the >> >> > > >>ioThread itself the stop call will fail). Hence we came up with >> >>the >> >> > flag >> >> > > >>approach to let the sender thread to close as soon as it is at >>the >> >> > > >>barrier >> >> > > >>of the run loop. >> >> > > >> >> >> > > >>Guozhang >> >> > > >> >> >> > > >>On Mon, Mar 16, 2015 at 9:41 PM, Jun Rao <j...@confluent.io> >> wrote: >> >> > > >> >> >> > > >>> Hmm, does that mean that after close(0), the sender thread is >> >>not >> >> > > >>>necessary >> >> > > >>> gone? Normally, after closing an entity, we expect all >>internal >> >> > threads >> >> > > >>> associated with the entity are shut down completely. >> >> > > >>> >> >> > > >>> Thanks, >> >> > > >>> >> >> > > >>> Jun >> >> > > >>> >> >> > > >>> On Mon, Mar 16, 2015 at 3:18 PM, Jiangjie Qin >> >> > > >>><j...@linkedin.com.invalid> >> >> > > >>> wrote: >> >> > > >>> >> >> > > >>> > Hi Jun, >> >> > > >>> > >> >> > > >>> > Close(0) will set two flags in sender. Running=false and a >> >>newly >> >> > > >>>added >> >> > > >>> > forceClose=true. It will also set accumulator.closed=true >>so >> >>no >> >> > > >>>further >> >> > > >>> > producer.send() will succeed. >> >> > > >>> > The sender thread will finish executing all the callbacks >>in >> >> > current >> >> > > >>> batch >> >> > > >>> > of responses, then it will see the forceClose flag. It will >> >>just >> >> > fail >> >> > > >>>all >> >> > > >>> > the incomplete batches in the producer and exit. >> >> > > >>> > So close(0) is a non-blocking call and sender thread will >>not >> >>try >> >> > to >> >> > > >>>join >> >> > > >>> > itself in close(0). >> >> > > >>> > >> >> > > >>> > Thanks. >> >> > > >>> > >> >> > > >>> > Jiangjie (Becket) Qin >> >> > > >>> > >> >> > > >>> > On 3/16/15, 2:50 PM, "Jun Rao" <j...@confluent.io> wrote: >> >> > > >>> > >> >> > > >>> > >How does close(0) work if it's called from the sender >> >>thread? If >> >> > > >>> close(0) >> >> > > >>> > >needs to wait for the sender thread to join, wouldn't this >> >>cause a >> >> > > >>> > >deadlock? >> >> > > >>> > > >> >> > > >>> > >Thanks, >> >> > > >>> > > >> >> > > >>> > >Jun >> >> > > >>> > > >> >> > > >>> > >On Mon, Mar 16, 2015 at 2:26 PM, Jiangjie Qin >> >> > > >>><j...@linkedin.com.invalid >> >> > > >>> > >> >> > > >>> > >wrote: >> >> > > >>> > > >> >> > > >>> > >> Thanks Guozhang. It wouldn’t be as thoroughly considered >> >>without >> >> > > >>> > >> discussing with you :) >> >> > > >>> > >> >> >> > > >>> > >> Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> > > >>> > >> On 3/16/15, 1:07 PM, "Guozhang Wang" >><wangg...@gmail.com> >> >> > wrote: >> >> > > >>> > >> >> >> > > >>> > >> >Thanks Jiangjie, >> >> > > >>> > >> > >> >> > > >>> > >> >After talking to you offline on this, I have been >> >>convinced and >> >> > > >>> > >>changed my >> >> > > >>> > >> >preference to blocking. The immediate shutdown approach >> >>does >> >> > have >> >> > > >>> some >> >> > > >>> > >> >unsafeness in some cases. >> >> > > >>> > >> > >> >> > > >>> > >> >Guozhang >> >> > > >>> > >> > >> >> > > >>> > >> >On Mon, Mar 16, 2015 at 11:50 AM, Jiangjie Qin >> >> > > >>> > >><j...@linkedin.com.invalid >> >> > > >>> > >> > >> >> > > >>> > >> >wrote: >> >> > > >>> > >> > >> >> > > >>> > >> >> It looks that the problem we want to solve and the >> >>purpose we >> >> > > >>>want >> >> > > >>> to >> >> > > >>> > >> >> achieve is: >> >> > > >>> > >> >> If user uses close() in callback, we want to let >>user be >> >> > aware >> >> > > >>>that >> >> > > >>> > >>they >> >> > > >>> > >> >> should use close(0) instead of close() in the >>callback. >> >> > > >>> > >> >> >> >> > > >>> > >> >> We have agreed that we will have an error log to >>inform >> >>user >> >> > > >>>about >> >> > > >>> > >>this >> >> > > >>> > >> >> mis-usage. The options differ in the way how we can >> >>force >> >> > user >> >> > > >>>to >> >> > > >>> > >>take a >> >> > > >>> > >> >> look at that error log. >> >> > > >>> > >> >> There are two scenarios: >> >> > > >>> > >> >> 1. User does not expect the program to exit. >> >> > > >>> > >> >> 2. User expect the program to exit. >> >> > > >>> > >> >> >> >> > > >>> > >> >> For scenario 1), blocking will probably delay the >> >>discovery >> >> > of >> >> > > >>>the >> >> > > >>> > >> >> problem. Calling close(0) exposes the problem >>quicker. >> >>In >> >> > this >> >> > > >>> > >>scenario >> >> > > >>> > >> >> producer just encounter a send failure when running >> >>normally. >> >> > > >>> > >> >> For scenario 2), blocking will expose the problem >>quick. >> >> > > >>>Calling >> >> > > >>> > >> >>close(-1) >> >> > > >>> > >> >> might hide the problem. This scenario might include: >>a) >> >>Unit >> >> > > >>>test >> >> > > >>> > >>for a >> >> > > >>> > >> >> send failure. b) Message sending during a close() >>call >> >>from a >> >> > > >>>user >> >> > > >>> > >> >>thread. >> >> > > >>> > >> >> >> >> > > >>> > >> >> So as a summary table: >> >> > > >>> > >> >> >> >> > > >>> > >> >> Scenario 1) >> >> > Scenario >> >> > > >>>2) >> >> > > >>> > >> >> >> >> > > >>> > >> >> Blocking Delay problem discovery >>Guaranteed >> >> > > >>>problem >> >> > > >>> > >> >>discovery >> >> > > >>> > >> >> >> >> > > >>> > >> >> Close(-1) Immediate problem discovery Problem >> >>might >> >> > be >> >> > > >>> hidden >> >> > > >>> > >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> >> Personally I prefer blocking because it seems >>providing >> >>more >> >> > > >>> > >>guarantees >> >> > > >>> > >> >> and safer. >> >> > > >>> > >> >> >> >> > > >>> > >> >> Thanks. >> >> > > >>> > >> >> >> >> > > >>> > >> >> Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> >> On 3/16/15, 10:11 AM, "Guozhang Wang" >> >><wangg...@gmail.com> >> >> > > >>>wrote: >> >> > > >>> > >> >> >> >> > > >>> > >> >> >HI Jiangjie, >> >> > > >>> > >> >> > >> >> > > >>> > >> >> >As far as I understand calling close() in the >>ioThread >> >>is >> >> > not >> >> > > >>> > >>common, >> >> > > >>> > >> >>as >> >> > > >>> > >> >> >it >> >> > > >>> > >> >> >may only trigger when we saw some non-retriable >>error. >> >>Hence >> >> > > >>>when >> >> > > >>> > >>user >> >> > > >>> > >> >>run >> >> > > >>> > >> >> >their program it is unlikely that close() will be >> >>triggered >> >> > > >>>and >> >> > > >>> > >>problem >> >> > > >>> > >> >> >will be detected. So it seems to me that from the >>error >> >> > > >>>detection >> >> > > >>> > >> >>aspect >> >> > > >>> > >> >> >these two options seems to be the same as people >>will >> >> > usually >> >> > > >>> > >>detect it >> >> > > >>> > >> >> >from the producer metrics all dropping to 0. >> >> > > >>> > >> >> > >> >> > > >>> > >> >> >Guozhang >> >> > > >>> > >> >> > >> >> > > >>> > >> >> >On Mon, Mar 16, 2015 at 9:52 AM, Jiangjie Qin >> >> > > >>> > >> >><j...@linkedin.com.invalid> >> >> > > >>> > >> >> >wrote: >> >> > > >>> > >> >> > >> >> > > >>> > >> >> >> It seems there are two options we can choose from >> >>when >> >> > > >>>close() >> >> > > >>> is >> >> > > >>> > >> >>called >> >> > > >>> > >> >> >> from sender thread (callback): >> >> > > >>> > >> >> >> 1. Log an error and close the producer using >> >>close(-1) >> >> > > >>> > >> >> >> 2. Log an error and block. >> >> > > >>> > >> >> >> (Throwing an exception will not work because we >> >>catch all >> >> > > >>>the >> >> > > >>> > >> >>exception >> >> > > >>> > >> >> >> thrown from user callback. It will just lead to an >> >>error >> >> > > >>>log.) >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> My concern for the first option is that the >>producer >> >>will >> >> > be >> >> > > >>> > >>closed >> >> > > >>> > >> >>even >> >> > > >>> > >> >> >> if we logged and error. I am wondering if some >>user >> >>would >> >> > > >>>not >> >> > > >>> even >> >> > > >>> > >> >>take >> >> > > >>> > >> >> >>a >> >> > > >>> > >> >> >> look at the log if producer is closed normally. >> >>Because >> >> > from >> >> > > >>>the >> >> > > >>> > >> >> >>programs >> >> > > >>> > >> >> >> behavior, everything looks good. If that is the >> >>case, the >> >> > > >>>error >> >> > > >>> > >> >>message >> >> > > >>> > >> >> >>we >> >> > > >>> > >> >> >> logged probably will just be ignored until some >>day >> >>when >> >> > > >>>people >> >> > > >>> > >>check >> >> > > >>> > >> >> >>the >> >> > > >>> > >> >> >> log and see it. >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> As for the second option, because producer does >>not >> >>close >> >> > > >>>but >> >> > > >>> > >>blocks. >> >> > > >>> > >> >> >>User >> >> > > >>> > >> >> >> will notice this the first time they run the >> >>program. They >> >> > > >>> > >>probably >> >> > > >>> > >> >>will >> >> > > >>> > >> >> >> look at the log to see why producer could not be >> >>closed >> >> > and >> >> > > >>>they >> >> > > >>> > >>will >> >> > > >>> > >> >> >>see >> >> > > >>> > >> >> >> the error log we put there. So they will get >>informed >> >> > about >> >> > > >>>this >> >> > > >>> > >> >> >>mis-usage >> >> > > >>> > >> >> >> of close() in sender thread the first time they >>run >> >>the >> >> > code >> >> > > >>> > >>instead >> >> > > >>> > >> >>of >> >> > > >>> > >> >> >> some time later. >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> Personally I prefer the second one because it is >>more >> >> > > >>>obvious >> >> > > >>> that >> >> > > >>> > >> >> >> something was wrong. >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> On 3/15/15, 4:27 PM, "Guozhang Wang" >> >><wangg...@gmail.com> >> >> > > >>> wrote: >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >Yeah I agree we should not silently change the >> >>behavior >> >> > of >> >> > > >>>the >> >> > > >>> > >> >>function >> >> > > >>> > >> >> >> >with the given parameters; and I would prefer >> >> > > >>> > >> >> >>error-logging-and-shutdown >> >> > > >>> > >> >> >> >over blocking when close(>0) is used, since as >>Neha >> >> > > >>>suggested >> >> > > >>> > >> >>blocking >> >> > > >>> > >> >> >> >would also not proceed with sending any data, bu >> >>will >> >> > just >> >> > > >>>let >> >> > > >>> > >> >>users to >> >> > > >>> > >> >> >> >realize the issue later than sooner. >> >> > > >>> > >> >> >> > >> >> > > >>> > >> >> >> >On Sun, Mar 15, 2015 at 3:25 PM, Neha Narkhede >> >> > > >>> > >><n...@confluent.io> >> >> > > >>> > >> >> >>wrote: >> >> > > >>> > >> >> >> > >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > And I also agree it is better if we can make >> >>producer >> >> > > >>>block >> >> > > >>> > >>when >> >> > > >>> > >> >> >> >> > close() is called from sender thread so user >> >>will >> >> > > >>>notice >> >> > > >>> > >> >>something >> >> > > >>> > >> >> >> >>went >> >> > > >>> > >> >> >> >> > wrong. >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> This isn't a great experience either. Why can't >> >>we just >> >> > > >>>throw >> >> > > >>> > >>an >> >> > > >>> > >> >> >> >>exception >> >> > > >>> > >> >> >> >> for a behavior we know is incorrect and we'd >>like >> >>the >> >> > > >>>user to >> >> > > >>> > >> >>know. >> >> > > >>> > >> >> >> >> Blocking as a means of doing that seems wrong >>and >> >> > > >>>annoying. >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> On Sun, Mar 15, 2015 at 11:56 AM, Jay Kreps >> >> > > >>> > >><jay.kr...@gmail.com> >> >> > > >>> > >> >> >> wrote: >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> > Cool. >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > I think blocking is good or alternately >> >>throwing an >> >> > > >>> exception >> >> > > >>> > >> >> >>directly >> >> > > >>> > >> >> >> >> from >> >> > > >>> > >> >> >> >> > close(). Basically I would just worry about >> >>subtly >> >> > > >>>doing >> >> > > >>> > >> >>something >> >> > > >>> > >> >> >> >> slightly >> >> > > >>> > >> >> >> >> > different from what the user asked for as it >> >>will be >> >> > > >>>hard >> >> > > >>> to >> >> > > >>> > >> >>notice >> >> > > >>> > >> >> >> >>that >> >> > > >>> > >> >> >> >> > behavior difference. >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > -Jay >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > On Sat, Mar 14, 2015 at 5:48 PM, Jiangjie Qin >> >> > > >>> > >> >> >> >><j...@linkedin.com.invalid >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > wrote: >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> > > Hi Jay, >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > I have modified the KIP as you suggested. I >> >>thinks >> >> > as >> >> > > >>> long >> >> > > >>> > >>as >> >> > > >>> > >> >>we >> >> > > >>> > >> >> >> >>have >> >> > > >>> > >> >> >> >> > > consistent define for timeout across Kafka >> >> > interface, >> >> > > >>> there >> >> > > >>> > >> >>would >> >> > > >>> > >> >> >> >>be no >> >> > > >>> > >> >> >> >> > > problem. And I also agree it is better if >>we >> >>can >> >> > make >> >> > > >>> > >>producer >> >> > > >>> > >> >> >>block >> >> > > >>> > >> >> >> >> when >> >> > > >>> > >> >> >> >> > > close() is called from sender thread so >>user >> >>will >> >> > > >>>notice >> >> > > >>> > >> >> >>something >> >> > > >>> > >> >> >> >>went >> >> > > >>> > >> >> >> >> > > wrong. >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > Thanks. >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > On 3/14/15, 11:37 AM, "Jay Kreps" >> >> > > >>><jay.kr...@gmail.com> >> >> > > >>> > >> wrote: >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >Hey Jiangjie, >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >I think this is going to be very confusing >> >>that >> >> > > >>> > >> >> >> >> > > > close(0) waits indefinitely and >> >> > > >>> > >> >> >> >> > > > close(-1) waits for 0. >> >> > > >>> > >> >> >> >> > > >I understand this appears in other apis, >>but >> >>it >> >> > is a >> >> > > >>> > >>constant >> >> > > >>> > >> >> >> >>cause of >> >> > > >>> > >> >> >> >> > > >bugs. Let's not repeat that mistake. >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >Let's make close(0) wait for 0. We don't >> >>need a >> >> > way >> >> > > >>>to >> >> > > >>> > >>wait >> >> > > >>> > >> >> >> >> indefinitely >> >> > > >>> > >> >> >> >> > > >as >> >> > > >>> > >> >> >> >> > > >we already have close() so having a >>magical >> >> > constant >> >> > > >>>for >> >> > > >>> > >> >>that is >> >> > > >>> > >> >> >> >> > > >redundant. >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >Calling close() from the I/O thread was >> >>already >> >> > > >>>possible >> >> > > >>> > >>and >> >> > > >>> > >> >> >>would >> >> > > >>> > >> >> >> >> block >> >> > > >>> > >> >> >> >> > > >indefinitely. I think trying to silently >> >>change >> >> > the >> >> > > >>> > >>behavior >> >> > > >>> > >> >>is >> >> > > >>> > >> >> >> >> probably >> >> > > >>> > >> >> >> >> > > >not right. I.e. if the user calls close() >>in >> >>the >> >> > > >>> callback >> >> > > >>> > >> >>there >> >> > > >>> > >> >> >>is >> >> > > >>> > >> >> >> >> > > >actually >> >> > > >>> > >> >> >> >> > > >some misunderstanding and they need to >>think >> >>more, >> >> > > >>> > >>silently >> >> > > >>> > >> >> >>making >> >> > > >>> > >> >> >> >> this >> >> > > >>> > >> >> >> >> > > >not >> >> > > >>> > >> >> >> >> > > >block will hide the problem from them >>which >> >>is the >> >> > > >>> > >>opposite >> >> > > >>> > >> >>of >> >> > > >>> > >> >> >> >>what we >> >> > > >>> > >> >> >> >> > > >want. >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >-Jay >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >On Thu, Mar 12, 2015 at 1:49 AM, Jiangjie >>Qin >> >> > > >>> > >> >> >> >> <j...@linkedin.com.invalid >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >wrote: >> >> > > >>> > >> >> >> >> > > > >> >> > > >>> > >> >> >> >> > > >> Hey Joe & Jay, >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> Thanks for the comments on the voting >> >>thread. >> >> > > >>>Since it >> >> > > >>> > >> >>seems >> >> > > >>> > >> >> >>we >> >> > > >>> > >> >> >> >> > probably >> >> > > >>> > >> >> >> >> > > >> will have more discussion on this, I am >> >>just >> >> > > >>>replying >> >> > > >>> > >>from >> >> > > >>> > >> >>the >> >> > > >>> > >> >> >> >> > > >>discussion >> >> > > >>> > >> >> >> >> > > >> thread here. >> >> > > >>> > >> >> >> >> > > >> I’ve updated the KIP page to make it >>less >> >>like >> >> > > >>> > >>half-baked, >> >> > > >>> > >> >> >> >>apologize >> >> > > >>> > >> >> >> >> > for >> >> > > >>> > >> >> >> >> > > >> the rush... >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> The contract in current KIP is: >> >> > > >>> > >> >> >> >> > > >> 1. close() - wait until all requests >> >>either >> >> > are >> >> > > >>>sent >> >> > > >>> > >>or >> >> > > >>> > >> >> >>reach >> >> > > >>> > >> >> >> >> > request >> >> > > >>> > >> >> >> >> > > >> timeout. >> >> > > >>> > >> >> >> >> > > >> 2. close(-1, TimeUnit.MILLISECONDS) - >> >>close >> >> > > >>> > >>immediately >> >> > > >>> > >> >> >> >> > > >> 3. close(0, TimeUnit.MILLISECONDS) - >> >> > equivalent >> >> > > >>>to >> >> > > >>> > >> >>close(), >> >> > > >>> > >> >> >> >>i.e. >> >> > > >>> > >> >> >> >> > Wait >> >> > > >>> > >> >> >> >> > > >> until all requests are sent or reach >> >>request >> >> > > >>>timeout >> >> > > >>> > >> >> >> >> > > >> 4. close(5, TimeUnit.MILLISECONDS) - >>try >> >>the >> >> > > >>>best to >> >> > > >>> > >> >>finish >> >> > > >>> > >> >> >> >> sending >> >> > > >>> > >> >> >> >> > > >>in 5 >> >> > > >>> > >> >> >> >> > > >> milliseconds, if something went wrong, >>just >> >> > > >>>shutdown >> >> > > >>> the >> >> > > >>> > >> >> >>producer >> >> > > >>> > >> >> >> >> > > >>anyway, >> >> > > >>> > >> >> >> >> > > >> my callback will handle the failures. >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> About how we define what timeout value >> >>stands >> >> > for, >> >> > > >>>I >> >> > > >>> > >> >>actually >> >> > > >>> > >> >> >> >> > struggled >> >> > > >>> > >> >> >> >> > > >>a >> >> > > >>> > >> >> >> >> > > >> little bit when wrote the patch. >> >>Intuitively, >> >> > > >>>close(0) >> >> > > >>> > >> >>should >> >> > > >>> > >> >> >> >>mean >> >> > > >>> > >> >> >> >> > > >> immediately, however it seems that all >>the >> >> > > >>>existing >> >> > > >>> java >> >> > > >>> > >> >>class >> >> > > >>> > >> >> >> >>have >> >> > > >>> > >> >> >> >> > this >> >> > > >>> > >> >> >> >> > > >> convention of timeout=0 means no >>timeout or >> >> > never >> >> > > >>> > >>timeout >> >> > > >>> > >> >> >> >> > > >>(Thread.join(0), >> >> > > >>> > >> >> >> >> > > >> Object.wait(0), etc.) So here the >>dilemma >> >>is >> >> > > >>>either we >> >> > > >>> > >> >>follow >> >> > > >>> > >> >> >>the >> >> > > >>> > >> >> >> >> > > >> intuition or we follow the convention. >> >>What I >> >> > > >>>chose is >> >> > > >>> > >>to >> >> > > >>> > >> >> >>follow >> >> > > >>> > >> >> >> >>the >> >> > > >>> > >> >> >> >> > > >> convention but document the interface to >> >>let >> >> > user >> >> > > >>>be >> >> > > >>> > >>aware >> >> > > >>> > >> >>of >> >> > > >>> > >> >> >>the >> >> > > >>> > >> >> >> >> > usage. >> >> > > >>> > >> >> >> >> > > >> The reason is that I think >> >>producer.close() is a >> >> > > >>> public >> >> > > >>> > >> >> >> >>interface so >> >> > > >>> > >> >> >> >> > it >> >> > > >>> > >> >> >> >> > > >> might be better to follow java >>convention. >> >> > Whereas >> >> > > >>> > >> >>selector is >> >> > > >>> > >> >> >> >>not a >> >> > > >>> > >> >> >> >> > > >> public interface that used by user, so >>as >> >>long >> >> > as >> >> > > >>>it >> >> > > >>> > >>makes >> >> > > >>> > >> >> >>sense >> >> > > >>> > >> >> >> >>to >> >> > > >>> > >> >> >> >> > us, >> >> > > >>> > >> >> >> >> > > >>it >> >> > > >>> > >> >> >> >> > > >> is less a problem to be different from >>java >> >> > > >>> convention. >> >> > > >>> > >> >>That >> >> > > >>> > >> >> >>said >> >> > > >>> > >> >> >> >> > since >> >> > > >>> > >> >> >> >> > > >> consumer.poll(timeout) is also a public >> >> > interface, >> >> > > >>>I >> >> > > >>> > >>think >> >> > > >>> > >> >>it >> >> > > >>> > >> >> >> >>also >> >> > > >>> > >> >> >> >> > makes >> >> > > >>> > >> >> >> >> > > >> sense to make producer.close() to have >>the >> >>same >> >> > > >>> > >>definition >> >> > > >>> > >> >>of >> >> > > >>> > >> >> >> >> > > >> consumer.poll(timeout). >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> The main argument for keeping a timeout >>in >> >>close >> >> > > >>>would >> >> > > >>> > >>be >> >> > > >>> > >> >> >> >>separating >> >> > > >>> > >> >> >> >> > the >> >> > > >>> > >> >> >> >> > > >> close timeout from request timeout, >>which >> >> > probably >> >> > > >>> makes >> >> > > >>> > >> >> >>sense. I >> >> > > >>> > >> >> >> >> > would >> >> > > >>> > >> >> >> >> > > >> guess typically the request timeout >>would >> >>be >> >> > long >> >> > > >>> (e.g. >> >> > > >>> > >>60 >> >> > > >>> > >> >> >> >>seconds) >> >> > > >>> > >> >> >> >> > > >> because we might want to consider >>retries >> >>with >> >> > > >>>back >> >> > > >>> off >> >> > > >>> > >> >>time. >> >> > > >>> > >> >> >>If >> >> > > >>> > >> >> >> >>we >> >> > > >>> > >> >> >> >> > have >> >> > > >>> > >> >> >> >> > > >> multiple batches in accumulator, in >>worst >> >>case >> >> > > >>>that >> >> > > >>> > >>could >> >> > > >>> > >> >>take >> >> > > >>> > >> >> >> >>up to >> >> > > >>> > >> >> >> >> > > >> several minutes to complete all the >> >>requests. >> >> > But >> >> > > >>>when >> >> > > >>> > >>we >> >> > > >>> > >> >> >>close a >> >> > > >>> > >> >> >> >> > > >> producer, we might not want to wait for >> >>that >> >> > long >> >> > > >>>as >> >> > > >>> it >> >> > > >>> > >> >>might >> >> > > >>> > >> >> >> >>cause >> >> > > >>> > >> >> >> >> > some >> >> > > >>> > >> >> >> >> > > >> other problem like deployment tool >>timeout. >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> There is also a subtle difference >>between >> >> > > >>> close(timeout) >> >> > > >>> > >> >>and >> >> > > >>> > >> >> >> >> > > >> flush(timeout). The only purpose for >> >>flush() is >> >> > to >> >> > > >>> write >> >> > > >>> > >> >>data >> >> > > >>> > >> >> >>to >> >> > > >>> > >> >> >> >>the >> >> > > >>> > >> >> >> >> > > >> broker, so it makes perfect sense to >>wait >> >>until >> >> > > >>> request >> >> > > >>> > >> >> >>timeout. >> >> > > >>> > >> >> >> >>I >> >> > > >>> > >> >> >> >> > think >> >> > > >>> > >> >> >> >> > > >> that is why flush(timeout) looks >>strange. >> >>On the >> >> > > >>>other >> >> > > >>> > >> >>hand, >> >> > > >>> > >> >> >>the >> >> > > >>> > >> >> >> >>top >> >> > > >>> > >> >> >> >> > > >> priority for close() is to close the >> >>producer >> >> > > >>>rather >> >> > > >>> > >>than >> >> > > >>> > >> >> >>flush() >> >> > > >>> > >> >> >> >> > data, >> >> > > >>> > >> >> >> >> > > >>so >> >> > > >>> > >> >> >> >> > > >> close(timeout) gives guarantee on >>bounded >> >> > waiting >> >> > > >>>for >> >> > > >>> > >>its >> >> > > >>> > >> >>main >> >> > > >>> > >> >> >> >>job. >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> Sorry for the confusion about forceClose >> >>flag. >> >> > It >> >> > > >>>is >> >> > > >>> > >>not a >> >> > > >>> > >> >> >>public >> >> > > >>> > >> >> >> >> > > >> interface. I mentioned it in Proposed >> >>Changes >> >> > > >>>section >> >> > > >>> > >> >>which I >> >> > > >>> > >> >> >> >> thought >> >> > > >>> > >> >> >> >> > > >>was >> >> > > >>> > >> >> >> >> > > >> supposed to provide implementation >>details. >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> Thanks again for all the comments and >> >> > suggestions! >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> On 3/10/15, 8:57 PM, "Jiangjie Qin" < >> >> > > >>> j...@linkedin.com> >> >> > > >>> > >> >> wrote: >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >The KIP page has been updated per Jay¹s >> >> > comments. >> >> > > >>> > >> >> >> >> > > >> >I¹d like to initiate the voting process >> >>if no >> >> > > >>>further >> >> > > >>> > >> >> >>comments >> >> > > >>> > >> >> >> >>are >> >> > > >>> > >> >> >> >> > > >> >received by tomorrow. >> >> > > >>> > >> >> >> >> > > >> > >> >> > > >>> > >> >> >> >> > > >> >Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> > > >> > >> >> > > >>> > >> >> >> >> > > >> >On 3/8/15, 9:45 AM, "Jay Kreps" >> >> > > >>><jay.kr...@gmail.com >> >> > > >>> > >> >> > > >>> > >> >>wrote: >> >> > > >>> > >> >> >> >> > > >> > >> >> > > >>> > >> >> >> >> > > >> >>Hey Jiangjie, >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >>Can you capture the full motivation >>and >> >>use >> >> > > >>>cases >> >> > > >>> for >> >> > > >>> > >>the >> >> > > >>> > >> >> >> >>feature? >> >> > > >>> > >> >> >> >> > > >>This >> >> > > >>> > >> >> >> >> > > >> >>mentions your interest in having a >>way of >> >> > > >>>aborting >> >> > > >>> > >>from >> >> > > >>> > >> >> >>inside >> >> > > >>> > >> >> >> >>the >> >> > > >>> > >> >> >> >> > > >> >>Callback. But it doesn't really >>explain >> >>that >> >> > > >>>usage >> >> > > >>> or >> >> > > >>> > >>why >> >> > > >>> > >> >> >>other >> >> > > >>> > >> >> >> >> > people >> >> > > >>> > >> >> >> >> > > >> >>would want to do that. It also doesn't >> >>list >> >> > the >> >> > > >>> > >>primary >> >> > > >>> > >> >>use >> >> > > >>> > >> >> >> >>case >> >> > > >>> > >> >> >> >> for >> >> > > >>> > >> >> >> >> > > >> >>having >> >> > > >>> > >> >> >> >> > > >> >>close with a bounded timeout which >>was to >> >> > avoid >> >> > > >>> > >>blocking >> >> > > >>> > >> >>too >> >> > > >>> > >> >> >> >>long >> >> > > >>> > >> >> >> >> on >> >> > > >>> > >> >> >> >> > > >> >>shutdown. >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >>-Jay >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >>On Sat, Mar 7, 2015 at 12:25 PM, >> >>Jiangjie Qin >> >> > > >>> > >> >> >> >> > > >><j...@linkedin.com.invalid >> >> > > >>> > >> >> >> >> > > >> > >> >> > > >>> > >> >> >> >> > > >> >>wrote: >> >> > > >>> > >> >> >> >> > > >> >> >> >> > > >>> > >> >> >> >> > > >> >>> Hi, >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> >>> I just created a KIP for adding a >> >> > > >>>close(timeout) >> >> > > >>> to >> >> > > >>> > >>new >> >> > > >>> > >> >> >> >> producer. >> >> > > >>> > >> >> >> >> > > >>Most >> >> > > >>> > >> >> >> >> > > >> >>>of >> >> > > >>> > >> >> >> >> > > >> >>> the previous discussions are in >> >>KAFKA-1660 >> >> > > >>>where >> >> > > >>> > >>Parth >> >> > > >>> > >> >> >> >> Brahmbhatt >> >> > > >>> > >> >> >> >> > > >>has >> >> > > >>> > >> >> >> >> > > >> >>> already done a lot of work. >> >> > > >>> > >> >> >> >> > > >> >>> Since this is an interface change so >> >>we are >> >> > > >>>going >> >> > > >>> > >> >>through >> >> > > >>> > >> >> >>the >> >> > > >>> > >> >> >> >> KIP >> >> > > >>> > >> >> >> >> > > >> >>>process. >> >> > > >>> > >> >> >> >> > > >> >>> Here is the KIP link: >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> >> >> > > >>> > >> >> > > >>> >> >> > > >> >>>>https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=537397 >>>>8 >> >> > > >>> > >> >> >> >> > > >> >>>2 >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> >>> Thanks. >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> >>> Jiangjie (Becket) Qin >> >> > > >>> > >> >> >> >> > > >> >>> >> >> > > >>> > >> >> >> >> > > >> > >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > > >> >> > > >>> > >> >> >> >> > >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> >> -- >> >> > > >>> > >> >> >> >> Thanks, >> >> > > >>> > >> >> >> >> Neha >> >> > > >>> > >> >> >> >> >> >> > > >>> > >> >> >> > >> >> > > >>> > >> >> >> > >> >> > > >>> > >> >> >> > >> >> > > >>> > >> >> >> >-- >> >> > > >>> > >> >> >> >-- Guozhang >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> >> >> >> > > >>> > >> >> > >> >> > > >>> > >> >> > >> >> > > >>> > >> >> >-- >> >> > > >>> > >> >> >-- Guozhang >> >> > > >>> > >> >> >> >> > > >>> > >> >> >> >> > > >>> > >> > >> >> > > >>> > >> > >> >> > > >>> > >> >-- >> >> > > >>> > >> >-- Guozhang >> >> > > >>> > >> >> >> > > >>> > >> >> >> > > >>> > >> >> > > >>> > >> >> > > >>> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >>-- >> >> > > >>-- Guozhang >> >> > > > >> >> > > >> >> > > >> >> > >> >> >> >> >> >> >> >> -- >> >> Thanks, >> >> Neha >> > >> >>