Re: [VOTE] KIP-15 add a close with timeout to new producer

Joel Koshy Thu, 26 Mar 2015 15:40:52 -0700

Yes it is killing the thread (and effectively the producer as well)
but I think that is appropriate since calling close with a timeout from
callback is fundamentally a programming error - specifically an
illegal argument (in the context of the callback).


There are three options as I see it (in decreasing order of weirdness
IMO):
- The current proposal to block forever is as you said odd since
  hanging does not necessarily draw attention to what is a programming
  error.
- Document that close(t > 0) should not be called from callback and it
  will be treated as close(0). I think there is precedent in Java APIs
  to interpreting an argument as something else (e.g., negative
  treated as 0) but this is a step beyond that in that it is
  contextual. i.e., it is reinterpreted only when called from the
  callback.
- Throw the exception, which will have a similar effect to close(0)
  but that seems appropriate for a programming error.

That said, I think any of these are okay since we will be documenting
it.

Joel

On Thu, Mar 26, 2015 at 02:17:23PM -0700, Jay Kreps wrote:
> Hmm, but won't the impact of throwing the exception just be killing the
> sender thread? i.e. the user won't see it unless they check the logs, which
> is the same as logging the error.
> 
> Is there a problem with just logging an error and then blocking for the
> amount of time requested?
> 
> -Jay
> 
> On Thu, Mar 26, 2015 at 2:03 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> 
> > Talked to Jiangjie offline - actually looking at the code, we could
> > just extend java.lang.Error. Alternately we could throw an
> > IllegalArgumentException and though we catch Exception, we could catch
> > that explicitly and rethrow to cause the sender to just exit.
> >
> > On Thu, Mar 26, 2015 at 01:29:41PM -0700, Jay Kreps wrote:
> > > Hey guys,
> > >
> > > I think there are really two choices:
> > > 1. Blocking for the time the user requested
> > > 2. Blocking indefinitely irrespective of what the user requested
> > >
> > > When we were discussing I thought we were talking about (1) but I think
> > > really you were proposing (2).
> > >
> > > (1) seems defensible. After all you asked us to block for that long and
> > we
> > > did, we logged a warning so hopefully you will notice and correct it.
> > >
> > > (2) seems odd. There are many areas where we log an error but we never do
> > > sleep(INFINITY) to draw attention to the problem, right? Changing the
> > > blocking time to infinite is arguably as weird as changing it to 0, no?
> > >
> > > I don't want to prolong the discussion too long but I do think it is a
> > bit
> > > weird.
> > >
> > > I think that likely very few people will call close from within a
> > callback
> > > so it probably isn't a huge issue one way or another.
> > >
> > > -Jay
> > >
> > > On Thu, Mar 26, 2015 at 11:03 AM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >
> > > > I was previously preferring logging + exist, but Jiangjie had a point
> > that
> > > > by doing this we are effectively silently changing the close(>0) call
> > to
> > > > close(0) call although we log an error to it. The problem is that
> > users may
> > > > just think the close(>0) call exit normally and do not check the logs.
> > > >
> > > > Guozhang
> > > >
> > > > On Thu, Mar 26, 2015 at 6:31 AM, Jiangjie Qin
> > <j...@linkedin.com.invalid>
> > > > wrote:
> > > >
> > > > > Hi Neha,
> > > > >
> > > > > I totally agree that from program behavior point of view, blocking
> > is not
> > > > > a good idea.
> > > > >
> > > > > I think the ultimate question here is whether we define calling
> > > > > close()/close(timeout) from callback as a legal usage or not.
> > > > >
> > > > > If it is a legal usage, logging a warning and exit makes perfect
> > sense,
> > > > we
> > > > > just need to document that in this case close() is same as close(0).
> > > > >
> > > > > On the other hand, if we define this as an illegal usage and want
> > users
> > > > to
> > > > > correct it, blocking has merit in some cases.
> > > > > Let零 say I雋 writing a unit test for exit-on-send-faiure and call
> > > > close()
> > > > > in callback, if we detect the mis-usage of close() and replace it
> > with
> > > > > close(0). User might never know that they were not using the right
> > > > > interface in the callback, because the unit test will pass just with
> > an
> > > > > error log which might not even be printed. So we are indulging user
> > to
> > > > use
> > > > > a wrong interface in this case.
> > > > >
> > > > > My previous assumption was that we don靖 want to allow user to use
> > > > > close()/close(timeout) in callback. That零 why I preferred blocking.
> > > > >
> > > > > That said, I do not have strong opinion over the options, so I雋 happy
> > > > > with whichever way we decide to go.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On 3/25/15, 9:02 PM, "Neha Narkhede" <n...@confluent.io> wrote:
> > > > >
> > > > > >>
> > > > > >> We have agreed that we will have an error log to inform user about
> > > > this
> > > > > >> mis-usage. The options differ in the way how we can force user to
> > > > take a
> > > > > >> look at that error log.
> > > > > >
> > > > > >
> > > > > >Since we have to detect the problem in order to log an appropriate
> > error
> > > > > >message, we have a way to tell if the user is doing something that
> > will
> > > > > >cause their application to block indefinitely, which can't be ideal
> > > > > >behavior in any case I can imagine.
> > > > > >
> > > > > >My concern is that this might add to this long list
> > > > > ><http://ingest.tips/2015/03/26/what-is-confusing-about-kafka/> of
> > > > > >confusing
> > > > > >Kafka behaviors, so I'd vote to log an appropriate error message and
> > > > exit.
> > > > > >
> > > > > >On Wed, Mar 25, 2015 at 10:04 AM, Jiangjie Qin
> > > > <j...@linkedin.com.invalid
> > > > > >
> > > > > >wrote:
> > > > > >
> > > > > >> Hi Jay,
> > > > > >>
> > > > > >> The reason we keep the blocking behavior if close() or
> > close(timeout)
> > > > is
> > > > > >> called from callback is discussed in another thread.
> > > > > >> I copy/paste the message here:
> > > > > >>
> > > > > >> It looks that the problem we want to solve and the purpose we
> > want to
> > > > > >> achieve is:
> > > > > >> If user uses close() in callback, we want to let user be aware
> > that
> > > > they
> > > > > >> should use close(0) instead of close() in the callback.
> > > > > >>
> > > > > >> We have agreed that we will have an error log to inform user about
> > > > this
> > > > > >> mis-usage. The options differ in the way how we can force user to
> > > > take a
> > > > > >> look at that error log.
> > > > > >> There are two scenarios:
> > > > > >> 1. User does not expect the program to exit.
> > > > > >> 2. User expect the program to exit.
> > > > > >>
> > > > > >> For scenario 1), blocking will probably delay the discovery of the
> > > > > >> problem. Calling close(0) exposes the problem quicker. In this
> > > > scenario
> > > > > >> producer just encounter a send failure when running normally.
> > > > > >> For scenario 2), blocking will expose the problem quick. Calling
> > > > > >>close(-1)
> > > > > >> might hide the problem. This scenario might include: a) Unit test
> > for
> > > > a
> > > > > >> send failure. b) Message sent during a close() call from a user
> > > > thread.
> > > > > >>
> > > > > >> So as a summary table:
> > > > > >>
> > > > > >>                  Scenario 1)                     Scenario 2)
> > > > > >>
> > > > > >> Blocking    Delay problem discovery        Guaranteed problem
> > > > discovery
> > > > > >>
> > > > > >> Close(-1)   Immediate problem discovery    Problem might be hidden
> > > > > >>
> > > > > >>
> > > > > >> Personally I prefer blocking because it seems providing more
> > > > guarantees
> > > > > >> and safer.
> > > > > >>
> > > > > >> Thanks.
> > > > > >>
> > > > > >> Jiangjie (Becket) Qin
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On 3/25/15, 9:20 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
> > > > > >>
> > > > > >> >Wait, actually, why would the thread block forever? I would
> > > > understand
> > > > > >> >throwing an error and failing immediately (fail fast) and I would
> > > > > >> >understand logging an error and blocking for the time they
> > specified
> > > > > >> >(since
> > > > > >> >that is what they asked for), but the logging an error and
> > putatively
> > > > > >> >blocking forever if they only specified a wait time of say 15 ms
> > just
> > > > > >> >seems
> > > > > >> >weird, right? There is no other error condition where we
> > > > intentionally
> > > > > >> >block forever as far as I know.
> > > > > >> >
> > > > > >> >Sorry to keep going around and around on this minor point I just
> > want
> > > > > >>to
> > > > > >> >clarify that that is what you mean.
> > > > > >> >
> > > > > >> >-Jay
> > > > > >> >
> > > > > >> >On Wed, Mar 25, 2015 at 9:17 AM, Jay Kreps <jay.kr...@gmail.com>
> > > > > wrote:
> > > > > >> >
> > > > > >> >> +1
> > > > > >> >>
> > > > > >> >> -Jay
> > > > > >> >>
> > > > > >> >> On Tue, Mar 24, 2015 at 2:43 PM, Guozhang Wang <
> > wangg...@gmail.com
> > > > >
> > > > > >> >>wrote:
> > > > > >> >>
> > > > > >> >>> +1.
> > > > > >> >>>
> > > > > >> >>> On Tue, Mar 24, 2015 at 2:15 PM, Jiangjie Qin
> > > > > >> >>><j...@linkedin.com.invalid>
> > > > > >> >>> wrote:
> > > > > >> >>>
> > > > > >> >>> >
> > > > > >> >>> >
> > > > > >> >>>
> > > > > >> >>>
> > > > > >>
> > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-15+-+Add+a+close+m
> > > > > >> >>>ethod+with+a+timeout+in+the+producer
> > > > > >> >>> >
> > > > > >> >>> > As a short summary, the new interface will be as following:
> > > > > >> >>> > Close(Long Timeout, TimeUnit timeUnit)
> > > > > >> >>> >
> > > > > >> >>> >   1.  When timeout > 0, it will try to wait up to timeout
> > for
> > > > the
> > > > > >> >>>sender
> > > > > >> >>> > thread to complete all the requests, then join the sender
> > > > thread.
> > > > > >>If
> > > > > >> >>>the
> > > > > >> >>> > sender thread is not able to finish work before timeout, the
> > > > > >>method
> > > > > >> >>> force
> > > > > >> >>> > close the producer by fail all the incomplete requests and
> > join
> > > > > >>the
> > > > > >> >>> sender
> > > > > >> >>> > thread.
> > > > > >> >>> >   2.  When timeout = 0, it will be a non-blocking call, just
> > > > > >>initiate
> > > > > >> >>> the
> > > > > >> >>> > force close and DOES NOT join the sender thread.
> > > > > >> >>> >
> > > > > >> >>> > If close(timeout) is called from callback, an error message
> > will
> > > > > >>be
> > > > > >> >>> logged
> > > > > >> >>> > and the producer sender thread will block forever.
> > > > > >> >>> >
> > > > > >> >>> >
> > > > > >> >>>
> > > > > >> >>>
> > > > > >> >>> --
> > > > > >> >>> -- Guozhang
> > > > > >> >>>
> > > > > >> >>
> > > > > >> >>
> > > > > >>
> > > > > >>
> > > > > >
> > > > > >
> > > > > >--
> > > > > >Thanks,
> > > > > >Neha
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> >
> >

-- 
Joel

Re: [VOTE] KIP-15 add a close with timeout to new producer

Reply via email to