Hey Guys,

Looks like Jun and Jiangjie have covered the motivation behind the config.
To me, one would want to set the ISR propagation delay to a high value
primarily during a rolling upgrade. I think the delay one would want in
ISR propagation is proportional to the cluster size; during normal
operations, however, a faster ISR propagation is desired. Having a config
to expose the delay gives admins a way to control it, and it will come
with a default value, so anyone who does not want to tune it can simply
leave it alone.

@Gwen, does that answer your question?
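
For concreteness, the knob being discussed might look something like the
broker config sketch below. The property name and default are purely
hypothetical; the actual name and value would be whatever KAFKA-2406
settles on.

```properties
# Hypothetical property name, for illustration only --
# the real name/default are determined in KAFKA-2406.
# Interval between batched ISR change notifications written to ZK:
isr.change.propagation.interval.ms=60000
```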

On Sun, Aug 9, 2015 at 3:26 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

> Jun,
>
> Thanks for the detailed explanation. Now I understand. Yes, we might not
> be able to leverage the group commit in this case.
>
> When I was testing the patch, I also found a potential use case for the
> config (not sure if that is a strong use case though). When we do a
> rolling upgrade of a cluster, if the controller is still running on an
> old version, the brokers that get bounced will create a bunch of ZK paths
> that will not be picked up by the controller until the controller is
> upgraded to the new version. That might be fine for small clusters, but
> for larger clusters many ISR changes could accumulate before the
> controller runs on the new version. So if we have the config, when people
> do a rolling upgrade, they can first set the propagation interval to a
> very large value so ISR changes will not be propagated. After all the
> brokers are running on the new version, they can set the propagation
> interval back to a short value and bounce the brokers again.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sun, Aug 9, 2015 at 12:22 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Jiangjie,
> >
> > Related to group commit, I think what Jay says is the following. Suppose
> > the propagation of ISR changes is slow and also creates back pressure;
> > then you can just propagate the ISR changes one batch at a time, as fast
> > as you can. After one batch is sent, you just collect all ISR changes
> > that have accumulated since the last propagation and propagate them as a
> > batch. This way, the amount of batching happens automatically based on
> > the propagation rate, and the delay of propagation is minimized.
> > However, this doesn't quite work if we propagate the ISR changes by
> > writing a notification path in ZK. There is no back pressure: the
> > notification can be written faster than the listener can process it. So
> > the explicit batching that you did is essentially a way to get around
> > this problem.
> >
> > As for the config, I can see a couple of use cases: (1) We have unit
> > tests that verify the propagation of ISR changes. To prevent those tests
> > from taking too long, we can configure a smaller propagation interval.
> > (2) When there are many partitions and brokers, someone may want to
> > increase the propagation interval to reduce the ZK overhead. I agree
> > that most people likely don't need to change this setting since the
> > default should be good enough.
> >
> > Thanks,
> >
> > Jun
> >
> >
> >
> > On Fri, Aug 7, 2015 at 4:06 PM, Gwen Shapira <g...@confluent.io> wrote:
> >
> > > Maybe Ashish can supply the use-case and tuning advice then :)
> > > I'm a -1 on adding new configurations that we can't quite explain.
> > >
> > > On Fri, Aug 7, 2015 at 3:57 PM, Jiangjie Qin
> > > <j...@linkedin.com.invalid> wrote:
> > >
> > > > Hi Gwen,
> > > >
> > > > Completely agree with you. I originally just hard-coded it to be 10
> > > > seconds. Ashish raised this requirement in KAFKA-2406 because people
> > > > might want ISR changes to get propagated more quickly.
> > > > I don't have a good use case myself. Personally I think hard-coding
> > > > it is fine, although I don't object to making it configurable.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > >
> >
>
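
Jun's group-commit description above can be sketched roughly like this. A
minimal illustration in Python, not Kafka's actual implementation; all
names are invented. The key point is that while one batch is being sent,
new changes keep accumulating, so the batch size adapts automatically to
the send rate.

```python
import threading


class IsrChangePropagator:
    """Group-commit style propagation sketch (hypothetical, not Kafka code).

    While one batch is in flight, new ISR changes accumulate; the next
    drain picks up everything that arrived in the meantime, so batching
    adapts to the propagation rate without a fixed interval.
    """

    def __init__(self, send_batch):
        # send_batch is the slow operation with back pressure,
        # e.g. writing one notification to ZK.
        self._send_batch = send_batch
        self._pending = []
        self._lock = threading.Lock()

    def record_isr_change(self, partition):
        # Called by the broker whenever an ISR changes; cheap and non-blocking.
        with self._lock:
            self._pending.append(partition)

    def propagate_once(self):
        # Drain everything accumulated since the last propagation and
        # send it as a single batch; returns the batch size.
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self._send_batch(batch)  # next drain waits until this returns
        return len(batch)


# Usage: three changes recorded before the next propagation cycle
# go out as one batch.
sent = []
propagator = IsrChangePropagator(sent.append)
for partition in ["topic-0", "topic-1", "topic-2"]:
    propagator.record_isr_change(partition)
propagator.propagate_once()
```

As Jun notes, this only self-regulates when the send path pushes back; a
fire-and-forget ZK notification provides no such back pressure, which is
why an explicit interval-based batch is needed there instead.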



-- 

Regards,
Ashish
