Exactly. I also can't envision scenarios where we would like to throttle
the reassignment traffic to only a subset of the reassigning replicas.

The other day I was wondering whether specialized quotas could solve
incremental partition reassignment too. Basically the controller would
throttle most of the partitions to 0 and let only some of them reassign,
but I discarded the idea because it is more intuitive (and more traceable)
to actually break up a big reassignment into smaller steps.
But perhaps there is a need to throttle the reassigning replicas
differently depending on the produce rate on those partitions. However, in
my mind I was planning on incremental partition reassignment, so perhaps
it would be best if the controller could decide how many partitions fit
into the given bandwidth and we would just expose simple configs.
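
Roughly what I mean by the controller deciding how many partitions fit
into the bandwidth, as a minimal sketch (the names here are made up for
illustration, and it assumes we have an observed per-partition produce
rate available):

    import java.util.LinkedHashMap;

    // Sketch only: greedily admit pending reassignments until the observed
    // per-partition produce rate would exceed the reassignment bandwidth.
    final class ReassignmentFit {
        static int partitionsThatFit(long reassignmentBytesPerSec,
                                     LinkedHashMap<String, Long> produceRateByPartition) {
            long remaining = reassignmentBytesPerSec;
            int fitted = 0;
            for (long rate : produceRateByPartition.values()) {
                if (rate > remaining)
                    break; // the next partition no longer fits in the budget
                remaining -= rate;
                fitted++;
            }
            return fitted;
        }
    }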

If we always take the lowest value, this means that the reassignment
throttle must always be equal to or lower than the replication throttle.
Doesn't that mean that the reassigning partitions may never catch up? I
guess not, since we expect to always be moving less than the total number
of partitions at one time.
I have mixed feelings about this - I like the flexibility of being able to
configure whatever value we please, yet I struggle to come up with a
scenario where we would want a higher reassignment throttle than
replication. Perhaps your suggestion is better.

Yes, it could mean that. However, my concern with preferring the
reassignment quota is that it could cause the "bootstrapping broker
problem": the sum of the follower reassignment + replication quotas would
eat away the bandwidth from the leaders. In this case I think it's a
better problem to have a reassignment that you can't finish than leaders
unable to answer fetch requests fast enough. The reassignment problem can
be mitigated by carefully increasing the replication & reassignment quotas
for the given partition. I'll set up a test environment for this though
and get back if something doesn't add up.
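
Just to make the "take the lowest value" rule concrete, a minimal sketch
(illustrative only, not actual broker code):

    // Apply whichever throttle is lower for a reassigning replica, so a
    // higher reassignment throttle can never push a replica past the
    // replication limit (e.g. on a bootstrapping broker).
    final class EffectiveThrottle {
        static long effectiveRate(long replicationRate, long reassignmentRate,
                                  boolean isReassigning) {
            return isReassigning ? Math.min(replicationRate, reassignmentRate)
                                 : replicationRate;
        }
    }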

This begs another question - since we're separating the replication
throttle from the reassignment throttle, the maximum traffic a broker may
replicate now becomes `replication.throttled.rate` +
`reassignment.throttled.rate`. Seems like we would benefit from having a
total cap to ensure users don't shoot themselves in the foot.

We could have a new config that denotes the total possible throttle rate
and then divide that between reassignment and replication. But that
assumes that we would set the replication.throttled.rate much lower than
what the broker could handle.

Perhaps the best approach would be to denote how much the broker can
handle (total.replication.throttle.rate) and then allow only up to N% of
that to go towards reassignments (reassignment.throttled.rate) in a
best-effort way (preferring replication traffic). That sounds tricky to
implement though. Interested to hear what others think.

Good catch. I'm also leaning towards having simpler configs and improving
the broker/controller code to make more intelligent decisions. I also
agree with having a total.replication.throttle.rate, but I think we should
stay with the byte-based notation, as that is more conventional in the
quota world and easier to handle. That way you can say that your total
replication quota is 10, your leader and follower replication quotas are 3
each and the reassignment ones are 2 each, and then you've maxed out your
limit. We can print warnings/errors if the component values don't add up
to the max.
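
To make that bookkeeping concrete, here is a rough sketch (illustrative
only, not proposed code; the names are made up): reassignment gets
bandwidth in a best-effort way, preferring replication, and we warn if the
per-direction quotas add up to more than the total cap:

    // Illustrative sketch: best-effort reassignment budget plus a sanity
    // check that the configured quotas do not exceed the total cap.
    final class ThrottleMath {
        // Reassignment only gets bandwidth that replication is not using,
        // capped at maxReassignmentPct of the total.
        static long reassignmentBudget(long totalBytesPerSec,
                                       long replicationBytesInUse,
                                       double maxReassignmentPct) {
            long leftover = Math.max(0L, totalBytesPerSec - replicationBytesInUse);
            long cap = (long) (totalBytesPerSec * maxReassignmentPct / 100.0);
            return Math.min(leftover, cap);
        }

        // Warn when the component quotas exceed the total, e.g. total = 10,
        // leader/follower replication = 3 each and reassignment = 2 each is
        // exactly maxed out (3 + 3 + 2 + 2 = 10) and produces no warning.
        static void checkQuotas(long total, long leaderRepl, long followerRepl,
                                long leaderReassign, long followerReassign) {
            long sum = leaderRepl + followerRepl + leaderReassign + followerReassign;
            if (sum > total)
                System.err.printf("Warning: quotas sum to %d, above the total cap of %d%n",
                                  sum, total);
        }
    }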

Viktor

On Mon, Nov 4, 2019 at 12:27 PM Stanislav Kozlovski <stanis...@confluent.io>
wrote:

> Hi Viktor,
>
> > As for the first question I think there is no need for *.throttled.replicas in
> case of reassignment because the LeaderAndIsrRequest exactly specifies the
> replicas needed to be throttled.
>
> Exactly. I also can't envision scenarios where we would like to throttle
> the reassignment traffic to only a subset of the reassigning replicas.
>
> > For instance a bootstrapping server where all replicas are throttled and
> there are reassigning replicas and the reassignment throttle set higher I
> think we should still apply the replication throttle to ensure the broker
> won't have problems. What do you think?
>
> If we always take the lowest value, this means that the reassignment
> throttle must always be equal to or lower than the replication throttle.
> Doesn't that mean that the reassigning partitions may never catch up? I
> guess not, since we expect to always be moving less than the total number
> of partitions at one time.
> I have mixed feelings about this - I like the flexibility of being able to
> configure whatever value we please, yet I struggle to come up with a
> scenario where we would want a higher reassignment throttle than
> replication. Perhaps your suggestion is better.
>
> This begs another question - since we're separating the replication
> throttle from the reassignment throttle, the maximum traffic a broker may
> replicate now becomes `replication.throttled.rate` +
> `reassignment.throttled.rate`.
> Seems like we would benefit from having a total cap to ensure users don't
> shoot themselves in the foot.
>
> We could have a new config that denotes the total possible throttle rate
> and we then divide that by reassignment and replication. But that assumes
> that we would set the replication.throttled.rate much lower than what the
> broker could handle.
>
> Perhaps the best approach would be to denote how much the broker can handle
> (total.replication.throttle.rate) and then allow only up to N% of that go
> towards reassignments (reassignment.throttled.rate) in a best-effort way
> (preferring replication traffic). That sounds tricky to implement though
> Interested to hear what others think
>
> Best,
> Stanislav
>
>
> On Mon, Nov 4, 2019 at 11:08 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Hey Stan,
> >
> > > We will introduce two new configs in order to eventually replace
> > *.replication.throttled.rate.
> > Just to clarify, you mean to replace said config in the context of
> > reassignment throttling, right? We are not planning to remove that config
> >
> > Yes, I don't want to remove that config either. Removed that sentence.
> >
> > And also to clarify, *.throttled.replicas will not apply to the new
> > *reassignment* configs, correct? We will throttle all reassigning
> replicas.
> > (I am +1 on this, I believe it is easier to reason about. We could always
> > add a new config later)
> >
> > Are you asking whether there is a need for a
> > leader.reassignment.throttled.replicas and
> > follower.reassignment.throttled.replicas config or are you interested in
> > the behavior between the old and the new configs?
> > As for the first question I think there is no need for *.throttled.replicas in
> > case of reassignment because the LeaderAndIsrRequest exactly specifies
> the
> > replicas needed to be throttled.
> > As for the second, see below.
> >
> > I have one comment about backwards-compatibility - should we ensure that
> > the old `*.replication.throttled.rate` and `*.throttled.replicas` still
> > apply to reassigning traffic if set? We could have the new config take
> > precedence, but still preserve backwards compatibility.
> >
> > Sure, we should apply replication throttling to reassignment too if set.
> > But instead of the new taking precedence I'd apply whichever has lower
> > value.
> > For instance a bootstrapping server where all replicas are throttled and
> > there are reassigning replicas and the reassignment throttle set higher I
> > think we should still apply the replication throttle to ensure the broker
> > won't have problems. What do you think?
> >
> > Thanks,
> > Viktor
> >
> >
> > On Fri, Nov 1, 2019 at 9:57 AM Stanislav Kozlovski <
> stanis...@confluent.io
> > >
> > wrote:
> >
> > > Hey Viktor. Thanks for the KIP!
> > >
> > > > We will introduce two new configs in order to eventually replace
> > > *.replication.throttled.rate.
> > > Just to clarify, you mean to replace said config in the context of
> > > reassignment throttling, right? We are not planning to remove that
> config
> > >
> > > And also to clarify, *.throttled.replicas will not apply to the new
> > > *reassignment* configs, correct? We will throttle all reassigning
> > replicas.
> > > (I am +1 on this, I believe it is easier to reason about. We could
> always
> > > add a new config later)
> > >
> > > I have one comment about backwards-compatibility - should we ensure
> that
> > > the old `*.replication.throttled.rate` and `*.throttled.replicas` still
> > > apply to reassigning traffic if set? We could have the new config take
> > > precedence, but still preserve backwards compatibility.
> > >
> > > Thanks,
> > > Stanislav
> > >
> > > On Thu, Oct 24, 2019 at 1:38 PM Viktor Somogyi-Vass <
> > > viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Hi People,
> > > >
> > > > I've created a KIP to improve replication quotas by handling
> > reassignment
> > > > related throttling as a separate case with its own configurable
> limits
> > > and
> > > > change the kafka-reassign-partitions tool to use these new configs
> > going
> > > > forward.
> > > > Please have a look, I'd be happy to receive any feedback and answer
> > > > all your questions.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > > >
> > > > Thanks,
> > > > Viktor
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
> >
>
>
> --
> Best,
> Stanislav
>