Just to elaborate on what Ben said about why we need throttling on both
the leader and the follower side.
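To put rough numbers on the two failure modes (the quota values and broker
counts here are made up for illustration, not from the KIP):

    // Worst-case arithmetic for the example below. Follower-only and
    // leader-only quotas each leave one side of the link unbounded.
    public class ReplicationQuotaWorstCase {
        public static void main(String[] args) {
            // Follower-side quota only: each of the 5 new brokers caps its
            // inbound replication traffic at 10 MB/s, but if all 5 happen
            // to fetch mostly from one existing broker, that broker's
            // outbound traffic can approach 5 * 10 = 50 MB/s.
            double followerQuota = 10.0; // MB/s per new broker
            int newBrokers = 5;
            System.out.printf("leader outbound worst case: %.0f MB/s%n",
                newBrokers * followerQuota);

            // Leader-side quota only: each of the 10 existing brokers caps
            // its outbound replication traffic at 10 MB/s, but a new broker
            // fetching from all 10 can still ingest up to 100 MB/s.
            double leaderQuota = 10.0;   // MB/s per existing broker
            int existingBrokers = 10;
            System.out.printf("follower inbound worst case: %.0f MB/s%n",
                existingBrokers * leaderQuota);
        }
    }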
If we only have throttling on the follower side, consider a case where we
add 5 new brokers and want to move some replicas from the existing brokers
over to them. Each of those 5 brokers is going to fetch data from all of
the existing brokers. It's then possible for the aggregate fetch load from
those 5 brokers on a particular existing broker to exceed its outgoing
network bandwidth, even though the inbound traffic on each of the 5
brokers is bounded.

If we only have throttling on the leader side, consider the same example.
It's possible for the incoming traffic to each of those 5 brokers to
exceed its network bandwidth, since each one is fetching data from all of
the existing brokers.

So, being able to set a quota on both the follower and the leader side
protects against both cases.

Thanks,

Jun

On Tue, Aug 9, 2016 at 4:43 AM, Ben Stopford <b...@confluent.io> wrote:

> Hi Joel
>
> Thanks for taking the time to look at this. Appreciated.
>
> Regarding throttling on both leader and follower, this proposal covers a
> more general solution which can guarantee a quota even when a rebalance
> operation produces an asymmetric profile of load. This means
> administrators don’t need to calculate the impact that a follower-only
> quota will have on the leaders being fetched from, for example where
> replica sizes are skewed or where a partial rebalance is required.
>
> Having said that, even with both leader and follower quotas, the use of
> additional threads is actually optional. There appear to be two general
> approaches: (1) omit partitions from fetch requests (follower) / fetch
> responses (leader) when they exceed their quota; (2) delay them, as the
> existing quota mechanism does, using separate fetchers. Both appear
> valid, but with slightly different design tradeoffs.
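> For concreteness, approach (1) on the follower side might look roughly
> like the sketch below. The names and the shape of the rate measurement
> are made up for illustration; this isn’t the actual patch.
>
>     import java.util.List;
>     import java.util.Set;
>     import java.util.stream.Collectors;
>
>     class ThrottledFetchBuilder {
>         private final Set<String> throttled;   // partitions marked for throttling
>         private final double quotaBytesPerSec; // broker-level replication quota
>
>         ThrottledFetchBuilder(Set<String> throttled, double quotaBytesPerSec) {
>             this.throttled = throttled;
>             this.quotaBytesPerSec = quotaBytesPerSec;
>         }
>
>         // measuredRate is the current aggregate bytes/sec consumed by the
>         // throttled partitions (e.g. from a sampled rate metric). Under
>         // quota, fetch everything; over quota, omit the throttled
>         // partitions and retry them on a later fetch once the measured
>         // rate has fallen back below the limit.
>         List<String> partitionsForNextFetch(List<String> assigned,
>                                             double measuredRate) {
>             if (measuredRate < quotaBytesPerSec)
>                 return assigned;
>             return assigned.stream()
>                            .filter(tp -> !throttled.contains(tp))
>                            .collect(Collectors.toList());
>         }
>     }
>
> The same check, applied to the fetch response on the leader side, bounds
> the leader’s outbound replication traffic; and it matches the
> self-throttling you describe below, in that throttled partitions simply
> drop out of subsequent fetch requests until the rate recovers.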
> The issue with approach (1) is that it departs somewhat from the existing
> quotas implementation, and must include a notion of fairness within the
> now size-bounded request and response. The issue with (2) is guaranteeing
> ordering of updates when replicas shift threads, but this is handled, for
> the most part, in the code today.
>
> I’ve updated the rejected alternatives section to make this a little
> clearer.
>
> B
>
> On 8 Aug 2016, at 20:38, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > Hi Ben,
> >
> > Thanks for the detailed write-up. So the proposal involves
> > self-throttling on the fetcher side and throttling at the leader. Can
> > you elaborate on the reasoning given on the wiki: *“The throttle is
> > applied to both leaders and followers. This allows the admin to exert
> > strong guarantees on the throttle limit.”* Is there any reason why one
> > or the other wouldn't be sufficient?
> >
> > Specifically, if we were to only do self-throttling on the fetchers, we
> > could potentially avoid the additional replica fetchers, right? i.e.,
> > each replica fetcher would maintain its quota metrics as you proposed,
> > and each (normal) replica fetch presents an opportunity to make progress
> > for the throttled partitions as long as their effective consumption rate
> > is below the quota limit. If it exceeds the quota limit, then don’t
> > include the throttled partitions in the subsequent fetch requests until
> > the effective consumption rate for those partitions returns to within
> > the quota threshold.
> >
> > I have more questions on the proposal, but was more interested in the
> > above to see if it could simplify things a bit.
> >
> > Also, can you open up access to the google-doc that you link to?
> >
> > Thanks,
> >
> > Joel
> >
> > On Mon, Aug 8, 2016 at 5:54 AM, Ben Stopford <b...@confluent.io> wrote:
> >
> >> We’ve created KIP-73: Replication Quotas
> >>
> >> The idea is to allow an admin to throttle moving replicas. Full details
> >> are here:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> >>
> >> Please take a look and let us know your thoughts.
> >>
> >> Thanks
> >>
> >> B