Hi Colin,

The read/modify/write is protected by the zk version, right?
Ismael

On Fri, Apr 17, 2020 at 12:53 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Thu, Apr 16, 2020, at 08:51, Ismael Juma wrote:
> > I don't think these requests are necessarily infrequent under multi-tenant
> > environments though. I've seen Controller availability being an issue for
> > describe topics for example (before it was changed to go to any broker).
>
> Hi Ismael,
>
> I don't think DescribeTopics is a good comparison. That RPC is available
> to regular users and is used many orders of magnitude more frequently than
> administrative operations like changing ACLs or setting quotas.
>
> The operations we're talking about redirecting here all require the
> highest possible permissions and will not be frequent in any real-world
> cluster... unless someone is running a stress-test or a benchmark. We
> didn't even notice some of the serious bugs in setting dynamic configs
> until recently because the alterConfigs / incrementalAlterConfigs RPCs are
> so infrequently called.
>
> Additionally, this KIP fixes some existing bugs. The current approach of
> having random writers do a read-modify-write cycle on a configuration znode
> is buggy since it could be interleaved with another node's
> read-modify-write cycle. It has a "lost updates" problem.
>
> For example, node 1 reads a config znode. Node 2 reads the same config
> znode. Node 1 writes back a modified version of the znode. Node 2 writes
> back its (differently) modified version, overwriting the changes from
> node 1.
>
> I don't think anyone ever noticed this problem since, again, these
> operations are very infrequent, making the chance of such a collision low.
> But it is a serious bug that is fixed by having a single writer. (We
> should add this to the KIP...)
>
> >
> > Would it be better to redirect once the controller quorum is there?
>
> This KIP is needed for the bridge release. The bridge release upgrade
> process relies on the old nodes sending their administrative operations to
> the controller quorum, not directly to zookeeper.
>
> best,
> Colin
>
> >
> > Note that this is different from things like AlterIsr since these calls
> > are coming from clients versus other brokers.
> >
> > Ismael
> >
> > On Wed, Apr 15, 2020, 5:10 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Ismael,
> > >
> > > I agree that sending these requests through the controller will not
> > > work during the periods when there is no controller. However, those
> > > periods should be short-- otherwise we have bigger problems in the
> > > cluster.
> > >
> > > These requests are very infrequent because they are administrative
> > > operations. Basically the affected operations are changing ACLs,
> > > changing dynamic configurations, and changing quotas.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Wed, Apr 15, 2020, at 15:25, Ismael Juma wrote:
> > > > Hi Boyang,
> > > >
> > > > Thanks for the KIP. Have we considered that this reduces
> > > > availability for these operations since we have a single Controller
> > > > instead of the ZK quorum?
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Apr 3, 2020 at 4:45 PM Boyang Chen <reluctanthero...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > I would like to start off the discussion for KIP-590, a follow-up
> > > > > initiative after KIP-500:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-590%3A+Redirect+Zookeeper+Mutation+Protocols+to+The+Controller
> > > > >
> > > > > This KIP proposes to migrate existing Zookeeper mutation paths,
> > > > > including configuration, security and quota changes, to
> > > > > controller-only by always routing these alterations to the
> > > > > controller.
> > > > >
> > > > > Let me know your thoughts!
> > > > >
> > > > > Best,
> > > > > Boyang
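
To make the two positions in this thread concrete, here is a minimal sketch of the "lost updates" interleaving Colin describes, next to the version-checked write Ismael is asking about. It is not Kafka or ZooKeeper code: `VersionedNode`, `BadVersionError`, and the helper functions are hypothetical names, and the in-memory node only imitates the semantics of ZooKeeper's `setData(path, data, version)`, where a write is rejected if the supplied version does not match the stored one. Whether the affected Kafka code paths actually pass that version is the open question in the thread, and this sketch does not answer it.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the stored version."""


class VersionedNode:
    """Hypothetical in-memory stand-in for one config znode: payload plus version."""

    def __init__(self, data):
        self._data = dict(data)
        self._version = 0

    def read(self):
        # Return a copy so callers cannot mutate the stored state directly.
        return dict(self._data), self._version

    def write(self, data, expected_version=None):
        # expected_version=None means an unconditional ("blind") write.
        if expected_version is not None and expected_version != self._version:
            raise BadVersionError(
                f"expected {expected_version}, found {self._version}"
            )
        self._data = dict(data)
        self._version += 1


def interleaved_blind_writes():
    """Two writers interleave without a version check: node 1's change is lost."""
    node = VersionedNode({})
    cfg1, _ = node.read()               # node 1 reads
    cfg2, _ = node.read()               # node 2 reads the same state
    cfg1["retention.ms"] = "1000"
    node.write(cfg1)                    # node 1 writes back
    cfg2["cleanup.policy"] = "compact"
    node.write(cfg2)                    # node 2 overwrites node 1's change
    return node.read()[0]


def update_with_version_check(node, key, value, max_retries=5):
    """Read-modify-write loop that retries when another writer got there first."""
    for _ in range(max_retries):
        cfg, version = node.read()
        cfg[key] = value
        try:
            node.write(cfg, expected_version=version)
            return
        except BadVersionError:
            continue                    # stale read; re-read and try again
    raise RuntimeError("too many conflicting writers")


def interleaved_checked_writes():
    """Same interleaving, but the stale write is rejected and then retried."""
    node = VersionedNode({})
    cfg1, v1 = node.read()
    cfg2, v2 = node.read()
    cfg1["retention.ms"] = "1000"
    node.write(cfg1, expected_version=v1)
    cfg2["cleanup.policy"] = "compact"
    try:
        node.write(cfg2, expected_version=v2)   # rejected: v2 is stale
    except BadVersionError:
        update_with_version_check(node, "cleanup.policy", "compact")
    return node.read()[0]
```

In the blind case only node 2's key survives; in the checked case both keys are present after the retry. A single writer, as the KIP proposes, avoids the conflict altogether rather than detecting and retrying it.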