Re: [DISCUSS] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

Boyang Chen Tue, 21 Apr 2020 11:33:26 -0700

Hey Tom,

I agree with the claim here. All the brokers should have the same
authentication power, which means getting the forwarding broker verify the
client request first is more favorable. This approach avoids sending one
unnecessary forwarding request if it couldn't pass the authorization in the
first place.


In the meantime, could you give more context on the custom Kafka principal
you are referring to? How does that get encoded today, and how server and
client could both agree on the serialization? As the plain principal is
only a String, I would like to know more about the security strategy people
are using, thanks!

Boyang

On Tue, Apr 21, 2020 at 2:24 AM Tom Bentley <tbent...@redhat.com> wrote:

> Hi Boyang,
>
> The answer to my original question about the request principal was that the
> forwarding broker would authorize the request and the controller would
> trust the request since it was from another broker. AFAIU you added the
> principal purely for logging purposes. In the "EnvelopeRequest Handling"
> section the KIP now says "Once that part is done, we shall replace the
> request context with Principal information embedded inside the
> EnvelopeRequest to complete the inner request permission check.", which
> sounds to me like the controller is now authorizing the request (maybe in
> addition to the forwarding broker) using a principal it's deserialized from
> the EnvelopeRequest. I don't think that works if a custom principal builder
> is returning a subclass of KafkaPrincipal (the Javadoc for KafkaPrincipal
> describes the contract I'm talking about). Basically the controller would
> not be able to instantiate the subclass (even if that was included in the
> envelope) because it wouldn't necessarily know the signature of the
> constructor. Nor can it use the principal builder itself because it doesn't
> have access to the original AuthenticationContext. Maybe you figure out
> some way to make it work, otherwise I think the best you can do is to
> revert to the model you had before where the controller trusts the embedded
> request because it's been received from a broker.
>
> Cheers,
>
> Tom
>
> On Sat, Apr 18, 2020 at 8:56 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Fri, Apr 17, 2020, at 13:11, Ismael Juma wrote:
> > > Hi Colin,
> > >
> > > The read/modify/write is protected by the zk version, right?
> > >
> > > Ismael
> >
> > No, we don't use the ZK version when doing the write to the config
> > znodes.  We do for ACLs, I think.
> >
> > This is something that we could fix just by using the ZK version, but
> > there are other race conditions like what if we're deleting a topic while
> > setting this config, etc.  A single writer is a lot easier to reason
> about.
> >
> > best,
> > Colin
> >
> >
> > >
> > > On Fri, Apr 17, 2020 at 12:53 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > >
> > > > On Thu, Apr 16, 2020, at 08:51, Ismael Juma wrote:
> > > > > I don't think these requests are necessarily infrequent under multi
> > > > tenant
> > > > > environments though. I've seen Controller availability being an
> > issue for
> > > > > describe topics for example (before it was changed to go to any
> > broker).
> > > >
> > > > Hi Ismael,
> > > >
> > > > I don't think DescribeTopics is a good comparison.  That RPC is
> > available
> > > > to regular users and is used many orders of magnitude more frequently
> > than
> > > > administrative operations like changing ACLs or setting quotas.
> > > >
> > > > The operations we're talking about redirecting here all require the
> > > > highest possible permissions and will not be frequent in any
> real-world
> > > > cluster... unless someone is running a stress-test or a benchmark.
> We
> > > > didn't even notice some of the serious bugs in setting dynamic
> configs
> > > > until recently because the alterConfigs / incrementalAlterConfigs
> RPCs
> > are
> > > > so infrequently called.
> > > >
> > > > Additionally, this KIP fixes some existing bugs.  The current
> approach
> > of
> > > > having random writers do a read-write-modify cycle on a configuration
> > znode
> > > > is buggy since it could be interleaved with another node's
> read-modify
> > > > write cycle.  It has a "lost updates" problem.
> > > >
> > > > For example, node 1 reads a config znode.  Node 2 reads the same
> config
> > > > znode.  Node 1 writes back a modified version of the znode.  Node 2
> > writes
> > > > back its (differently) modified version, overwriting the changes from
> > node
> > > > 1.
> > > >
> > > > I don't think anyone ever noticed this problem since, again, these
> > > > operations are very infrequent, making the chance of such a collision
> > low.
> > > > But it is a serious bug that is fixed by having a single writer.  (We
> > > > should add this to the KIP...)
> > > >
> > > > >
> > > > > Would it be better to redirect once the controller quorum is there?
> > > >
> > > > This KIP is needed for the bridge release.  The bridge release
> upgrade
> > > > process relies on the old nodes sending their administrative
> > operations to
> > > > the controller quorum, not directly to zookeeper.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > >
> > > > > Note that this is different from things like AlterIsr since these
> > calls
> > > > are
> > > > > coming from clients versus other brokers.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Wed, Apr 15, 2020, 5:10 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi Ismael,
> > > > > >
> > > > > > I agree that sending these requests through the controller will
> not
> > > > work
> > > > > > during the periods when there is no controller.  However, those
> > periods
> > > > > > should be short-- otherwise we have bigger problems in the
> cluster.
> > > > > >
> > > > > > These requests are very infrequent because they are
> administrative
> > > > > > operations.  Basically the affected operations are changing ACLs,
> > > > changing
> > > > > > dynamic configurations, and changing quotas.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 15, 2020, at 15:25, Ismael Juma wrote:
> > > > > > > Hi Boyang,
> > > > > > >
> > > > > > > Thanks for the KIP. Have we considered that this reduces
> > > > availability for
> > > > > > > these operations since we have a single Controller instead of
> > the ZK
> > > > > > quorum?
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Fri, Apr 3, 2020 at 4:45 PM Boyang Chen <
> > > > reluctanthero...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey all,
> > > > > > > >
> > > > > > > > I would like to start off the discussion for KIP-590, a
> > follow-up
> > > > > > > > initiative after KIP-500:
> > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-590%3A+Redirect+Zookeeper+Mutation+Protocols+to+The+Controller
> > > > > > > >
> > > > > > > > This KIP proposes to migrate existing Zookeeper mutation
> paths,
> > > > > > including
> > > > > > > > configuration, security and quota changes, to controller-only
> > by
> > > > always
> > > > > > > > routing these alterations to the controller.
> > > > > > > >
> > > > > > > > Let me know your thoughts!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Boyang
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
>

Re: [DISCUSS] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

Reply via email to