Re: [DISCUSS] KIP-152 - Improve diagnostics for SASL authentication failures

Ismael Juma Thu, 04 May 2017 07:02:13 -0700

H Rajini,

Comments inline.


On Thu, May 4, 2017 at 2:29 PM, Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Hi Ismael,
>
> Thank you for reviewing the KIP.
>
> An authenticated client that is not authorized to access a topic is never
> told that the operation was not authorized. This is to prevent the client
> from finding out if the topic exists by sending an unauthorized request. So
> in this case, the client will retry metadata requests with the configured
> backoff until it times out.


This is true if the user does not have Describe permission. If the user has
Describe access and no Read or Write access, then the user is informed that
the operation was not authorized.


> Another important distinction for authorization failures is that the
> connection is not terminated.
>
> For unauthenticated clients, we do want to inform the client that
> authentication failed. The connection is terminated by the broker.
> Especially if the client is using SASL_SSL, we really do want to avoid
> reconnections that result in unnecessary expensive handshakes. So we want
> to return an exception to the user with minimal retries.
>

Agreed.

I was thinking that it may be useful to try more than one broker for the
> case where brokers are being upgraded and some brokers haven't yet seen the
> latest credentials. I suppose I was thinking that at the moment we keep on
> retrying every broker forever in the consumer and suddenly if we stop
> retrying altogether, it could potentially lead to some unforeseen timing
> issues. Hence the suggestion to try every broker once.
>

I see. Retrying forever is a side-effect of auto-topic creation, but it's
something we want to move away from. As mentioned, we actually don't retry
at all if the user has Describe permission.

Broker upgrades could be fixed by ensuring that the latest credentials are
loaded before the broker starts serving requests. More problematic is
dealing with credential updates. This is another distinction when compared
to authorization.

I am not sure if trying different brokers really helps us though. Say, we
fail to authenticate with broker 0 and then we succeed with broker 1. This
helps with metadata requests, but we will be in trouble when we try to
produce or consume to broker 0 (because it's the leader of some
partitions). So maybe we just want to retry with a suitable backoff until a
timeout?

Yes, I agree that blacking out nodes forever isn't a good idea. When we
> throw AuthenticationFailedException for the current operation or if
> authentication to another broker succeeds, we can clear the blackout so
> that any new request from the client can attempt reconnection after the
> reconnect backoff period as they do now.
>

Yes, that would be better if we decide that connecting to different brokers
is worthwhile for the requests that can be sent to any broker.

Ismael

On Thu, May 4, 2017 at 2:29 PM, Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Hi Ismael,
>
> Thank you for reviewing the KIP.
>
> An authenticated client that is not authorized to access a topic is never
> told that the operation was not authorized. This is to prevent the client
> from finding out if the topic exists by sending an unauthorized request. So
> in this case, the client will retry metadata requests with the configured
> backoff until it times out. Another important distinction for authorization
> failures is that the connection is not terminated.
>
> For unauthenticated clients, we do want to inform the client that
> authentication failed. The connection is terminated by the broker.
> Especially if the client is using SASL_SSL, we really do want to avoid
> reconnections that result in unnecessary expensive handshakes. So we want
> to return an exception to the user with minimal retries.
>
> I was thinking that it may be useful to try more than one broker for the
> case where brokers are being upgraded and some brokers haven't yet seen the
> latest credentials. I suppose I was thinking that at the moment we keep on
> retrying every broker forever in the consumer and suddenly if we stop
> retrying altogether, it could potentially lead to some unforeseen timing
> issues. Hence the suggestion to try every broker once.
>
> Yes, I agree that blacking out nodes forever isn't a good idea. When we
> throw AuthenticationFailedException for the current operation or if
> authentication to another broker succeeds, we can clear the blackout so
> that any new request from the client can attempt reconnection after the
> reconnect backoff period as they do now.
>
> Regards,
>
> Rajini
>
> On Thu, May 4, 2017 at 12:51 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Thanks Rajini. This is a good improvement. One question, the proposal
> > states:
> >
> > Producer waitForMetadata and consumer ensureCoordinatorReady will be
> > > updated to throw AuthenticationFailedException if connections to all
> > > available brokers fail authentication.
> >
> >
> > Can you elaborate on the reason why we would treat authentication
> failures
> > differently from authorization failures? It would be good to understand
> > under which scenario it would be beneficial to try all the brokers (it
> > seems that the proposal also suggests blacking out brokers permanently if
> > we fail authentication, so that could also eventually cause issues).
> >
> > Ismael
> >
> >
> > On Thu, May 4, 2017 at 12:37 PM, Rajini Sivaram <rajinisiva...@gmail.com
> >
> > wrote:
> >
> > > Hi all,
> > >
> > > I have created a KIP to improve diagnostics for SASL authentication
> > > failures and reduce retries and blocking when authentication fails:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-152+-+
> > > Improve+diagnostics+for+SASL+authentication+failures
> > >
> > > Comments and suggestions are welcome.
> > >
> > > Thank you...
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> >
>

Re: [DISCUSS] KIP-152 - Improve diagnostics for SASL authentication failures

Reply via email to