Hi community

### Summary

The admin CLI (`pulsar-admin`) and the Java admin client (`PulsarAdmin`) throw an Unauthorized exception in both of the following scenarios:

- The cluster has more than one broker (see Issue 1 below).
- Authentication is enabled for both Pulsar Proxy and Pulsar Broker (see Issue 2 below).

```
bin/pulsar-admin topics stats persistent://public/default/tp1
2023-03-28T07:30:58,453+0000 [main] INFO org.apache.pulsar.client.impl.auth.AuthenticationSasl - JAAS loginContext is: PulsarAdmin.
2023-03-28T07:30:58,583+0000 [main] INFO org.apache.pulsar.common.sasl.JAASCredentialsContainer - successfully logged in.
2023-03-28T07:30:58,587+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh thread started.
2023-03-28T07:30:58,612+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - Client principal is " pulsar-ad...@sn.io".
2023-03-28T07:30:58,613+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - Server principal is "krbtgt/sn...@sn.io".
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT valid starting at: Tue Mar 28 07:30:58 UTC 2023
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT expires: Wed Mar 29 07:30:58 UTC 2023
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh sleeping until: Wed Mar 29 03:12:29 UTC 2023
2023-03-28T07:30:59,861+0000 [main] INFO org.apache.pulsar.client.impl.auth.PulsarSaslClient - Using JAAS/SASL/GSSAPI auth to connect to server Principal broker/pulsar03,
HTTP 401 Unauthorized
Reason: HTTP 401 Unauthorized
```

I want to cherry-pick https://github.com/apache/pulsar/pull/15121 into branch-2.10 to fix it.

### Background

When Kerberos is used for authentication, Pulsar works like this:

- The client initializes a ticket.
- The client sends a request to the broker.
- The broker identifies the client (it can confirm through Kerberos that the ticket is valid).
- The broker sends a token (we call it `sasl_role_token`) to the client. At this point, the session is successfully created.
- From then on, the client is authenticated through the `sasl_role_token` and no longer uses Kerberos.

The `sasl_role_token` is generated by this logic: `Sha512(saslRoleName, ${secret})`. We call the `secret` the `sasl_sign_secret`. In version `2.10.x`, `secret` is a random string initialized when the broker starts.

### Issue 1

Suppose a cluster includes two brokers, and the topic `public/default/tp1` is owned by broker-0. We get an error when we send `pulsar-admin topics stats public/default/tp1` to broker-1. The whole process goes like this:

- The client authenticates successfully and gets a token from broker-1.
- broker-1 tells the client to redirect to broker-0.
- The client's request to broker-0 carries the `sasl_role_token` generated by broker-1.
- broker-0 cannot decode the `sasl_role_token` because its secret differs from broker-1's, so it responds with 401.

### Issue 2

When authentication is enabled for both Pulsar Proxy and Pulsar Broker, the error occurs as follows:

- The client authenticates successfully and gets a token from Pulsar Proxy.
- The proxy forwards the request to the broker.
- The broker cannot decode the `sasl_role_token` because its secret differs from the proxy's, so it responds with 401.

### Solutions

There are two ways to solve this issue.

Solution 1

- The client saves a separate token for each server (e.g. ["broker-0", "broker-1", "pulsar-proxy"]), so that each server only receives the token it issued itself. This fixes Issue 1.
- Proxy and Broker do not enable authentication simultaneously. This fixes Issue 2.

Solution 2

- Make `sasl_sign_secret` configurable. Users can set it to the same value on every server, so that all servers can decode every `sasl_role_token`. PR #15121 does this.

I prefer Solution 2 because it is already in the master branch, so I want to cherry-pick #15121 into branch-2.10.

### Forward Compatibility

PR #15121 adds `sasl_sign_secret` as a new item in the config files. Since it is required, users will get a system error if they do not set it.
To ensure forward compatibility, we can make this variable optional in branch-2.10.

Thanks
Yubiao Feng
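
P.S. For readers who want to see the mismatch from Issue 1 and Issue 2 in isolation, here is a minimal Python sketch. It is not Pulsar's code. It assumes the token is simply the role name followed by a SHA-512 digest over the role name and the secret. The `&s=` separator, the function names, and the fallback in `load_secret` (a random secret when nothing is configured, which is one possible reading of "optional") are all hypothetical and exist only for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical delimiter between the role name and its signature.
SEPARATOR = "&s="


def sign_role(role: str, secret: bytes) -> str:
    """Return a token: the role name plus a SHA-512 digest of role + secret."""
    digest = hashlib.sha512(role.encode("utf-8") + secret).hexdigest()
    return f"{role}{SEPARATOR}{digest}"


def verify_token(token: str, secret: bytes) -> Optional[str]:
    """Return the role if the signature matches this server's secret, else None."""
    role, sep, signature = token.rpartition(SEPARATOR)
    if not sep:
        return None  # no signature part at all
    expected = hashlib.sha512(role.encode("utf-8") + secret).hexdigest()
    return role if hmac.compare_digest(signature, expected) else None


def load_secret(configured: Optional[str]) -> bytes:
    """Use the configured secret if present; otherwise fall back to a random
    per-process secret (the assumed behavior when the setting is optional)."""
    if configured:
        return configured.encode("utf-8")
    return secrets.token_bytes(32)


if __name__ == "__main__":
    # Two servers that each generate their own random secret at startup.
    broker0_secret = load_secret(None)
    broker1_secret = load_secret(None)
    token = sign_role("pulsar-admin", broker1_secret)
    print(verify_token(token, broker1_secret))  # accepted by the issuer
    print(verify_token(token, broker0_secret))  # rejected by the other server

    # Both servers configured with the same secret.
    shared0 = load_secret("same-value-on-every-server")
    shared1 = load_secret("same-value-on-every-server")
    token = sign_role("pulsar-admin", shared1)
    print(verify_token(token, shared0))  # accepted by either server
```

With random per-server secrets, only the issuing server accepts the token, which corresponds to the 401 after a redirect or a proxy hop. With a shared secret, any server accepts it, which is the effect Solution 2 aims for.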