Hi Proven,

Thanks for putting this together.

We always intended to have a way to bootstrap into using an all-SCRAM cluster, 
from scratch.

I have two big comments here. First, I think we need a better interface than 
raw records. And second, I'm not sure that kafka-storage.sh is the right place 
to put this.

I think raw records are going to be tough for people to use, because there are 
a lot of fields, and the values to set them to are not intuitive. For example, 
to set SHA512, the user needs to set "mechanism" equal to 2. That is going to 
be impossible to remember or figure out without looking at the source code. The 
other thing of course is that we may add more fields over time, including 
mandatory ones. So any documentation could quickly get out of date.

I think people are going to want to specify SCRAM users here the same way they 
do when using the kafka-configs.sh tool. As a reminder, using kafka-configs.sh, 
they specify users like this:

./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
  --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
  --entity-type users \
  --entity-name alice

Of course, in this example, we're not specifying a salt. So we'd have to 
evaluate whether that's what we want for our use-case as well. On the plus 
side, specifying a salt could ensure that the bootstrap files end up identical 
on every node. On the minus side, it is another random number that users would 
need to generate and explicitly pass in.

I would lean towards auto-generating the salt. I don't think the salt needs to 
be the same on all nodes. Only one controller will become active and write the 
bootstrap records to the log; no other controllers will do that. Brokers don't 
need to read the SCRAM records out of the bootstrap file.
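To make the auto-generation argument concrete: SCRAM's salted password is just
Hi(password, salt, iterations) from RFC 5802, which is PBKDF2 with HMAC over
the mechanism's hash. So the tool can generate a random salt itself and store
it alongside the derived value; nothing requires the salt to match across
nodes. A rough sketch of what that derivation could look like (the user name
and password here are made up for illustration):

```java
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class ScramSaltSketch {
    public static void main(String[] args) throws Exception {
        // Auto-generate a random salt, as the tool could do internally.
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);

        // SaltedPassword = Hi(password, salt, iterations), which RFC 5802
        // defines as PBKDF2 with HMAC over the mechanism's hash
        // (SHA-256 here, matching SCRAM-SHA-256).
        PBEKeySpec spec = new PBEKeySpec(
            "alicepass".toCharArray(), salt, 8192, 256);
        byte[] saltedPassword = SecretKeyFactory
            .getInstance("PBKDF2WithHmacSHA256")
            .generateSecret(spec)
            .getEncoded();

        System.out.println("salt=" +
            Base64.getEncoder().encodeToString(salt));
        System.out.println("saltedPassword is " +
            saltedPassword.length + " bytes");
    }
}
```

Given the same password, salt, and iteration count, every node would derive
the same salted password anyway, so the only thing a user-supplied salt buys
is byte-identical bootstrap files.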

If we put all the functionality into kafka-storage.sh, it might look something 
like this:

./bin/kafka-storage.sh format \
  --config [my-config-path] \
  --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
  --release-version 3.5-IV0 \
  --scram-user alice \
  --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
  --scram-user bob \
  --scram-config 'SCRAM-SHA-256=[password=bobpass]'

(Here I am assuming that each --scram-user must be followed by exactly one 
--scram-config line.)

Perhaps it's worth considering whether it would be better to add a mode to 
kafka-configs.sh where it appends to a bootstrap file.

If we do put everything into kafka-storage.sh, we should consider the plight of 
people with low limits on the maximum length of their command lines. One fix 
for these people could be allowing them to read their arguments from a file 
like this:

$ ./bin/kafka-storage.sh @myfile
$ cat myfile
format
--config
[my-config-path]
[etc, etc.]

(Note that the file contains only the arguments, one per line, not the 
command name itself.)

Argparse4j supports this natively with fromFilePrefix. See 
https://argparse4j.github.io/usage.html#fromfileprefix

best,
Colin


On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
> I have written a KIP describing the API additions needed to 
> kafka-storage
> to store SCRAM
> credentials at bootstrap time. Please take a look at
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
>
> -- 
> --Proven
