Hi Colin,

Thanks for the response. I chose raw records because I thought they might be useful for other records that customers may want to add in the future, before the cluster first starts. I do see that it is at best an engineer-friendly interface.

I do think kafka-storage is the correct place to put the logic for adding records to the bootstrap.checkpoint file. Keeping the logic for managing the bootstrap separate from the logic for configuring a cluster that is already running is a good division of functionality, and I think this separation will reduce the parsing logic significantly.

The API suggestion you made for kafka-storage is okay. I would prefer to have one option for an entire SCRAM config, including the user, such as the following:

./bin/kafka-storage.sh format \
  --config [my-config-path] \
  --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
  --release-version 3.5-IV0 \
  --scram-config user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
  --scram-config user=bob 'SCRAM-SHA-256=[password=bobpass]'

Argparse4j supports passing multiple arguments to a single option, including a variable number of arguments per option.

I think adding the Argparse4j support for reading the arguments from a file is a must.

--Proven

On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe <cmcc...@apache.org> wrote:

> Hi Proven,
>
> Thanks for putting this together.
>
> We always intended to have a way to bootstrap into using an all-SCRAM
> cluster, from scratch.
>
> I have two big comments here. First, I think we need a better interface
> than raw records. And second, I'm not sure that kafka-storage.sh is the
> right place to put this.
>
> I think raw records are going to be tough for people to use, because there
> are a lot of fields, and the values to set them to are not intuitive. For
> example, to set SHA512, the user needs to set "mechanism" equal to 2. That
> is going to be impossible to remember or figure out without looking at the
> source code. The other thing of course is that we may add more fields over
> time, including mandatory ones. So any documentation could quickly get out
> of date.
>
> I think people are going to want to specify SCRAM users here the same way
> they do when using the kafka-configs.sh tool. As a reminder, using
> kafka-configs.sh, they specify users like this:
>
> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
>   --entity-type users \
>   --entity-name alice
>
> Of course, in this example, we're not specifying a salt. So we'd have to
> evaluate whether that's what we want for our use-case as well. On the plus
> side, specifying a salt could ensure that the bootstrap files end up
> identical on every node. On the minus side, it is another random number
> that users would need to generate and explicitly pass in.
>
> I would lean towards auto-generating the salt. I don't think the salt
> needs to be the same on all nodes. Only one controller will become active
> and write the bootstrap records to the log; no other controllers will do
> that. Brokers don't need to read the SCRAM records out of the bootstrap
> file.
>
> If we put all the functionality into kafka-storage.sh, it might look
> something like this:
>
> ./bin/kafka-storage.sh format \
>   --config [my-config-path] \
>   --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
>   --release-version 3.5-IV0 \
>   --scram-user alice \
>   --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
>   --scram-user bob \
>   --scram-config 'SCRAM-SHA-256=[password=bobpass]'
>
> (Here I am assuming that each --scram-user must be followed by exactly one
> --scram-config line)
>
> Perhaps it's worth considering whether it would be better to add a mode to
> kafka-configs.sh where it appends to a bootstrap file.
>
> If we do put everything into kafka-storage.sh, we should consider the
> plight of people with low limits on the maximum length of their command
> lines.
> One fix for these people could be allowing them to read their
> arguments from a file like this:
>
> $ ./bin/kafka-storage.sh @myfile
> $ cat myfile:
> ./bin/kafka-storage.sh format \
>   --config [my-config-path] \
>   ...
> [etc, etc.]
>
> Argparse4j supports this natively with fromFilePrefix. See
> https://argparse4j.github.io/usage.html#fromfileprefix
>
> best,
> Colin
>
>
> On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
> > I have written a KIP describing the API additions needed to kafka-storage
> > to store SCRAM credentials at bootstrap time. Please take a look at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
> >
> > --
> > --Proven
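
For illustration only, here is a rough sketch of how the single-option form discussed in this thread (--scram-config user=NAME 'MECHANISM=[...]') might be parsed, with the salt auto-generated as Colin suggests. It is not the kafka-storage implementation. It uses Python's argparse, which Argparse4j is modeled on, as a stand-in for Argparse4j. The function names, the 16-byte salt length, and the default of 4096 iterations are assumptions made for this sketch. It also skips SASLprep normalization of the password and does not handle passwords that contain commas.

```python
import argparse
import hashlib
import os
import re

# Hash function behind each SCRAM mechanism name.
HASHES = {"SCRAM-SHA-256": "sha256", "SCRAM-SHA-512": "sha512"}
DEFAULT_ITERATIONS = 4096  # assumption for this sketch
SALT_BYTES = 16            # assumption for this sketch


def parse_scram_config(user_token, spec):
    """Parse ('user=alice', 'SCRAM-SHA-256=[iterations=8192,password=x]')."""
    key, _, user = user_token.partition("=")
    match = re.fullmatch(r"(SCRAM-SHA-(?:256|512))=\[(.*)\]", spec)
    if key != "user" or not user or match is None:
        raise ValueError(f"bad --scram-config value: {user_token} {spec}")
    mechanism, body = match.groups()
    # Naive split: a password containing ',' would break this.
    params = dict(item.split("=", 1) for item in body.split(","))
    if "password" not in params:
        raise ValueError(f"no password given for user {user}")
    iterations = int(params.get("iterations", DEFAULT_ITERATIONS))
    # Auto-generate the salt instead of asking the user for one.
    salt = os.urandom(SALT_BYTES)
    # SCRAM's Hi() is PBKDF2 with HMAC, output length = digest size.
    salted_password = hashlib.pbkdf2_hmac(
        HASHES[mechanism], params["password"].encode("utf-8"), salt, iterations
    )
    return {
        "user": user,
        "mechanism": mechanism,
        "iterations": iterations,
        "salt": salt,
        "salted_password": salted_password,
    }


def build_parser():
    # fromfile_prefix_chars="@" lets arguments be read from a file
    # (one argument per line), similar in spirit to fromFilePrefix.
    parser = argparse.ArgumentParser(prog="storage-sketch", fromfile_prefix_chars="@")
    # Each --scram-config takes exactly two arguments and may be repeated.
    parser.add_argument(
        "--scram-config", nargs=2, action="append", default=[],
        metavar=("USER", "SPEC"),
    )
    return parser


def credentials_from_argv(argv):
    args = build_parser().parse_args(argv)
    return [parse_scram_config(user, spec) for user, spec in args.scram_config]
```

Because the salt is random, two runs with the same arguments produce different salted passwords. That matches the point made above that the bootstrap files need not be identical on every node.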