Hi Colin,

So I think we are converging: we put the adding of records in kafka-storage, we allow salt and iterations to be optional, and now we are getting to how to parse the individual arguments.

My previous approach was to keep '--' as the delimiter for primary kafka-storage arguments, but to allow each '--scram-config' to take multiple space-separated sub-arguments, i.e. key=value pairs:

  --scram-config user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]'

It could also look like the following, with sub-parsing of arguments where entity-name requires an additional arg and config requires one too. This looks more like kafka-configs:

  --scram-config entity-name alice config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]'

If we really want to keep the parsing similar to kafka-configs, we could use '--alter' as a primary argument like --config and --cluster-id, with the other args as secondary arguments to --alter. The following now looks almost exactly like kafka-configs; we would just ignore --entity-type, as it must be of type user:

  --alter --add-config 'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]' \
    --entity-type users --entity-name alice
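For illustration, the first form might be wired up with argparse4j roughly like this. This is only a sketch of the idea, not code from the KIP; the class name, help text, and the println handling are placeholders:

  import net.sourceforge.argparse4j.ArgumentParsers;
  import net.sourceforge.argparse4j.impl.Arguments;
  import net.sourceforge.argparse4j.inf.ArgumentParser;
  import net.sourceforge.argparse4j.inf.Namespace;
  import java.util.List;

  public class ScramArgsSketch {
      public static void main(String[] args) throws Exception {
          ArgumentParser parser = ArgumentParsers.newFor("kafka-storage").build();
          // nargs("+") lets one --scram-config take several space-separated
          // key=value tokens; Arguments.append() lets the flag repeat, with
          // each occurrence collected as its own list.
          parser.addArgument("--scram-config")
                .nargs("+")
                .action(Arguments.append())
                .help("user=NAME plus MECHANISM=[key=value,...] pairs");
          Namespace ns = parser.parseArgs(args);
          List<List<String>> configs = ns.getList("scram_config");
          for (List<String> config : configs) {
              for (String token : config) {
                  String[] kv = token.split("=", 2);  // key, value
                  System.out.println(kv[0] + " -> " + kv[1]);
              }
          }
      }
  }

With the first example above this would yield one list containing two tokens, user -> alice and SCRAM-SHA-256 -> [iterations=8192,password=alicepass]; the bracketed string itself would still need its own small parser, as kafka-configs has today.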
Thoughts?

--Proven

On Tue, Jan 24, 2023 at 12:22 AM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Jan 20, 2023, at 13:02, Proven Provenzano wrote:
> > Hi Colin,
> > Thanks for the response.
> >
> > I chose raw records, thinking it might be useful for future additions of records that customers might want to add before the first start of the cluster. I do see that it is at best an engineer-friendly interface.
> >
> > I do think kafka-storage is the correct place to put the logic for adding records to the bootstrap.checkpoint file. I think keeping the logic for managing the bootstrap separate from the logic of configuring an existing cluster that is already running is a good division of functionality, and I think this separation will reduce the parsing logic significantly.
>
> Hi Proven,
>
> I suspect having the separation will increase the parsing logic you have to write rather than reducing it, since it will be more difficult to reuse the logic that is already in kafka-configs.sh.
>
> I do agree that having multiple commands modifying the bootstrap may be confusing, though. So maybe it's best to keep it all in the formatting tool, as you say.
>
> > The API suggestion you made for kafka-storage is okay. I would prefer to have one option for an entire SCRAM config including the user, such as the following:
>
> Hmm... I don't think having a single giant string for all the SCRAM configurations makes sense. At minimum each SCRAM user should get its own string. Another reason for doing it this way is that this is how kafka-configs.sh works, and we don't want two different incompatible ways of specifying SCRAM users on the command line.
>
> > I think adding the Argparse4j support for reading the arguments from a file is a must.
>
> Yeah, agreed.
>
> best,
> Colin
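(As an aside on the argument-file point just above: argparse4j's fromFilePrefix hook is configured when the parser is built. A minimal sketch, with option names and the class name chosen only for illustration:)

  import net.sourceforge.argparse4j.ArgumentParsers;
  import net.sourceforge.argparse4j.inf.ArgumentParser;
  import net.sourceforge.argparse4j.inf.Namespace;

  public class FromFileSketch {
      public static void main(String[] args) throws Exception {
          ArgumentParser parser = ArgumentParsers.newFor("kafka-storage")
                  .fromFilePrefix("@")  // "@myfile" expands to the args read from myfile
                  .build();
          parser.addArgument("--cluster-id");
          parser.addArgument("--release-version");
          // e.g. args = { "@myfile" }, where myfile lists one argument per line
          Namespace ns = parser.parseArgs(args);
          System.out.println(ns);
      }
  }

(Note that argparse4j treats each line of the file as a single argument, so the file format would differ slightly from the shell-style myfile sketched later in the thread.)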
>
> > --Proven
> >
> > On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> >> Hi Proven,
> >>
> >> Thanks for putting this together.
> >>
> >> We always intended to have a way to bootstrap into using an all-SCRAM cluster, from scratch.
> >>
> >> I have two big comments here. First, I think we need a better interface than raw records. And second, I'm not sure that kafka-storage.sh is the right place to put this.
> >>
> >> I think raw records are going to be tough for people to use, because there are a lot of fields, and the values to set them to are not intuitive. For example, to set SHA512, the user needs to set "mechanism" equal to 2. That is going to be impossible to remember or figure out without looking at the source code. The other thing, of course, is that we may add more fields over time, including mandatory ones. So any documentation could quickly get out of date.
> >>
> >> I think people are going to want to specify SCRAM users here the same way they do when using the kafka-configs.sh tool. As a reminder, using kafka-configs.sh, they specify users like this:
> >>
> >> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
> >>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
> >>   --entity-type users \
> >>   --entity-name alice
> >>
> >> Of course, in this example, we're not specifying a salt. So we'd have to evaluate whether that's what we want for our use-case as well. On the plus side, specifying a salt could ensure that the bootstrap files end up identical on every node. On the minus side, it is another random number that users would need to generate and explicitly pass in.
> >>
> >> I would lean towards auto-generating the salt. I don't think the salt needs to be the same on all nodes. Only one controller will become active and write the bootstrap records to the log; no other controllers will do that. Brokers don't need to read the SCRAM records out of the bootstrap file.
> >>
> >> If we put all the functionality into kafka-storage.sh, it might look something like this:
> >>
> >> ./bin/kafka-storage.sh format \
> >>   --config [my-config-path] \
> >>   --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
> >>   --release-version 3.5-IV0 \
> >>   --scram-user alice \
> >>   --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
> >>   --scram-user bob \
> >>   --scram-config 'SCRAM-SHA-256=[password=bobpass]'
> >>
> >> (Here I am assuming that each --scram-user must be followed by exactly one --scram-config line.)
> >>
> >> Perhaps it's worth considering whether it would be better to add a mode to kafka-configs.sh where it appends to a bootstrap file.
> >>
> >> If we do put everything into kafka-storage.sh, we should consider the plight of people with low limits on the maximum length of their command lines. One fix for these people could be allowing them to read their arguments from a file like this:
> >>
> >> $ ./bin/kafka-storage.sh @myfile
> >> $ cat myfile:
> >> ./bin/kafka-storage.sh format \
> >>   --config [my-config-path] \
> >>   ...
> >> [etc, etc.]
> >>
> >> Argparse4j supports this natively with fromFilePrefix. See https://argparse4j.github.io/usage.html#fromfileprefix
> >>
> >> best,
> >> Colin
> >>
> >> On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
> >> > I have written a KIP describing the API additions needed to kafka-storage to store SCRAM credentials at bootstrap time. Please take a look at
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
> >> >
> >> > --
> >> > --Proven
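(Following up on Colin's suggestion above to auto-generate the salt: a bare-JDK sketch of what that could look like. SCRAM-SHA-256's Hi() function is PBKDF2 with HMAC-SHA-256; the class name, salt size, and iteration count here are illustrative and this is not Kafka's actual ScramFormatter code:)

  import java.security.SecureRandom;
  import javax.crypto.SecretKeyFactory;
  import javax.crypto.spec.PBEKeySpec;

  public class SaltSketch {
      public static void main(String[] args) throws Exception {
          // Auto-generate a random salt; per Colin's point, it need not be
          // identical across nodes, since only the active controller writes
          // the bootstrap records to the log.
          byte[] salt = new byte[32];
          new SecureRandom().nextBytes(salt);

          int iterations = 8192;  // taken from the command line in the examples above
          PBEKeySpec spec = new PBEKeySpec("alicepass".toCharArray(), salt, iterations, 256);
          byte[] saltedPassword = SecretKeyFactory
                  .getInstance("PBKDF2WithHmacSHA256")
                  .generateSecret(spec)
                  .getEncoded();
          // saltedPassword would then feed the usual client-key / server-key /
          // stored-key derivation for the SCRAM credentials record.
          System.out.printf("salt=%d bytes, saltedPassword=%d bytes%n",
                  salt.length, saltedPassword.length);
      }
  }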