The need for a large number of clients running all over the cluster to authenticate with Kafka brokers is very similar to the Hadoop use case of a large number of tasks running across the cluster that need to authenticate to the HDFS NameNode. The delegation token approach therefore seems like a good fit here, as we have seen it work at large scale in HDFS and YARN.
The proposed design is very much in line with the Hadoop approach. A few comments:

1) Why do you guys want to allow an infinite renewable lifetime for a token? HDFS restricts a token to a max lifetime (default 7 days). A token's vulnerability is believed to increase with time.

2) As I understand it, the tokens are stored in ZooKeeper as well, and can be updated there. This is clever, as it allows tokens to be replaced once they reach their max lifetime, and clients can download new tokens from ZooKeeper. It shouldn't be a big load on ZooKeeper, since a client will only need to get a new token once every several days. With this approach you don't need an infinite lifetime on the token, even for long-running clients.

3) The token passwords are generated using a master key. The master key should also be changed periodically. In Hadoop, the default renewal period is 1 day.

Thanks for a thorough proposal, great work!
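
P.S. To make points 1 and 3 concrete, here is a rough sketch of what I have in mind: renewals capped by a hard max lifetime, and token passwords derived from a master key that gets rolled periodically. All names here (`TokenManager`, `roll_master_key`, and so on) are hypothetical, and so is the choice of HMAC-SHA256; this is not the Kafka or Hadoop API, just an illustration with the 1-day and 7-day defaults mentioned above.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds


@dataclass
class Token:
    token_id: bytes     # public identifier carried by the client
    key_id: int         # which master key signed this token
    expiry_time: float  # current expiry; moved forward by renewals
    max_time: float     # hard cap; renewals can never go past this


class TokenManager:
    """Hypothetical broker-side token bookkeeping (illustration only)."""

    def __init__(self, renew_interval=DAY, max_lifetime=7 * DAY):
        self.renew_interval = renew_interval
        self.max_lifetime = max_lifetime
        self.keys = {}           # key_id -> master key bytes
        self.current_key_id = 0
        self.roll_master_key()

    def roll_master_key(self):
        # New tokens use the new key. Old keys are kept so that tokens
        # already issued still verify; a real system would drop an old
        # key once every token signed with it has passed its max_time.
        self.current_key_id += 1
        self.keys[self.current_key_id] = os.urandom(32)

    def issue(self, owner, now):
        token_id = owner.encode() + b":" + os.urandom(8).hex().encode()
        max_time = now + self.max_lifetime
        expiry = min(now + self.renew_interval, max_time)
        return Token(token_id, self.current_key_id, expiry, max_time)

    def password(self, token):
        # Token password = HMAC(master key, token id).
        key = self.keys[token.key_id]
        return hmac.new(key, token.token_id, hashlib.sha256).digest()

    def is_valid(self, token, now):
        return now <= token.expiry_time

    def renew(self, token, now):
        if not self.is_valid(token, now):
            raise ValueError("token expired; client must fetch a new one")
        # Extend by one renew interval, but never beyond the hard cap.
        token.expiry_time = min(now + self.renew_interval, token.max_time)
        return token.expiry_time
```

With this shape, a long-running client keeps renewing until `max_time`, then picks up a replacement token (for example from ZooKeeper, as in point 2) instead of relying on an infinite lifetime.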