[ https://issues.apache.org/jira/browse/KAFKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167800#comment-15167800 ]
Parth Brahmbhatt commented on KAFKA-1696:
-----------------------------------------

So here is how that request path would work in my mind:

* The client sends a token-acquisition request to any broker.
* That broker forwards the request to the controller.
* The controller generates the token and pushes it to all brokers. (This will need a new API.)
* The controller responds to the original broker with the token.
* The broker responds to the client with the token.

Renewal is pretty much the same. The race condition you are describing can still happen during renewal in the above case, because the controller may have pushed the renewal information to only a subset of brokers and then died. Depending on which broker a client connects to, it may get an exception or a success. I do agree, though, that since the controller would not have responded with success, the original renew request should be retried, and the scenario can most likely be avoided.

If the above steps seem right, here are the trade-offs of this approach:

Advantages:
* Token generation/renewal will not involve ZooKeeper. I am not too worried about the load this adds to ZooKeeper, but it definitely seems more secure and follows the Hadoop model more closely. On the other hand, ZooKeeper needs to be secure for a lot of other things in Kafka anyway, so I am not sure this should really be a concern.

Disadvantages:
* We will have to add new APIs to support the controller pushing tokens to brokers, on top of the minimal APIs currently proposed. I prefer the publicly available APIs to be minimal and limited to things we expect clients to use; this also adds development complexity. Overall this is more of a philosophical point, so depending on whom you ask it may or may not count as a disadvantage.
* We will also have to add APIs to support the bootstrapping case. What I mean is, when a new broker comes up it will have to get all delegation tokens from the controller, so we will again need a new API like getAllTokens.
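The request path and bootstrap case described above could be sketched roughly as follows. This is a minimal illustrative model, not actual Kafka code: Broker, Controller, acquire_token, generate_token, and get_all_tokens are hypothetical names standing in for the new APIs the comment proposes, and the in-memory token_cache reflects the assumption that tokens are not persisted anywhere.

```python
# Illustrative sketch only: these classes and methods are hypothetical
# stand-ins for the proposed token APIs, not real Kafka interfaces.
import secrets

class Broker:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.token_cache = {}  # token_id -> secret; in memory only, never persisted

class Controller(Broker):
    """The controller generates tokens and pushes them to every broker."""

    def __init__(self, broker_id, peers):
        super().__init__(broker_id)
        self.peers = peers  # all other brokers in the cluster

    def generate_token(self, owner):
        token_id = "%s-%d" % (owner, len(self.token_cache))
        secret = secrets.token_hex(16)
        self.token_cache[token_id] = secret
        # Push to every broker before acknowledging (the new push API). If the
        # controller dies mid-loop, only a subset of brokers knows the token --
        # the race discussed above -- so a failed acquire/renew should be retried.
        for broker in self.peers:
            broker.token_cache[token_id] = secret
        return token_id, secret

    def get_all_tokens(self):
        # Bootstrap path: a newly started broker pulls the full token set
        # (the getAllTokens-style API mentioned above).
        return dict(self.token_cache)

def acquire_token(owner, entry_broker, controller):
    # 1. The client sends the request to any broker (entry_broker).
    # 2. That broker forwards it to the controller, which generates the
    #    token and pushes it cluster-wide.
    token_id, secret = controller.generate_token(owner)
    # 3. The controller responds to entry_broker, which responds to the client.
    return token_id, secret
```

A broker that joins later would populate its cache from controller.get_all_tokens(), which is exactly the extra bootstrap API this approach requires on top of the minimal client-facing ones.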
Again, some of us may see such additional APIs as a disadvantage and some may not.
* In catastrophic failures where all brokers go down, the tokens will be lost even if the servers are restarted, since tokens are not persisted anywhere. Granted, if something like this happens the customer has bigger things to worry about, but not having to regenerate/redistribute tokens is one less thing.

I don't see strong reasons to go one way or the other, so I would still like to go with ZooKeeper, but I don't feel strongly about it. If you think I have mischaracterized what you were proposing, feel free to add more details or list any other advantages/disadvantages.

> Kafka should be able to generate Hadoop delegation tokens
> ---------------------------------------------------------
>
> Key: KAFKA-1696
> URL: https://issues.apache.org/jira/browse/KAFKA-1696
> Project: Kafka
> Issue Type: Sub-task
> Components: security
> Reporter: Jay Kreps
> Assignee: Parth Brahmbhatt
>
> For access from MapReduce/etc jobs run on behalf of a user.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)