Gwen,

There is a product called ElasticSearch which has been quite successful. They recently added security, and what they did is quite nice: they cleanly separated authentication from authorization, two concerns that people often confuse and mix up. I looked through their design and was quite impressed; I think there is a lot we can borrow from it. Here is a link: http://www.elastic.co/guide/en/shield/current/architecture.html. The product is called "Shield" and is implemented as an ElasticSearch plugin. The promise is that you take a running ElasticSearch service, install the plugin, configure it, and your ElasticSearch service is secured.

The goal for Kafka should be exactly the same: you have a Kafka service running, you install a new plugin (in this case a security plugin), configure it, and your Kafka service is secured.

I think the key is that we should introduce a true pluggable framework in Kafka, one that allows security, quotas, encryption, compression, and serialization/deserialization to all be developed as plugins that can easily be added to and configured on a running Kafka service, at which point the functions/features those plugins provide start working. Once that framework is in place, how a security plugin works internally becomes solely the concern of that plugin. For example, how a new user gets registered and how permissions are granted or revoked are all that plugin's concern; the rest of the Kafka components should not be concerned with them at all. This way we truly follow the design principle of separation of concerns.

With all that, what I am proposing is the introduction of a true pluggable framework into Kafka, which I have also talked about in a previous email. For security we can implement a simple file-based security plugin first; other authentication plugins such as LDAP and AD can come later, and authorization plugins such as RBAC can also come later if people care about using them.
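To make the idea concrete, here is a rough Java sketch of what such a plugin contract might look like. None of these types or methods exist in Kafka today; all the names are placeholders for discussion only:

    // Hypothetical sketch -- none of these types or methods exist in Kafka;
    // the names are placeholders for the proposed plugin contract.
    import java.security.Principal;
    import java.util.Map;

    // Generic contract the broker knows about: plugins are discovered from
    // config, initialized at startup, and torn down at shutdown.
    interface KafkaPlugin {
        // Called once at broker startup with this plugin's own config section.
        void configure(Map<String, ?> configs);

        // Called at broker shutdown so the plugin can release resources.
        void close();
    }

    enum Operation { READ, WRITE, CREATE, DELETE, DESCRIBE }

    // A security plugin is just one specialization of that contract.
    // Everything security-specific (user registration, grants, revocations)
    // stays inside the plugin -- separation of concerns.
    interface SecurityPlugin extends KafkaPlugin {
        // Authentication: establish who the caller is
        // (file-based at first; LDAP or AD backends later).
        Principal authenticate(String user, char[] credentials);

        // Authorization: decide whether that principal may perform the
        // operation on the resource (simple ACLs first; RBAC later).
        boolean authorize(Principal principal, Operation op, String resource);
    }

The broker would then only ever see the generic KafkaPlugin interface; quota, encryption, or compression plugins would be further specializations of the same contract.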
Thanks.

Tong Li
OpenStack & Kafka Community Development
Building 501/B205
liton...@us.ibm.com

From: Gwen Shapira <gshap...@cloudera.com>
To: "dev@kafka.apache.org" <dev@kafka.apache.org>
Date: 04/16/2015 12:44 PM
Subject: [DISCUSSION] KIP-11: ACL Management

Hi Kafka Authorization Fans,

I'm starting a new thread on a specific sub-topic of KIP-11, since this is a bit long :)

Currently KIP-11, as I understand it, proposes:
* Authorizers are pluggable, with Kafka providing DefaultAuthorizer.
* Kafka tools allow adding / managing ACLs.
* Those ACLs are stored in ZK and cached in a new TopicCache.
* Authorizers can either use the ACLs defined and stored in Kafka, or define and use their own.

I am concerned about two possible issues with this design:
1. Separation of concerns - only authorizers should worry about ACLs, and therefore the less ACL code that exists in Kafka core, the better.
2. User confusion - it sounded like ACLs can be defined in Kafka itself while authorizers can also define their own, so "kafka-topics --describe" may show an ACL different from the one actually in use. This can be super confusing for admins.

My alternative suggestion:
* The Authorizer API will include:
    grantPrivilege(List<Principals>, List<Privilege>)
    revokePrivilege(List<Principals>, List<Privilege>)
    getPrivilegesByPrincipal(Principal, Resource)
    ...
  (The exact API can be discussed in detail, but you get the idea; a rough sketch follows at the end of this message.)
* Kafka tools will simply invoke these APIs when topics are added / modified / described.
* Each authorizer (including the default one) will be responsible for storing, caching, and using those ACLs.

This way, we keep almost all ACL code with the authorizer, where it belongs, and users get a nice unified interface that reflects what is actually being used in the system.

This is pretty much how Sqoop and Hive implement their authorization APIs.

What do you think?

Gwen
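To make the proposal concrete, here is a rough Java sketch of what such an Authorizer interface might look like. The method names follow the list above; Privilege and Resource are placeholder types, since KIP-11 had not pinned those down at this point in the discussion:

    // Rough sketch of the proposed Authorizer API. Method names follow the
    // email above; Privilege and Resource are placeholders, not settled APIs.
    import java.security.Principal;
    import java.util.List;

    interface Resource { String name(); }

    interface Privilege { String action(); Resource resource(); }

    interface Authorizer {
        // Invoked by Kafka tools when topics are added / modified; the
        // authorizer alone decides how to store and cache the grants.
        void grantPrivilege(List<Principal> principals, List<Privilege> privileges);
        void revokePrivilege(List<Principal> principals, List<Privilege> privileges);

        // Used when topics are described, so that what is shown always
        // matches what the authorizer actually enforces.
        List<Privilege> getPrivilegesByPrincipal(Principal principal, Resource resource);
    }

The point of routing everything through these methods is that "kafka-topics --describe" can only ever show the ACLs the authorizer itself reports, so the displayed and enforced ACLs cannot diverge.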