Thanks, Joe, for this. I cloned this branch and tried to run ZooKeeper, but I get:
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain

I see the Scala version is still set to 2.8.0:

if [ -z "$SCALA_VERSION" ]; then
  SCALA_VERSION=2.8.0
fi

Then I installed sbt and Scala and followed your instructions for different
Scala versions. I was able to bring ZooKeeper up, but the brokers fail to
start with the error:

Error: Could not find or load main class kafka.Kafka

I think I am doing something wrong. Can you please help me? Our current
production setup is on 2.8.0 and we want to stick with it.

Thanks,
Pramod

On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein <joe.st...@stealth.ly> wrote:

> Hi, I wanted to re-ignite the discussion around Apache Kafka security. This
> is a huge bottleneck (a non-starter in some cases) for a lot of
> organizations (due to regulatory, compliance, and other requirements).
> Below are my suggestions for specific changes in Kafka to accommodate
> security requirements. This comes from what folks are doing "in the wild"
> to work around and implement security with Kafka as it is today, and also
> what I have discovered from organizations about their blockers. It also
> picks up from the wiki (which I should have time to update later in the
> week based on the below and feedback from the thread).
>
> 1) Transport Layer Security (i.e. SSL)
>
> This also includes client authentication in addition to the in-transit
> security layer. This work has been picked up here
> https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> thoughts, comments, feedback, tomatoes, whatever for this patch. It is a
> pickup from the fork of the work first done here
> https://github.com/relango/kafka/tree/kafka_security.
>
> 2) Data encryption at rest.
>
> This is very important and something that can be facilitated within the
> wire protocol. It requires an additional map data structure for the
> "encrypted [data encryption key]".
> With this map (either in your object or in the wire protocol) you can
> store the dynamically generated symmetric key (one per message) and then
> encrypt the data using that dynamically generated key. You then encrypt
> that data encryption key with the public key of each party expected to be
> able to decrypt it and, in turn, the message. Each such encrypted
> symmetric key (which is now the "encrypted [data encryption key]") is
> stored along with the public key it was encrypted with, so a map of
> [publicKey] = encryptedDataEncryptionKey. Other patterns can be
> implemented, but this is a pretty standard digital enveloping [0] pattern
> with only one field added. Other patterns should be able to use that
> field to do their implementation too.
>
> 3) Non-repudiation and long term non-repudiation.
>
> Non-repudiation is proving data hasn't changed. This is often (if not
> always) done with X.509 public certificates (chained to a certificate
> authority).
>
> Long term non-repudiation is what happens when the certificates of the
> certificate authority are expired (or revoked), and everything ever
> signed with that certificate's public key then becomes "no longer
> provable as ever being authentic". That is where RFC 3126 [1] and
> RFC 3161 [2] come in (or WORM drives [hardware], etc.).
>
> For either (or both) of these it is an operation of the encryptor to
> sign/hash the data (with or without a third-party trusted timestamp of
> the signing event), encrypt that with their own private key, and
> distribute the results (before and after encrypting, if required) along
> with their public key. This structure is a bit more complex but feasible;
> it is a map of digital signature formats and the chain of digital
> signature attestations. The map's key is the method (i.e. CRC32, PKCS7
> [3], XmlDigSig [4]), and the value is a list of maps where the key is the
> "purpose" of the signature (what you're attesting to).
> As a sibling field to that list, another field holds "the attester" as
> bytes (e.g. their PKCS12 [5] for the map of PKCS7 signatures).
>
> 4) Authorization
>
> We should have a policy of "404" for data, topics, partitions (etc.) if
> authenticated connections do not have access. In "secure mode" any
> non-authenticated connections should get a "404"-type message on
> everything. Knowing "something is there" is a security risk in many use
> cases, so if you don't have access you don't even see it. Baking "that"
> into Kafka, along with some interface for entitlement (access management)
> systems (pretty standard), is all that I think needs to be done to the
> core project. I want to tackle this item later in the year, after summer,
> once the other three are complete.
>
> I look forward to thoughts on this and to anyone else interested in
> working with us on these items.
>
> [0]
> http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
> [1] http://tools.ietf.org/html/rfc3126
> [2] http://tools.ietf.org/html/rfc3161
> [3]
> http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/pkcs-7-cryptographic-message-syntax-standar.htm
> [4] http://en.wikipedia.org/wiki/XML_Signature
> [5] http://en.wikipedia.org/wiki/PKCS_12
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
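
For anyone following along, the digital-envelope pattern described in (2) can be sketched in a few lines of standard JCA code. This is only an illustrative sketch, not Kafka's wire format or any proposed API; the class name, the `encryptedDeks` map, and the key-id encoding are all made up for the example. A per-message AES data encryption key (DEK) encrypts the payload, and the DEK itself is encrypted under each authorized reader's RSA public key, giving the [publicKey] = encryptedDataEncryptionKey map Joe mentions:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative digital-envelope sketch; names and structure are hypothetical. */
public class EnvelopeSketch {

    /** The one extra field proposed: publicKey -> encrypted data encryption key. */
    static Map<String, byte[]> encryptedDeks = new HashMap<>();

    public static void main(String[] args) throws Exception {
        // Dynamically generated per-message symmetric key (the DEK).
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey dek = kg.generateKey();

        // Encrypt the message payload with the DEK.
        // (ECB is used only to keep the sketch short; a real design would
        // use an authenticated mode with an IV.)
        Cipher aes = Cipher.getInstance("AES/ECB/PKCS5Padding");
        aes.init(Cipher.ENCRYPT_MODE, dek);
        byte[] ciphertext = aes.doFinal("kafka message payload".getBytes("UTF-8"));

        // One recipient key pair; in practice one map entry per authorized reader.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair reader = kpg.generateKeyPair();

        // Encrypt the DEK under the reader's public key; store it keyed by
        // the public key so each reader can find its own entry.
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, reader.getPublic());
        String keyId = Base64.getEncoder().encodeToString(reader.getPublic().getEncoded());
        encryptedDeks.put(keyId, rsa.doFinal(dek.getEncoded()));

        // A reader recovers the DEK with its private key, then decrypts the payload.
        rsa.init(Cipher.DECRYPT_MODE, reader.getPrivate());
        SecretKey recovered = new SecretKeySpec(rsa.doFinal(encryptedDeks.get(keyId)), "AES");
        aes.init(Cipher.DECRYPT_MODE, recovered);
        System.out.println(new String(aes.doFinal(ciphertext), "UTF-8"));
    }
}
```

Since only the map of encrypted DEKs travels with the message, adding or removing a reader means adding or removing one map entry; the payload itself is encrypted once.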