No, at-rest encryption is definitely important. When you start talking about data that is used for financial reporting, restricting access to it (both modification and visibility) is a critical component.
-Todd

On 6/5/14, 2:01 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

>Hey Joe,
>
>I don't really understand the sections you added to the wiki. Can you
>clarify them?
>
>Is non-repudiation what SASL would call integrity checks? If so, don't SSL
>and many of the SASL schemes already support this as well as on-the-wire
>encryption?
>
>Or are you proposing an on-disk encryption scheme? Is this actually
>needed? Isn't on-the-wire encryption, when combined with mutual
>authentication and permissions, sufficient for most uses?
>
>On-disk encryption seems unnecessary because if an attacker can get root
>on the Kafka boxes, he or she can potentially modify Kafka to do anything
>he or she wants with the data. So this seems to break any security model.
>
>I understand the problem of a large organization not really having a
>trusted network and wanting to secure data transfer and limit and audit
>data access. The uses for these other things I don't totally understand.
>
>Also it would be worth understanding the state of other messaging and
>storage systems (Hadoop, dbs, etc). What features do they support? I think
>there is a sense in which you don't have to run faster than the bear, but
>only faster than your friends. :-)
>
>-Jay
>
>
>On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>
>> I like the idea of working on the spec and prioritizing. I will update
>> the wiki.
>>
>> - Joestein
>>
>>
>> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>
>> > Hey Joe,
>> >
>> > Thanks for kicking this discussion off! I totally agree that for
>> > something that acts as a central message broker, security is a
>> > critical feature. I think a number of people have been interested in
>> > this topic and several people have put effort into special purpose
>> > security efforts.
>> >
>> > Since most of the LinkedIn folks are working on the consumer right now
>> > I think this would be a great project for any other interested people
>> > to take on.
>> > There are some challenges in doing these things distributed but it can
>> > also be a lot of fun.
>> >
>> > I think a good first step would be to get a written plan we can all
>> > agree on for how things should work. Then we can break things down
>> > into chunks that can be done independently while still aiming at a
>> > good end state.
>> >
>> > I had tried to write up some notes that summarized at least the
>> > thoughts I had had on security:
>> > https://cwiki.apache.org/confluence/display/KAFKA/Security
>> >
>> > What do you think of that?
>> >
>> > One assumption I had (which may be incorrect) is that although we want
>> > all the things in your list, the two most pressing would be
>> > authentication and authorization, and that was all that write-up
>> > covered. You have more experience in this domain, so I wonder how you
>> > would prioritize?
>> >
>> > Those notes are really sketchy, so I think the first goal I would have
>> > would be to get to a real spec we can all agree on and discuss. A lot
>> > of the security stuff has a high human interaction element and needs
>> > to work in pretty different domains and different companies, so
>> > getting this kind of review is important.
>> >
>> > -Jay
>> >
>> >
>> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein <joe.st...@stealth.ly>
>> > wrote:
>> >
>> > > Hi, I wanted to re-ignite the discussion around Apache Kafka
>> > > Security. This is a huge bottleneck (non-starter in some cases) for
>> > > a lot of organizations (due to regulatory, compliance and other
>> > > requirements). Below are my suggestions for specific changes in
>> > > Kafka to accommodate security requirements. This comes from what
>> > > folks are doing "in the wild" to work around and implement security
>> > > with Kafka as it is today and also what I have discovered from
>> > > organizations about their blockers.
>> > > It also picks up from the wiki (which I should have time to update
>> > > later in the week based on the below and feedback from the thread).
>> > >
>> > > 1) Transport Layer Security (i.e. SSL)
>> > >
>> > > This also includes client authentication in addition to the
>> > > in-transit security layer. This work has been picked up here
>> > > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate
>> > > any thoughts, comments, feedback, tomatoes, whatever for this patch.
>> > > It is a pickup from the fork of the work first done here
>> > > https://github.com/relango/kafka/tree/kafka_security.
>> > >
>> > > 2) Data encryption at rest.
>> > >
>> > > This is very important and something that can be facilitated within
>> > > the wire protocol. It requires an additional map data structure for
>> > > the "encrypted [data encryption key]". With this map (either in your
>> > > object or in the wire protocol) you can store the dynamically
>> > > generated symmetric key (for each message) and then encrypt the data
>> > > using that dynamically generated key. You then encrypt the
>> > > encryption key using the public key of each party that is expected
>> > > to be able to decrypt the encryption key and then decrypt the
>> > > message. Each public-key-encrypted symmetric key (which is now the
>> > > "encrypted [data encryption key]") is stored along with the public
>> > > key it was encrypted with (so a map of [publicKey] =
>> > > encryptedDataEncryptionKey) as a chain. Other patterns can be
>> > > implemented but this is a pretty standard digital enveloping [0]
>> > > pattern with only 1 field added. Other patterns should be able to
>> > > use that field to do their implementation too.
>> > >
>> > > 3) Non-repudiation and long term non-repudiation.
>> > >
>> > > Non-repudiation is proving data hasn't changed.
>> > > This is often (if not always) done with x509 public certificates
>> > > (chained to a certificate authority).
>> > >
>> > > Long term non-repudiation is what happens when the certificates of
>> > > the certificate authority are expired (or revoked) and everything
>> > > ever signed (ever) with that certificate's public key then becomes
>> > > "no longer provable as ever being authentic". That is where RFC3126
>> > > [1] and RFC3161 [2] come in (or worm drives [hardware], etc).
>> > >
>> > > For either (or both) of these it is an operation of the encryptor to
>> > > sign/hash the data (with or without a third party trusted timestamp
>> > > of the signing event) and encrypt that with their own private key
>> > > and distribute the results (before and after encrypting if required)
>> > > along with their public key. This structure is a bit more complex
>> > > but feasible: it is a map of digital signature formats and the chain
>> > > of dig sig attestations. The map's key is the method (i.e. CRC32,
>> > > PKCS7 [3], XmlDigSig [4]) and its value is a list of maps where the
>> > > key is the "purpose" of the signature (what you're attesting to). As
>> > > a sibling field to the list there is another field for "the
>> > > attester" as bytes (e.g. their PKCS12 [5] for the map of PKCS7
>> > > signatures).
>> > >
>> > > 4) Authorization
>> > >
>> > > We should have a policy of "404" for data, topics, partitions (etc)
>> > > if authenticated connections do not have access. In "secure mode"
>> > > any non-authenticated connections should get a "404" type message on
>> > > everything. Knowing "something is there" is a security risk in many
>> > > use cases. So if you don't have access you don't even see it. Baking
>> > > "that" into Kafka along with some interface for entitlement (access
>> > > management) systems (pretty standard) is all that I think needs to
>> > > be done to the core project.
>> > > I want to tackle this item later in the year, after summer, once the
>> > > other three are complete.
>> > >
>> > > I look forward to thoughts on this and anyone else interested in
>> > > working with us on these items.
>> > >
>> > > [0] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
>> > > [1] http://tools.ietf.org/html/rfc3126
>> > > [2] http://tools.ietf.org/html/rfc3161
>> > > [3] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/pkcs-7-cryptographic-message-syntax-standar.htm
>> > > [4] http://en.wikipedia.org/wiki/XML_Signature
>> > > [5] http://en.wikipedia.org/wiki/PKCS_12
>> > >
>> > > /*******************************************
>> > >  Joe Stein
>> > >  Founder, Principal Consultant
>> > >  Big Data Open Source Security LLC
>> > >  http://www.stealth.ly
>> > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>> > > ********************************************/
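
For anyone following along, the digital-envelope idea in item 2 of Joe's original proposal can be sketched roughly as below. This is only an illustration under stated assumptions, not anything that exists in Kafka: the class and method names (EnvelopeSketch, Envelope, seal, open, keyId) are hypothetical, the choice of AES-GCM for the per-message key and RSA-OAEP for wrapping it is mine rather than the proposal's, and keying the map by the Base64 of the encoded public key is just one way to realize "[publicKey] = encryptedDataEncryptionKey".

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a digital envelope; none of these names are Kafka APIs.
final class EnvelopeSketch {
    // Assumed algorithm choices (not specified in the proposal).
    static final String WRAP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    static final String DATA = "AES/GCM/NoPadding";

    /** Encrypted payload plus the single extra field: publicKey -> encrypted data key. */
    static final class Envelope {
        final byte[] iv;
        final byte[] ciphertext;
        final Map<String, byte[]> encryptedKeys;

        Envelope(byte[] iv, byte[] ciphertext, Map<String, byte[]> encryptedKeys) {
            this.iv = iv;
            this.ciphertext = ciphertext;
            this.encryptedKeys = encryptedKeys;
        }
    }

    /** Map key for a recipient: Base64 of the encoded public key (an assumption). */
    static String keyId(PublicKey key) {
        return Base64.getEncoder().encodeToString(key.getEncoded());
    }

    static KeyPair newRecipient() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts with a fresh per-message key, then wraps that key once per recipient. */
    static Envelope seal(byte[] plaintext, List<PublicKey> recipients) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey dataKey = keyGen.generateKey();

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance(DATA);
            aes.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(plaintext);

            Map<String, byte[]> encryptedKeys = new LinkedHashMap<>();
            for (PublicKey recipient : recipients) {
                Cipher rsa = Cipher.getInstance(WRAP);
                rsa.init(Cipher.ENCRYPT_MODE, recipient);
                encryptedKeys.put(keyId(recipient), rsa.doFinal(dataKey.getEncoded()));
            }
            return new Envelope(iv, ciphertext, encryptedKeys);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Looks up this recipient's wrapped key, unwraps it, and decrypts the payload. */
    static byte[] open(Envelope envelope, KeyPair recipient) {
        byte[] wrapped = envelope.encryptedKeys.get(keyId(recipient.getPublic()));
        if (wrapped == null) {
            throw new IllegalArgumentException("no encrypted data key for this recipient");
        }
        try {
            Cipher rsa = Cipher.getInstance(WRAP);
            rsa.init(Cipher.DECRYPT_MODE, recipient.getPrivate());
            SecretKey dataKey = new SecretKeySpec(rsa.doFinal(wrapped), "AES");

            Cipher aes = Cipher.getInstance(DATA);
            aes.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair consumerA = newRecipient();
        KeyPair consumerB = newRecipient();
        byte[] message = "hello kafka".getBytes(StandardCharsets.UTF_8);
        Envelope envelope = seal(message, Arrays.asList(consumerA.getPublic(), consumerB.getPublic()));
        System.out.println(new String(open(envelope, consumerB), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is that the broker only ever stores the ciphertext and the map of wrapped keys, so adding a reader means adding one map entry rather than re-encrypting the payload. Key distribution, rotation, and how the map would actually be carried in the message format are all left open here.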