No, the Km does not need to be the same across nodes. Each node can
store their own encryption info file created by their own Km. The
streaming process only requires the Kr is shared.
A quick description of the streaming process via an insecure connection:
1. the sender unwrap the wrapped key WKr with their Km, and get the key Kr
2. the sender and the receiver use DH key exchange to establish a shared
secret Ks, so that sender and receiver both know the Ks
3. the sender derives a KEKs from the table info (SSTable gen is not
persisted across nodes) & streaming info (TODO) and the shared secret
Ks, so KEKs = KDF(Table UUID + TBD STREAMING INFO, Ks)
4. the sender wraps the Kr with KEKs to get WKrs = KW(Kr, KEKs)
5. the sender sends WKrs and the (encrypted) SSTable file to the receiver
6. the receiver derives the KEKs in the same way as the sender
7. the receiver unwraps WKrs using the the KEKs and get Kr
8. the receiver wraps the Kr with a KEK' derived from their own Km
This enables zero-copy streaming, and the Kr is never send in plaintext
over an insecure communication channel. An passive observer cannot learn
anything about the Kr. If the streaming is done over TLS, the Kr can be
send over a TLS connection without all the additional work. The SSTable
can be send via insecure connection to enable zero-copy streaming. An
HMAC of the SSTable should also be send over TLS to ensure the SSTable
has not been damaged or modified.
On 16/11/2021 10:45, Stefan Miklosovic wrote:
Ok but this also means that Km would need to be the same for all nodes right?
If we are rolling in node by node fashion, Km is changed at node 1, we
change the wrapped key which is stored on disk and we stream this
table to the other node which is still on the old Km. Would this work?
I think we would need to rotate first before anything is streamed. Or
no?
On Tue, 16 Nov 2021 at 11:17, Bowen Song <bo...@bso.ng.invalid> wrote:
Yes, that's correct. The actual key used to encrypt the SSTable will
stay the same once the SSTable is created. This is a widely used
practice in many encrypt-at-rest applications. One good example is the
LUKS full disk encryption, which also supports multiple keys to unlock
(decrypt) the same data. Multiple unlocking keys is only possible
because the actual key used to encrypt the data is randomly generated
and then stored encrypted by (a key derived from) a user chosen key.
If this approach is adopted, the streaming process can share the Kr
without disclosing the Km, therefore enableling zero-copy streaming.
On 16/11/2021 08:56, Stefan Miklosovic wrote:
Hi Bowen, Very interesting idea indeed. So if I got it right, the very
key for the actual sstable encryption would be always the same, it is
just what is wrapped would differ. So if we rotate, we basically only
change Km hence KEK hence the result of wrapping but there would still
be the original Kr key used.
Jeremiah - I will prepare that branch very soon.
On Tue, 16 Nov 2021 at 01:09, Bowen Song <bo...@bso.ng.invalid> wrote:
The second question is about key rotation. If an operator needs to
roll the key because it was compromised or there is some policy around
that, we should be able to provide some way to rotate it. Our idea is
to write a tool (either a subcommand of nodetool (rewritesstables)
command or a completely standalone one in tools) which would take the
first, original key, the second, new key and dir with sstables as
input and it would literally took the data and it would rewrite it to
the second set of sstables which would be encrypted with the second
key. What do you think about this?
I would rather suggest that “what key encrypted this” be part of the
sstable metadata, and allow there to be multiple keys in the system. This way
you can just add a new “current key” so new sstables use the new key, but
existing sstables would use the old key. An operator could then trigger a
“nodetool upgradesstables —all” to rewrite the existing sstables with the new
“current key”.
There's a much better approach to solve this issue. You can stored a
wrapped key in an encryption info file alone side the SSTable file.
Here's how it works:
1. randomly generate a key Kr
2. encrypt the SSTable file with the key Kr, store the encrypted SSTable
file on disk
3. derive a key encryption key KEK from the SSTable file's information
(e.g.: table UUID + generation) and the user chosen master key Km, so
you have KEK = KDF(UUID+GEN, Km)
4. wrap (encrypt) the key Kr with the KEK, so you have WKr = KW(Kr, KEK)
5. hash the Km, the hash will used as a key ID to identify which master
key was used to encrypt the key Kr if the server has multiple master
keys in use
6. store the the WKr and the hash of Km in a separate file alone side
the SSTable file
In the read path, the Kr should be kept in memory to help improve
performance and this will also allow zero-downtime master key rotation.
During a key rotation:
1. derive the KEK in the same way: KEK = KDF(UUID+GEN, Km)
2. read the WKr from the encryption information file, and unwrap
(decrypt) it using the KEK to get the Kr
3. derive a new KEK' from the new master key Km' in the same way as above
4. wrap (encrypt) the key Kr with KEK' to get WKr' = KW(Kr, KEK')
5. hash the new master key Km', and store it together with the WKr' in
the encryption info file
Since the key rotation only involves rewriting the encryption info file,
the operation should take only a few milliseconds per SSTable file, it
will be much faster than decrypting and then re-encrypting the SSTable data.
On 15/11/2021 18:42, Jeremiah D Jordan wrote:
On Nov 14, 2021, at 3:53 PM, Stefan
Miklosovic<stefan.mikloso...@instaclustr.com> wrote:
Hey,
there are two points we are not completely sure about.
The first one is streaming. If there is a cluster of 5 nodes, each
node has its own unique encryption key. Hence, if a SSTable is stored
on a disk with the key for node 1 and this is streamed to node 2 -
which has a different key - it would not be able to decrypt that. Our
idea is to actually send data over the wire _decrypted_ however it
would be still secure if internode communication is done via TLS. Is
this approach good with you?
So would you fail startup if someone enabled sstable encryption but did not
have TLS for internode communication? Another concern here is making sure zero
copy streaming does not get triggered for this case.
Have you considered having some way to distribute the keys to all nodes such
that you don’t need to decrypt on the sending side? Having to do this will
mean a lot more overhead for the sending side of a streaming operation.
The second question is about key rotation. If an operator needs to
roll the key because it was compromised or there is some policy around
that, we should be able to provide some way to rotate it. Our idea is
to write a tool (either a subcommand of nodetool (rewritesstables)
command or a completely standalone one in tools) which would take the
first, original key, the second, new key and dir with sstables as
input and it would literally took the data and it would rewrite it to
the second set of sstables which would be encrypted with the second
key. What do you think about this?
I would rather suggest that “what key encrypted this” be part of the sstable
metadata, and allow there to be multiple keys in the system. This way you can
just add a new “current key” so new sstables use the new key, but existing
sstables would use the old key. An operator could then trigger a “nodetool
upgradesstables —all” to rewrite the existing sstables with the new “current
key”.
Regards
On Sat, 13 Nov 2021 at 19:35,<sc...@paradoxica.net> wrote:
Same reaction here - great to have traction on this ticket. Shylaja, thanks for
your work on this and to Stefan as well! It would be wonderful to have the
feature complete.
One thing I’d mention is that a lot’s changed about the project’s testing
strategy since the original patch was written. I see that the 2016 version adds
a couple round-trip unit tests with a small amount of static data. It would be
good to see randomized tests fleshed out that exercise more of the read/write
path; or which add variants of existing read/write path tests that enable
encryption.
– Scott
On Nov 13, 2021, at 7:53 AM, Brandon Williams<dri...@gmail.com> wrote:
We already have a ticket and this predated CEPs, and being an
obviously good improvement to have that many have been asking for for
some time now, I don't see the need for a CEP here.
On Sat, Nov 13, 2021 at 5:01 AM Stefan Miklosovic
<stefan.mikloso...@instaclustr.com> wrote:
Hi list,
an engineer from Intel - Shylaja Kokoori (who is watching this list
closely) has retrofitted the original code from CASSANDRA-9633 work in
times of 3.4 to the current trunk with my help here and there, mostly
cosmetic.
I would like to know if there is a general consensus about me going to
create a CEP for this feature or what is your perception on this. I
know we have it a little bit backwards here as we should first discuss
and then code but I am super glad that we have some POC we can
elaborate further on and CEP would just cement and summarise the
approach / other implementation aspects of this feature.
I think that having 9633 merged will fill quite a big operational gap
when it comes to security. There are a lot of enterprises who desire
this feature so much. I can not remember when I last saw a ticket with
50 watchers which was inactive for such a long time.
Regards
---------------------------------------------------------------------
To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail:dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail:dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail:dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail:dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail:dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org