Hi Will,

Thanks for your post and for spending significant time thinking about my proposal.

An important aspect of my proposed patch has, I think, been overlooked or under-examined. There is no essential need for the wrapping keys to ever be present in Erlang memory or, indeed, ever to leave some more secure enclave on some remote host or service. In my PR I have used the local couchdb config in order to demonstrate the functionality but I would not consider that a production mechanism, at least not without some significant refinement.

You seem to imply that storing wrapped keys in the shard files is a security concern and I'd like to more clearly understand that concern, as I do not share it. The encrypted files can only be decrypted with the right encryption key, and the wrapped key at the start of the file can only be unwrapped by another key. Guessing either of these keys is equally infeasible.

The motivation behind the (possible) use of multiple key slots is to allow an administrator to change the wrapping key in a safer manner. The start of the file would be preallocated with multiple slots, only one of which would be filled at the file's creation (using the current wrapping key). At any moment the administrator can specify a new wrapping key, and we would then wrap the existing key (which we'd need to unwrap with the 'old' key) with that new wrapping key and store it in a spare slot. If there were any kind of crash (a power failure, say), the old wrapped key is still there. We could, instead, choose to overwrite the single wrapped key with its new value and use the original trick of writing a 4 KiB value (a disk sector) with two copies of the same data, and try to exploit atomicity at the disk/disk controller level. I'm not a huge fan of that approach.

Any version of this work, before it can be merged, must allow for all keys (or passwords/phrases) to be replaced by an administrator without data loss. I have no strong opinion on how we achieve this yet, only that we must.

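To make the slot-based rotation concrete, here is a minimal sketch in Python (not Erlang, and not the code in the PR). Every name in it (ToyKeyService, FileHeader, SLOT_COUNT, rotate, retire) is hypothetical, and the "wrap" is a toy HMAC-based construction standing in for a real key-wrap algorithm, so it should be read as an illustration of the control flow only. It assumes the wrapping keys live solely inside a separate key service, and that clearing an old slot is a distinct, later step from filling a spare one.

```python
import hashlib
import hmac
import os

SLOT_COUNT = 4  # hypothetical; the number of preallocated slots is undecided
KEY_LEN = 32    # data key length in bytes (one SHA-256 digest of pad)


class ToyKeyService:
    """Stand-in for an HSM or remote key service (hypothetical).

    Wrapping keys never leave this object. The wrap below is a toy
    pad-and-tag scheme for illustration, NOT a real key-wrap algorithm.
    """

    def __init__(self):
        self._wrapping_keys = {}

    def create_key(self, key_id):
        self._wrapping_keys[key_id] = os.urandom(KEY_LEN)

    def destroy_key(self, key_id):
        self._wrapping_keys.pop(key_id, None)

    def _pad(self, kek, nonce):
        return hmac.new(kek, b"pad" + nonce, hashlib.sha256).digest()

    def _tag(self, kek, nonce, body):
        return hmac.new(kek, b"tag" + nonce + body, hashlib.sha256).digest()

    def wrap(self, key_id, data_key):
        assert len(data_key) == KEY_LEN
        kek = self._wrapping_keys[key_id]
        nonce = os.urandom(16)
        body = bytes(a ^ b for a, b in zip(data_key, self._pad(kek, nonce)))
        return nonce + body + self._tag(kek, nonce, body)

    def unwrap(self, key_id, blob):
        kek = self._wrapping_keys[key_id]  # KeyError once the key is destroyed
        nonce, body, tag = blob[:16], blob[16:16 + KEY_LEN], blob[16 + KEY_LEN:]
        if not hmac.compare_digest(tag, self._tag(kek, nonce, body)):
            raise ValueError("wrapped key failed integrity check")
        return bytes(a ^ b for a, b in zip(body, self._pad(kek, nonce)))


class FileHeader:
    """Models only the key slots at the start of a file."""

    def __init__(self, service, key_id):
        # Only one slot is filled at creation, using the current wrapping key.
        self.slots = [None] * SLOT_COUNT
        self.slots[0] = (key_id, service.wrap(key_id, os.urandom(KEY_LEN)))

    def open(self, service):
        # Try each filled slot until the service agrees to unwrap one.
        for slot in self.slots:
            if slot is None:
                continue
            key_id, blob = slot
            try:
                return service.unwrap(key_id, blob)
            except (KeyError, ValueError):
                continue
        raise PermissionError("no slot could be unwrapped")

    def rotate(self, service, old_id, new_id):
        # Unwrap with the old key, rewrap with the new one into a spare slot.
        # The old slot is left untouched, so a crash here loses nothing.
        old_blob = dict(s for s in self.slots if s is not None)[old_id]
        data_key = service.unwrap(old_id, old_blob)
        spare = self.slots.index(None)  # ValueError if every slot is full
        self.slots[spare] = (new_id, service.wrap(new_id, data_key))

    def retire(self, key_id):
        # Assumed separate step: clear slots that were wrapped with key_id.
        self.slots = [None if s is not None and s[0] == key_id else s
                      for s in self.slots]
```

In this sketch, calling rotate(service, "old", "new") leaves both slots filled, so the file can still be opened after destroy_key("old"); once every wrapping key has been destroyed or revoked, open raises PermissionError, although any data key already handed out remains in the caller's memory.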
You said:

> I don't think there is ever a point in combining this with a HSM/cryptoki/etc
> hardware keystore

I have the opposite feeling. The protections I'm proposing to add to .couch and .view files benefit most when the wrapping keys are generated, used, and exclusively stored within an HSM. There is no need for CouchDB to ever see them. All CouchDB needs to be able to do is to request, at any time, that the wrapped key, read from the start of the relevant file, is unwrapped. Permission to perform that unwrapping could be revoked at any time, and the wrapping key itself could be forensically destroyed. While files that are currently _open_ would still be readable (by CouchDB and anyone able to introspect the Erlang VM or the host memory), no new file could be opened.

On asymmetric keys, I don't understand your proposal well enough to usefully respond to it. There doesn't seem to be any way to apply them to this problem, where CouchDB must perform both encryption and decryption of the same data. Asymmetric encryption would make sense if these duties were split (if, say, a party were encrypting a .couch file to send to another person, they could encrypt it with the public key of the recipient, who could use their private key to decrypt it). I would be grateful if you could explain how asymmetric encryption could be used here in a way that doesn't require every party that holds the public key to also hold the corresponding private key (and vice versa).

Your point on future expansion or new formats is well made. I anticipate that a little with the encryption header that I write at the very start of the file, before the wrapped key or the id of the wrapping key. We could use other values to indicate other formats.

> On 17 May 2022, at 13:30, Will Young <lostnetwork...@gmail.com> wrote:
>
> Hi Robert,
>
> I've taken some time to think over your PR and writeup, and have the
> following comments:
>
> benefits of the PR
> I like this idea of native encryption a lot.
> While lower layers can
> offer encryption, I think there are a lot more situations where the
> lower layer has been delegated through cloud hosting, etc, and one is
> not really sure it is providing the expected capabilities without some
> unexpected caveat. I think native encryption should be very
> appropriate in a situation where the main system volume can be small
> and protected carefully but data volumes need to be cheap, large, easy
> to backup.
>
> Expunging uncertainty and manual shared key management
> I like systems like the regularly recycling of the per-shard key
> trying to somewhat limit something like momentary full system read
> access at one moment from inherently being able to snoop through old
> data that could have been expunged and all future data (after rekeys
> etc). I can understand why performance/design-wise the per-shard key
> is best wrapped and stored in the shard itself, but I find it a bit
> unfortunate for directly trusting data is expunged with low trust in
> data volumes to not be snapshotting, accidentally sharing backups,
> caching raw blocks, etc. Naturally, the current PR leaves choices open
> on managing the wrapping keys so access to the production db's active
> keys and backups doesn't have to always mean the abilitity to snoop
> through all past history, etc, but a site can trying to manage between
> the risks of data loss or insufficient key rotation has to consider a
> fairly complex set of constraints manually.
>
> Completeness of shared key design
> For symmetric encryption, I feel like the design for the wrapping is
> as complete as could be. I struggled to think of a reason for
> providing multiple wrapping slots in the shard header to give more
> than one wrapping key access to the current shard key but I don't see
> a lot of utility to multiple active shared wrapping keys.
> Multiple
> shared key slots might function as safety net for the backup process
> for the system to always be writing the current key and a future key
> and then progress and generate the new future key once one is sure the
> current key is not just on a filesystem that may fail, etc, but
> handling that constraint could also be left to a site to design at
> their own risk.
>
> Limits of shared keys
> A. While hardware acceleration could in theory be used, I don't
> think there is ever a point in combining this with a HSM/cryptoki/etc
> hardware keystore. The relevant wrapping keys are always going to be
> loaded into erlang's memory in production.
> B. There are manual rekeying choices to try to manage the different
> risks and changing pupose of a key as it ages, which may be difficult
> for most sites to get right.
>
> Benefits of adding asymmetric keys
> I think an additional asymmetric keying slot in the shard and
> corresponding encrypt/decrypt references could allow a number of nicer
> scenerios, for example:
> 1. not having a private key in production that can read backups, but
> encrypting everything for the current symmetric key and a backups
> asymmetric key. This would allow getting down to very little access to
> older symmetric keys in production and avoid risk of loss through loss
> of shorter lived symmetric keys by using the offline backup key as the
> insurance.
> 2. having a handle to a hardware (HSM) key as the private key and
> encrypting the shard keys to the publickey of this present token. This
> removes the ability to break in and steal the wrapping keys for later
> use. The attacker could ask the hardware token to decrypt each current
> shard key or any backups they already have access to but cant squirel
> away keys to combine with future access to backups, etc.
>
> Asymmetric keys left as a future capability
> I don't think there's any reason why the current design could not
> proceed and asymmetric keys be handled as a future addition.
> The only
> things I see to consider in easing future compatibility with
> asymmetric keys are:
> A. It would only be helpful if the data format for encrypted shards is
> flexible enough to not need a newer encrypted shard format later.
> B. Looking at the configuration, I don't think anything with the
> shared key configuration would need to change if there were an
> addition an asymmetric slot later and asymmetric keys were added in
> new sections. For example, only an asymmetric key section similar to
> "encryption_keys" would need support for indicating
> crypto:engine_key_ref() contents and similar.
>
> Kind Regards,
> Will