On Thu, 19 May 2022 at 15:54, Robert Newson <rnew...@apache.org> wrote:
>
> Hi,
>
> My proposal is not about backups, encrypted or otherwise, though I can see 
> there's a relationship. Could the built-in encryption of my proposal also be 
> suitable for protecting a backup of these files? Yes, I think so. Given key 
> rotation we would expect to eventually have backups that need a wrapping key 
> that is no longer the current one, hence the need we both perceive for 
> multiple key slots. We differ only in that I pictured filling in the empty 
> slots some time after file creation, and merely as a way to avoid a lock-step 
> rotation.

Yes. My reason for pulling in backups is that, on top of practical
process matters, backups should have archival requirements that
forbid(?) us from tampering with them as snapshots. If we go into every
shard during a backup and rekey it, or just zero a risky wrapping key's
slot, have we verifiably backed up those shards, or have we made some
novel artifact and backed that up instead? That relates a bit to the
question of embedded slots. I agree it would be a shame not to use
them, but they imply that we couldn't have the node use detached
wrappings for shard keys in /etc or /var while the backups build
detached wrappings for archival keys that work the same way alongside
a pure snapshot of the data volume. So I think the shards themselves
are unavoidably a hook that ties key management, and its security
risks, to backups.

>
> You wondered if encryption should be optional. That's a good topic. In my 
> view it's a "yes". Encryption is optional, admins should be able to configure 
> encryption for any subset of databases, including none and all databases. It 
> should be possible to configure CouchDB so that it unencrypts your databases 
> (via compaction). It would also be useful if the wrapping key could vary 
> between databases (it doesn't appear to be useful to go more granular than 
> that). So perhaps it is DatabaseName in the callback functions and not 
> WrappingKeyId.
>
> I agree that we'll need the ability to have multiple key slots. I hadn't 
> considered that we'd fill more than one slot at couch_file creation time but 
> I don't see why not. We can delegate that to the key manager;
>
> -callback new_key(DatabaseName :: binary()) ->
>     {ok, [WrappedKey :: binary()], UnwrappedKey :: binary()} |
>     {error, Reason :: term()}.
>

Yes, I think it makes a lot of sense to refer to keys by database and
allow this kind of policy granularity. I could imagine scenarios like
a site wanting to get complex with per-user-database key management,
and it would be very neat if they had all the necessary pieces should
they want to write their own key manager for that.
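
As a rough sketch of what such a site-written key manager might look
like, assuming the callback shapes Robert proposed end up roughly as
written: the module name, the demo secrets, and the 60-byte slot layout
(12-byte IV, 16-byte tag, 32-byte wrapped key) are all made up for
illustration, and a real manager would fetch wrapping keys from a KMS
or HSM rather than derive them from constants. It fills two slots at
creation time and skips any slot it cannot open.

```erlang
%% Hypothetical key manager sketch; not an existing CouchDB module.
-module(example_key_manager).
-export([new_key/1, unwrap_key/2, slot_size/0]).

-define(CIPHER, aes_256_gcm).

%% Assumed slot layout: IV (12) ++ GCM tag (16) ++ wrapped 256-bit key (32).
slot_size() -> 60.

%% Generate a fresh shard key and wrap it once per wrapping key, so the
%% file starts life with a "current" and an "archival" slot filled.
new_key(DbName) ->
    ShardKey = crypto:strong_rand_bytes(32),
    Wrapped = [wrap(WK, DbName, ShardKey) || WK <- wrapping_keys(DbName)],
    {ok, Wrapped, ShardKey}.

%% Try each recorded slot in turn; empty or retired slots simply fail
%% authentication and are skipped.
unwrap_key(_DbName, []) ->
    {error, no_matching_wrapping_key};
unwrap_key(DbName, [Slot | Rest]) ->
    case try_unwrap(wrapping_keys(DbName), DbName, Slot) of
        {ok, Key} -> {ok, Key};
        error -> unwrap_key(DbName, Rest)
    end.

%% Demo only: per-database wrapping keys derived from fixed secrets.
wrapping_keys(DbName) ->
    [crypto:hash(sha256, [Secret, DbName])
     || Secret <- [<<"current-demo-secret">>, <<"archival-demo-secret">>]].

%% The database name is bound in as associated data, so a slot copied
%% into another database's file will not unwrap.
wrap(WrappingKey, DbName, ShardKey) ->
    IV = crypto:strong_rand_bytes(12),
    {CipherText, Tag} = crypto:crypto_one_time_aead(
        ?CIPHER, WrappingKey, IV, ShardKey, DbName, true),
    <<IV/binary, Tag/binary, CipherText/binary>>.

try_unwrap([], _DbName, _Slot) ->
    error;
try_unwrap([WK | Rest], DbName,
           <<IV:12/binary, Tag:16/binary, CipherText/binary>> = Slot) ->
    case crypto:crypto_one_time_aead(
             ?CIPHER, WK, IV, CipherText, DbName, Tag, false) of
        error -> try_unwrap(Rest, DbName, Slot);
        ShardKey -> {ok, ShardKey}
    end;
try_unwrap(_WrappingKeys, _DbName, _MalformedSlot) ->
    error.
```

Since the manager sees the database name, a per-user-database policy
would only need to change wrapping_keys/1; couch_file would still just
record whatever list of slots it is handed.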


> The key manager might send back a list of one item or several, and couch_file 
> is simply obliged to record them at the start of the file. We would maybe 
> also want to ensure there are empty slots available, so there might need to 
> be a callback on the lines of;
>
> -callback slot_size() -> pos_integer().
>
> So we can know how much space to leave at the start of the file for empty 
> slots.
>
> The unwrap callback in this scheme would be essentially your revised proposal;
>
> -callback unwrap_key(DatabaseName :: binary(), [WrappedKey :: binary()]) ->
>     {ok, UnwrappedKey :: binary()} | {error, Reason :: term()}.
>
> I am wary of adding any code path in couchdb where we write anywhere but the 
> end of the file, so the actual process of filling in a preallocated empty 
> slot will need more thought. The atomicity of disk writes in theory and 
> practice come into play and will likely force some decisions. For example we 
> might be obliged to round up to the nearest 4 KiB (or disk sector size of the 
> storage device if we can retrieve that; though it's probably 4 KiB).
>
> Another option is to store the wrapped keys in the db headers but this 
> presents a few difficulties. couch_file itself has no idea what is in the 
> headers, only that they are 4 KiB-aligned and have the magic bit set at the 
> start that indicates it has found a real header. So there's a layering issue 
> there, but I think we can solve that. The other issue, though, is that the 
> header itself could not be encrypted. I have a strong preference for 
> encrypting every byte of the file.

Yes, I agree. While overall integrity of backups, volumes, etc. is not
really a feature of the encryption, I also feel that once there is
encryption, attackers have more interest in going beyond the
previously achievable goal of taking the cleartext data, toward
manipulating the data or layout to try to get something out of a
crash, etc. Layers that stay within the encrypted portion offer no
predictable way to do anything like that.

Will
