On Fri, Jul 05, 2019 at 03:38:28PM -0400, Bruce Momjian wrote:
On Sun, Jun 16, 2019 at 03:57:46PM -0400, Stephen Frost wrote:
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Sun, Jun 16, 2019 at 12:42:55PM -0400, Joe Conway wrote:
> > On 6/16/19 9:45 AM, Bruce Momjian wrote:
> > > On Sun, Jun 16, 2019 at 07:07:20AM -0400, Joe Conway wrote:
> > >> In any case it doesn't address my first point, which is limiting the
> > >> volume encrypted with the same key. Another valid reason is you might
> > >> have data at varying sensitivity levels and prefer different keys be
> > >> used for each level.
> > >
> > > That seems quite complex.
> >
> > How? It is no more complex than encrypting at the tablespace level
> > already gives you - in that case you get this property for free if you
> > care to use it.
>
> All keys used to encrypt WAL data must be unlocked at all times or crash
> recovery, PITR, and replication will stop when it hits a locked key.
> Given that, how much value is there in allowing a key per tablespace?

There are a few different things to discuss here, admittedly, but I don't
think it means that there's no value in having a key per tablespace.

Ideally, a given backend would only need, and only have access to, the
keys for the tablespaces that it is allowed to operate on.  I realize
that's a bit farther than what we're talking about today, but hopefully
not too much to be able to consider.

What people really want with more-granular-than-cluster encryption is
the ability to supply their passphrase key _when_ they want to access
their data, and then leave and be sure their data is secure from
decryption.  That will not be possible since the WAL will be encrypted
and any replay of it will need their passphrase key to unlock it, or the
entire system will be unrecoverable.

This is a fundamental issue, and will eventually doom any more granular
encryption approach, unless we want to use the same key for all
encrypted tablespaces, create separate WALs for each tablespace, or say
recovery of some tablespaces will fail.  I doubt any of those will be
acceptable.


I agree this is a pretty crucial challenge, and those requirements seem
in direct conflict. Users use encryption to protect privacy of the data,
but we need access to some of the data to implement some of the
important tasks of an RDBMS.

And it's not just about things like recovery or replication. How do you
do ANALYZE on encrypted data? Sure, if a user runs it in a session that
has the right key, that's fine. But what about autovacuum/autoanalyze?

I suspect the issue here is that we're trying to retrofit a solution for
data-at-rest encryption to something that seems closer to protecting
data during execution.

Which is a worthwhile goal, of course, but perhaps we're trying to use
the wrong tool to achieve it? To paraphrase the hammer/nail saying "If
all you know is block encryption, everything looks like a block."


What if the granular encryption (not the "whole cluster with a single
key") case does not encrypt whole blocks, but just tuple data? Would
that allow at least the most critical WAL use cases (recovery, physical
replication) to work without having to know all the encryption keys?

Of course, that would be much less efficient than plain block
encryption, but that may be the "natural cost" of the feature.

It would not solve e.g. logical replication or ANALYZE, which both
require access to the plaintext data, though.
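
To make the tuple-level idea a bit more concrete, here is a rough toy
sketch in Python. It is not PostgreSQL code and not a proposed design.
All names (WalRecord, Page, backend_insert, replay, read_tuple) are
made up for illustration, and the "cipher" is a deliberately insecure
stand-in built from SHA-256, used only because it needs nothing beyond
the standard library. The only point is that replay can copy encrypted
tuple payloads into pages without ever holding a key, while reading the
plaintext still needs the right one.

```python
import hashlib
from dataclasses import dataclass, field


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; a real
    system would use a vetted cipher. Encrypt and decrypt are the same."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big"))
        stream.extend(block.digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def make_nonce(page_no: int, slot: int) -> bytes:
    # Hypothetical nonce choice: derived from the tuple's location.
    return page_no.to_bytes(4, "big") + slot.to_bytes(4, "big")


@dataclass
class WalRecord:
    page_no: int      # plaintext: replay needs to know where to write
    slot: int         # plaintext: replay needs to know which slot
    payload: bytes    # tuple data, encrypted before it reaches the WAL


@dataclass
class Page:
    # slot -> encrypted payload; the slot directory itself is plaintext
    slots: dict = field(default_factory=dict)


def backend_insert(key: bytes, page_no: int, slot: int,
                   plaintext: bytes) -> WalRecord:
    """A session holding the key encrypts only the tuple data."""
    payload = keystream_xor(key, make_nonce(page_no, slot), plaintext)
    return WalRecord(page_no, slot, payload)


def replay(wal: list, pages: dict) -> dict:
    """Recovery / physical replication: no key involved, payloads are
    treated as opaque bytes and copied into place."""
    for rec in wal:
        pages.setdefault(rec.page_no, Page()).slots[rec.slot] = rec.payload
    return pages


def read_tuple(key: bytes, pages: dict, page_no: int, slot: int) -> bytes:
    """Reading plaintext (a query, ANALYZE, logical decoding) needs the key."""
    payload = pages[page_no].slots[slot]
    return keystream_xor(key, make_nonce(page_no, slot), payload)


if __name__ == "__main__":
    key_a, key_b = b"tablespace-a-key", b"tablespace-b-key"
    wal = [
        backend_insert(key_a, 0, 1, b"alice"),
        backend_insert(key_b, 7, 3, b"bob"),
    ]
    pages = replay(wal, {})                  # works with both keys locked
    print(read_tuple(key_a, pages, 0, 1))    # needs key_a
```

Note that read_tuple is exactly the part that autovacuum/autoanalyze or
logical decoding would have to call, which is why this approach does not
help with those cases.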


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
