On Thu, Aug 15, 2019 at 10:19 AM Bruce Momjian <br...@momjian.us> wrote:
>
> On Wed, Aug 14, 2019 at 04:36:35PM +0200, Antonin Houska wrote:
> > I can work on it right away but don't know where to start.
>
> I think the big open question is whether there will be acceptance of an
> all-cluster encryption feature.  I guess if no one objects, we can move
> forward.
>
I still feel that we need per-table/tablespace keys, although they might
not be part of the first implementation. I think table/tablespace-level
and cluster-level encryption would be almost equally secure, but the
former would have an advantage in terms of operation and performance.

> > First, I think we should use a code repository to integrate [1] and [2]
> > instead of sending diffs back and forth. That would force us to resolve
> > conflicts soon and help to avoid duplicate work. The diffs would be created
> > only when we need to post the next patch version to pgsql-hackers for review,
> > otherwise the discussions of details can take place elsewhere.
>
> Well, we can do that, or just follow the TODO list and apply items as we
> complete them.  We have found that doing everything in one big patch is
> just too hard to review and get accepted.
>
> > The most difficult problem I see now regarding the collaboration is agreement
> > on the key management user interface. The Full-Cluster Encryption feature [1]
> > should not add configuration variables or even tools that the next, more
> > sophisticated version [2] deprecates immediately. Part of the problem is that
>
> Yes, the all-cluster encryption feature has _no_ SQL-level API to
> control it, just a GUC variable that you can use SHOW to see the
> encryption mode.
>
> > [2] puts all (key management related) interaction of postgres with the
> > environment into an external library. As I pointed out in my response to [2],
> > this will not work for frontend applications (e.g. pg_waldump). I think the
> > key management UI for [2] needs to be designed first even if PG 13 should
> > adopt only [1].
>
> I think there are several directions we can go after all-cluster
> encryption, and it does matter because we would want minimal API
> breakage.
> The options are:
>
> 1)  Allow per-table encryption control to limit encryption overhead,
>     though all of WAL still needs to be encrypted;  we could add a
>     per-record encryption flag to WAL records to avoid that.
>
> 2)  Allow user-controlled keys, which are always unlocked, and encrypt
>     WAL with one key
>
> 3)  Encrypt only the user-data portion of pages with user-controlled
>     keys.  FREEZE and crash recovery work since only the user data is
>     encrypted.  WAL is not encrypted, except for the user-data portion
>
> I think once we implement all-cluster encryption, there will be little
> value to #1 unless we find that page encryption is a big performance
> hit, which I think is unlikely based on performance tests so far.
>
> I don't think #2 has much value since the keys have to always be
> unlocked to allow freeze and crash recovery.
>
> I don't think #3 is viable since there is too much information leakage,
> particularly for indexes because the tid is visible.
>
> Now, if someone says they still want 2 & 3, which has happened many
> times, explain how these issues can be reasonably addressed.
>
> I frankly think we will implement all-cluster encryption, and nothing
> else.  I think the next big encryption feature after that will be
> client-side encryption support, which can be done now but is complex;
> it needs to be easier.
>
> > At least it should be clear how [2] will retrieve the master key because [1]
> > should not do it in a different way. (The GUC cluster_passphrase_command
> > mentioned in [3] seems viable, although I think [1] uses approach which is
> > more convenient if the passphrase should be read from console.)

I think we could also provide a way to pass the encryption key directly
to the postmaster rather than using a passphrase. Since users commonly
store keys in a KMS, that would be useful.

> > Rotation of
> > the master key is another thing that both versions of the feature should do in
> > the same way.
> > And of course, the frontend applications need consistent approach
> > too.
>
> I don't see the value of an external library for key storage.

I think the big benefit is that PostgreSQL can work seamlessly with
external services such as a KMS. For instance, during key rotation
PostgreSQL can register a new key with the KMS and start using it, and
it can remove keys that are no longer necessary. That is, it enables
PostgreSQL not only to get keys from the KMS but also to register and
remove them. We can also decrypt the MDEK in the KMS instead of in
PostgreSQL, which is safer. In addition, once someone creates a plugin
library for an external service, individual projects don't need to
create their own.

BTW, I've created a PoC patch for the cluster encryption feature. The
attached patch set completes some items on the TODO list, and some of
them can be used even for finer-granularity encryption. The implemented
components are as follows:

* Initialization stuff (initdb support). initdb has new command line
  options: --enc-cipher and --cluster-passphrase-command. The
  --enc-cipher option accepts either aes-128 or aes-256, while
  --cluster-passphrase-command accepts an arbitrary command. ControlFile
  has an integer indicating cluster encryption support: 'off', 'aes-128'
  or 'aes-256'.
* 3-tier encryption keys. During initdb we create the KEK and MDEK and
  write the metadata file (global/pg_kmgr). When the postmaster starts
  up, it reads the kmgr file, verifies the passphrase using HMAC, unwraps
  the MDEK and derives the TDEK and WDEK from the MDEK. Currently the
  MDEK, TDEK and WDEK are stored in shared memory since this is still a
  PoC, but we could also keep them in process-local memory.
* All cryptographic functions are implemented using OpenSSL. Since HKDF
  and key wrap were introduced in OpenSSL 1.1.0, it requires 1.1.0 or
  higher.
* Buffer encryption. All table and index data except for the vm and fsm
  is transparently encrypted.
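To make the 3-tier key flow above easier to follow, here is a minimal Python sketch. It is not code from the attached patches, which use OpenSSL from C. It assumes the passphrase is stretched with PBKDF2 into a KEK and an HMAC key, that the HMAC covers the wrapped MDEK, and that the TDEK and WDEK are distinguished by the HKDF info labels "TDEK" and "WDEK"; none of these details are confirmed by the patch description. The function names and the record layout are hypothetical, and the AES key wrap step is omitted (the wrapped MDEK is treated as an opaque blob).

```python
import hashlib
import hmac
import os

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): condense input keying material into a PRK."""
    return hmac.new(salt or b"\x00" * HASH_LEN, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch the PRK into `length` bytes bound to `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_subkey(mdek: bytes, label: bytes, length: int) -> bytes:
    """Derive a purpose-specific key (e.g. TDEK or WDEK) from the MDEK."""
    return hkdf_expand(hkdf_extract(b"", mdek), label, length)


def passphrase_keys(passphrase: str, salt: bytes) -> tuple[bytes, bytes]:
    """Assumed scheme: PBKDF2 yields 64 bytes, split into a KEK and an HMAC key."""
    material = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode(), salt, 100_000, dklen=64
    )
    return material[:32], material[32:]


def make_kmgr_record(passphrase: str, wrapped_mdek: bytes) -> dict:
    """Hypothetical stand-in for the kmgr file written at initdb time."""
    salt = os.urandom(16)
    _kek, mac_key = passphrase_keys(passphrase, salt)
    tag = hmac.new(mac_key, wrapped_mdek, HASH).digest()
    return {"salt": salt, "wrapped_mdek": wrapped_mdek, "hmac": tag}


def verify_passphrase(passphrase: str, record: dict) -> bool:
    """Recompute the HMAC at startup; a wrong passphrase gives a different tag."""
    _kek, mac_key = passphrase_keys(passphrase, record["salt"])
    expected = hmac.new(mac_key, record["wrapped_mdek"], HASH).digest()
    return hmac.compare_digest(expected, record["hmac"])


if __name__ == "__main__":
    # The 40-byte blob stands in for an AES-wrapped 32-byte MDEK.
    record = make_kmgr_record("secret password", os.urandom(40))
    print("passphrase ok:", verify_passphrase("secret password", record))

    mdek = os.urandom(32)  # in the real flow this comes from unwrapping
    tdek = derive_subkey(mdek, b"TDEK", 16)  # aes-128 needs a 16-byte key
    wdek = derive_subkey(mdek, b"WDEK", 16)
    print("TDEK and WDEK differ:", tdek != wdek)
```

Using a different info label per purpose means that the table and WAL keys are independent of each other even though both come from the same MDEK, so only the MDEK has to be stored (wrapped) on disk.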
Missing features so far are as follows:

* WAL encryption
* Temporary file encryption
* Command-line tool to change passphrase (KEK key rotation)
* Front-end tool support (pg_waldump, pg_rewind)
* Documentation
* Regression tests

Since some of the above items are already implemented in other patches,
we can use them.

We can create a database cluster with cluster encryption enabled as
follows:

$ initdb -D data --enc-cipher=aes-128 --cluster-passphrase-command='echo "secret password"'
$ pg_controldata | grep encryption
Data encryption cipher: aes-128
$ pg_ctl start

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
0001-Introduce-cryptographic-functions-for-cluster-encryp.patch
0004-Introduce-buffer-encryption-for-cluster-encyrption.patch
0005-Skeleton-Introduce-WAL-encryption-for-cluster-encyrp.patch
0003-Enable-cluster-encryption-during-initdb.patch
0002-Add-key-management-module-for-cluster-encryption.patch