Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

Bruce Momjian Sun, 28 Jul 2019 19:34:19 -0700

On Thu, Jul 25, 2019 at 01:03:06PM -0400, Bruce Momjian wrote:
> On Tue, Jul 16, 2019 at 01:24:54PM +0900, Masahiko Sawada wrote:
> > On Sat, Jul 13, 2019 at 12:33 AM Bruce Momjian <br...@momjian.us> wrote:
> > > then each row change gets its own LSN.  You are asking if an update that
> > > just expires one row and adds it to a new page gets the same LSN.  I
> > > don't know.
> > 
> > The following scripts can reproduce that different two pages have the same 
> > LSN.
> > 
> > =# create table test (a int);
> > CREATE TABLE
> > =# insert into test select generate_series(1, 226);
> > INSERT 0 226
> > =# update test set a = a where a = 1;
> > UPDATE 1
> > =# select lsn from page_header(get_raw_page('test', 0));
> >     lsn
> > -----------
> >  0/1690488
> > (1 row)
> > 
> > =# select lsn from page_header(get_raw_page('test', 1));
> >     lsn
> > -----------
> >  0/1690488
> > (1 row)
> > 
> > So I think it's better to use LSN and page number to create IV. If we
> > modify different tables by single WAL we also would need OID or
> > relfilenode but I don't think currently we have such operations.
> 
> OK, good to know, thanks.


I did some more research on which cases use a single LSN to modify
multiple 8k pages.  The normal program flow is:

        XLogBeginInsert();
        ...
        XLogRegisterBuffer(0, meta, ...
-->     recptr = XLogInsert(RM_BRIN_ID, XLOG_...

        page = BufferGetPage(meta);
        PageSetLSN(page, recptr);

XLogInsert() calls BufferGetTag(), which fills in the buffer's
RelFileNode (which internally is the tablespace, database, and
pg_class.relfilenode).  So, to use the LSN and page-number for the IV,
we need to make sure that there is no other encryption of those values
in a different relation.  What I did was to find cases where
XLogRegisterBuffer/PageSetLSN are called more than once for a single
LSN.  I found cases in:

        brin_doupdate
        brin_doinsert
        brinRevmapDesummarizeRange
        revmap_physical_extend
        GenericXLogFinish
        ginPlaceToPage
        shiftList
        ginDeletePage
        gistXLogSplit
        gistXLogPageDelete
        gistXLogUpdate
        hashbucketcleanup
        _hash_doinsert
        _hash_vacuum_one_page
        _hash_addovflpage
        _hash_freeovflpage
        _hash_squeezebucket
        _hash_init
        _hash_expandtable
        _hash_splitbucket
        log_heap_visible
        log_heap_update
        _bt_insertonpg
        _bt_split
        _bt_newroot
        _bt_getroot
        _bt_mark_page_halfdead
        _bt_unlink_halfdead_page
        addLeafTuple
        moveLeafs
        doPickSplit
        spgSplitNodeAction
        log_newpage_range

Most of these are either updating the different pages in the same
relation (so the page-number for the IV would be different), or are
modifying other types of files, like vm.  (We have not discussed if we
are going to encrypt vm or fsm.  I am guessing we are not.)

You might say, well, is it terrible if we reuse the LSN in a different
relation with the same page number?  Yes.  The way CTR works, it
generates a stream of bits using the key and IV (which will be LSN and
page number).  It then XORs it with the page contents to encrypt it.  If
we encrypt an all-zero gap in a page, or a place in the page where the
page format is known, a user can XOR that with the encrypted data and
get the bit stream at that point.  They can then go to another page that
uses the same key and IV and XOR that to get the decrypted data.  CBC
mode is slightly better because it mixes the user data into the future
16-byte blocks, but lots of our early-byte page format is known, so it
isn't great.

You might say, wow, that is a lot of places to make sure we don't reuse
the LSN in a different relation with the same page number --- let's mix
the relfilenode in the IV so we are sure the IV is not reused.

Well, the pg_class.relfilenode is only unique within the
tablespace/database, i.e., from src/include/storage/relfilenode.h:

 * relNode identifies the specific relation.  relNode corresponds to
 * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
 * to assign new physical files to relations in some situations).
 * Notice that relNode is only unique within a database in a particular
 * tablespace.

So, we would need to mix the tablespace, database oid, and relfilenode
into the IV to be unique.  We would then need to have pg_upgrade
preserve the relfilenode, change CREATE DATABASE to decrypt/encrypt when
creating a new database, and no longer allow files to be moved between
tablespaces without decryption/encryption.

There are just a whole host of complexities we add to encryption if we
add the requirement of preserving the refilenode, tablespace, and
database to decrypt each page.  I just don't think we want go there
unless we have a valid reason.

I am thinking of writing some Assert() code that checks that all buffers
using a single LSN are from the same relation (and therefore different
page numbers).  I would do it by creating a static array, clearing it on
XLogBeginInsert(), adding to it for each  XLogInsert(), then checking on
PageSetLSN() that everything in the array is from the same file.  Does
that make sense?

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

Reply via email to