On Fri, Oct 4, 2019 at 09:18:58AM -0400, Robert Haas wrote:
> I think everyone would agree that if you have no information about a
> database other than the contents of pg_clog, that's not a meaningful
> information leak. You would be able to tell which transactions
> committed and which transactions aborted, but since you know nothing
> about the data inside those transactions, it's of no use to you.
> However, in that situation, you probably wouldn't be attacking the
> database in the first place. Most likely you have some knowledge about
> what it contains. Maybe there's a stream of sensor data that flows
> into the database, and you can see that stream. By watching pg_clog,
> you can see when a particular bit of data is rejected. That could be
> valuable.

It is certainly true that seeing activity in _any_ cluster file could
leak information. However, even if we encrypted all the cluster files,
bad actors could still get information by analyzing the file sizes and
size changes of relation files and the speed of WAL creation, and even
by monitoring WAL for write activity (WAL file byte changes). I would
think that would leak more information than clog. I am not sure how you
could secure against that information leak. While file system
encryption might do that at the storage layer, it doesn't do anything at
the mounted file system layer.

The current approach is to encrypt anything that contains user data,
which includes heap, index, and WAL files. I think replication slots
and logical replication might also fall into that category, which is why
I started this thread.

I can see some saying that all cluster files should be encrypted, and I
can respect that argument. However, as outlined in the diagram linked
to from the blog entry:

	https://momjian.us/main/blogs/pgblog/2019.html#September_27_2019

I feel that TDE, since it has limited value and can't really avoid all
information leakage, should strive to find the intersection of ease of
implementation, security, and compliance. If people don't think that
limited file encryption is secure, I get it. However, encrypting most
or all files I think would lead us into such a "difficult to implement"
scope that I would no longer be able to work on this feature. I think
the code complexity, fragility, potential unreliability, and even
overhead of trying to encrypt most/all files would lead TDE to be
greatly delayed or never implemented. I just couldn't recommend it.

Now, I might be totally wrong, and encryption of everything might be
just fine, but I have to pick my projects, and such an undertaking seems
far too risky for me.

Just for some detail, we have solved the block-level encryption problem
by using CTR mode in most cases, but there is still a requirement for a
nonce for every encryption operation. You can use derived keys too, but
you need to set up those keys for every write to encrypted files. Maybe
it is possible to set up a write API that handles this transparently in
the code, but I don't know how to do that cleanly, and I doubt that
encrypting everything is worth it.

As far as encrypting the log file, I can see us adding documentation to
warn about that, and even issuing a server log message if encryption is
enabled and syslog is not being used. (I don't know how to test if
syslog is being shipped to a remote server.)

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
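
To illustrate the nonce requirement mentioned above, here is a minimal
sketch, not code from any PostgreSQL patch. It assumes a toy CTR
keystream built from HMAC-SHA256 as a stand-in for AES-CTR, purely so
the example runs with the Python standard library. The nonce layout
(file id, block number, LSN) and the names page_nonce, encrypt_page,
and keystream are hypothetical. The point it shows holds for any CTR
construction: if the same key and nonce are used for two writes, the
XOR of the two ciphertexts equals the XOR of the two plaintexts.

```python
import hashlib
import hmac
import struct


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR keystream: HMAC-SHA256(key, nonce || counter).

    Stand-in for AES-CTR (assumption for illustration only).
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + struct.pack(">Q", counter),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def page_nonce(relfile_id: int, block_no: int, lsn: int) -> bytes:
    """Hypothetical nonce layout: 4-byte file id, 4-byte block, 8-byte LSN."""
    return struct.pack(">IIQ", relfile_id, block_no, lsn)


def encrypt_page(key: bytes, nonce: bytes, page: bytes) -> bytes:
    """XOR the page with the keystream; output length equals input length."""
    return xor_bytes(page, keystream(key, nonce, len(page)))


# CTR encryption and decryption are the same XOR operation.
decrypt_page = encrypt_page

key = b"\x01" * 32
old_page = b"balance=100;" * 4
new_page = b"balance=999;" * 4

# Same nonce for both writes: the keystream cancels out.
reused = page_nonce(16384, 0, 1)
c_old = encrypt_page(key, reused, old_page)
c_new = encrypt_page(key, reused, new_page)
print("reuse leaks plaintext XOR:",
      xor_bytes(c_old, c_new) == xor_bytes(old_page, new_page))

# Nonce changes with each write (here via a different LSN value).
fresh = page_nonce(16384, 0, 2)
c_fresh = encrypt_page(key, fresh, new_page)
print("fresh nonce leaks plaintext XOR:",
      xor_bytes(c_old, c_fresh) == xor_bytes(old_page, new_page))
```

The sketch also shows why every write path needs to know how to build
its nonce (or derived key), which is the setup cost described above for
files beyond heap, index, and WAL.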