Subject: Re: RFC: PostgreSQL Storage I/O Transformation Hooks

Hi Zsolt,

Thank you for your detailed questions. I'll address each point:

1. Bundling WAL and Buffer Manager

WAL and heap pages are simply different representations of the same
underlying data. Protecting only one side would be cryptographically
incomplete; an attacker could bypass encryption by reading the
unprotected side. Therefore, they must be treated as a single atomic
unit of protection.

2. Scope: Temporary Files, System Tables, and Frontend Tools

I intentionally kept the scope focused. Past TDE proposals often stalled
because they tried to solve everything at once and became too large to
review. I prefer a "divide-and-conquer" approach:

- Temporary files: Out of scope for this initial infrastructure
  proposal.
- System tables: While they cannot be encrypted during bootstrap (since
  extensions aren't loaded yet), they can be transformed page-by-page
  during normal operation.
- Frontend tools (pg_waldump, etc.): I am aware of this and have
  modified versions. Currently, there is no standard mechanism for
  frontend hooks, which makes this a broader challenge. For production,
  extensions could ship their own modified frontend tools temporarily.
  Long-term, we may need initdb-time configuration that unifies
  backend/frontend hook behavior and is fixed for the lifetime of the
  cluster.

3. Why Hooks Instead of SMGR

Please see my response to Konstantin in this thread regarding
maintenance debt and the "Separation of Concerns" between storage
management and data transformation.

4. Page Header Flags vs. Fork Files

My primary concern with using fork files for encryption metadata is
crash recovery. If a fork file and the actual data page become
inconsistent (e.g., during a crash), recovery becomes problematic
because fork files are not typically protected by WAL.

Storing the Transform ID in the header flags ensures that the metadata
travels with the page. This is essential for incremental key rotation,
where pages are gradually re-encrypted with newer keys over time.
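
To illustrate the idea (a minimal sketch only: the bit layout, the
MockPageHeader struct, and all function names below are hypothetical and
not taken from the patch), the Transform ID can be read and written with
plain mask operations on the 16-bit flags field, and a rotation pass
only needs to compare it against the ID of the current key:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout. PostgreSQL's pd_flags is 16 bits wide and core
 * uses the low three bits; the mask and shift chosen here are
 * illustrative assumptions, not the values used by the patch.
 */
#define CORE_FLAG_BITS      0x0007u
#define TRANSFORM_ID_SHIFT  12
#define TRANSFORM_ID_MASK   0xF000u
#define TRANSFORM_ID_MAX    0x000Fu   /* 0 means "page not transformed" */

/* Stand-in for the real page header; only the flags field matters here. */
typedef struct MockPageHeader
{
    uint16_t pd_flags;
} MockPageHeader;

/* Extract the Transform ID stored in the upper flag bits. */
static inline uint16_t
page_get_transform_id(const MockPageHeader *hdr)
{
    return (uint16_t) ((hdr->pd_flags & TRANSFORM_ID_MASK) >> TRANSFORM_ID_SHIFT);
}

/* Store a Transform ID without disturbing the other flag bits. */
static inline void
page_set_transform_id(MockPageHeader *hdr, uint16_t id)
{
    assert(id <= TRANSFORM_ID_MAX);
    hdr->pd_flags = (uint16_t) ((hdr->pd_flags & ~TRANSFORM_ID_MASK) |
                                (uint16_t) (id << TRANSFORM_ID_SHIFT));
}

/* A page needs re-encryption if it was written under a different key ID. */
static inline bool
page_needs_rotation(const MockPageHeader *hdr, uint16_t current_id)
{
    return page_get_transform_id(hdr) != current_id;
}
```

Because the ID lives in the page itself, it is covered by whatever
protects the page (WAL, full-page images), which is the property a
separate fork file would not give us for free.
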
The oldest key's pages are force-rotated, allowing continuous key
rotation without service interruption. I plan to propose a separate RFC
for this "gradual rotation" mechanism.

5. Benchmarks and Critical Section Overhead

Transformation happens inside the critical section but before acquiring
the WAL lock. On consumer-grade SSDs, the encryption latency is largely
masked by I/O wait times, so the performance impact is negligible. On
high-performance storage (production SSDs, Apple Silicon, etc.), the
reduced I/O wait exposes the encryption overhead, which is visible but
modest.

Detailed benchmarks require company approval; I will follow up later.

Best regards,
Henson Choi

On Sun, Dec 28, 2025 at 10:12 PM, Zsolt Parragi <[email protected]> wrote:

> Hello!
>
> I am glad to see that there are multiple TDE extension proposals being
> worked on. For context, I am one of the developers working on the
> pg_tde[1] extension, as well as on the extensible SMGR proposal that
> Konstantin already linked.
>
> This patch/proposal contains two distinct parts of
> encryption/extensibility, WAL and buffer manager/table data. Based on
> earlier discussions, the opinions of adding extension points to these
> two are quite different, and because of that I'm not sure if bundling
> them together is helpful.
>
> It also appears to be missing some extension points that would be
> required for a more complete encryption solution, such as encrypting
> temporary files or system tables, or handling command-line utilities
> like pg_waldump. Do you have ideas or patches in mind for those areas
> as well?
>
> I have the same question as Konstantin, why did you choose custom
> hooks for the buffer manager instead of the already existing smgr
> interface / extensibility patch? While that patch is not part of the
> core (but I hope it will be), it is already used by multiple companies
> as it supports other use cases, not only encryption.
> We plan to focus more on that thread early next year, we would
> appreciate any feedback/suggestions that could make it better for
> others.
>
> I also noticed that you added additional flags to the page header.
> Initially we were thinking about something like this, but decided that
> the fork files are better for any encryption (or other storage
> related) extra data. These few bits try to be generic, while also
> restrictive because of the limited amount of data. (and that data is
> specifically per page, if I want something per file or per page range,
> I still need a custom solution)
>
> Regarding the WAL encryption part, we took a completely different
> approach, similar to how we handle normal table data (page-based). I
> will need to think more about this before I can provide meaningful
> feedback on that part of the patch. One initial question, however, is
> whether you have run detailed benchmarks with different workloads.
> That seems to be the trickiest part there, since most of the code runs
> in a critical section. (Not the "unused"/"empty hook" path, but the
> overhead caused by a real encryption plugin using this hook in
> practice)
>
>
> [1]: https://github.com/percona/pg_tde
>
