On Thu, 29 Jun 2023 at 13:20, John Morris <john.mor...@crunchydata.com> wrote:
>
> Background
> ==========
>
> PostgreSQL has an amazing variety of routines for accessing files. Consider
> just the “open file” routines:
> PathNameOpenFile, OpenTemporaryFile, BasicOpenFile, open, fopen,
> BufFileCreateFileSet, BufFileOpenFileSet, AllocateFile, OpenTransientFile,
> FileSetCreate, FileSetOpen, mdcreate, mdopen, Smgr_open.
>
> On the downside, “amazing variety” also means the code is somewhat
> confusing and it is difficult to add new features.
> Someday, we’d like to add encryption or compression to the various
> PostgreSQL files.
> To do that, we need to bring all the relevant files into a common file API
> where we can implement the new features.
>
> Goals of Patch
> ==============
>
> 1) Unify file access so most of “the other” files can go through a common
>    interface, allowing new features like checksums, encryption or
>    compression to be added transparently.
> 2) Do it in a way which doesn’t change the logic of current code.
> 3) Convert a reasonable set of callers to use the new interface.
>
> Note the focus is on the “other” files. The buffer cache and the WAL have
> similar needs, but they are being done in a separate project. (Yes, the two
> projects are coordinating.)
>
> Patch 0001. Create a common file API.
> =====================================
>
> Currently, PostgreSQL files feed into three funnels:
> 1) system file descriptors (read/write/open),
> 2) C library buffered files (fread/fwrite/fopen), and
> 3) virtual file descriptors (FileRead/FileWrite/PathNameOpenFile).
> Of these three, virtual file descriptors (VFDs) are the most common. They
> are also the only funnel which is implemented by PostgreSQL.
>
> Decision: Choose VFDs as the common interface.
>
> Problem: VFDs are random access only.
> Solution: Add sequential read/write code on top of VFDs.
> (FileReadSeq, FileWriteSeq, FileSeek, FileTell, O_APPEND)
>
> Problem: VFDs have minimal error handling (based on errno).
> Solution: Add an “ferror” style interface (FileError, FileEof,
> FileErrorCode, FileErrorMsg).
>
> Problem: Must maintain compatibility with existing error handling code.
> Solution: Save and restore errno to minimize changes to existing code.
>
> Patch 0002. Update code to use the common file API
> ==================================================
>
> The second patch alters callers so they use VFDs rather than system or C
> library files.
> It doesn’t modify all callers, but it does capture many of the files which
> need to be encrypted or compressed. This is definitely WIP.
>
> Future (not too far away)
> =========================
>
> Looking ahead, there will be another set of patches which inject buffering
> and encryption into the VFD interface. The future patches will build on the
> current work and introduce new “oflags” to enable encryption and buffering.
>
> Compression is also a possibility, but currently lower priority and a bit
> tricky for random access files.
> Let us know if you have a use case.
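
To make the "sequential access on top of a random-access interface" and "ferror style" ideas quoted above concrete, here is a minimal standalone sketch. It is not the code from the patch: SeqFile and the seq_* functions are hypothetical names, a plain file descriptor with POSIX pread()/pwrite() stands in for a VFD with FileRead/FileWrite, and seq_demo() exists only to exercise the sketch. It shows a tracked position, a sticky error code, an EOF flag, and errno being set again on later calls so that errno-based callers would still see the failure.

```c
#define _XOPEN_SOURCE 700       /* for pread, pwrite, mkstemp */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical type: a plain fd stands in for a VFD. */
typedef struct SeqFile
{
    int   fd;
    off_t offset;       /* position for sequential reads and writes */
    int   error_code;   /* sticky errno of the first failure, 0 if none */
    bool  eof;          /* set once a read returns zero bytes */
} SeqFile;

void
seq_init(SeqFile *f, int fd)
{
    assert(f != NULL);
    f->fd = fd;
    f->offset = 0;
    f->error_code = 0;
    f->eof = false;
}

/* Sequential write layered on a positioned write. */
ssize_t
seq_write(SeqFile *f, const void *buf, size_t len)
{
    ssize_t n;

    if (f->error_code != 0)
    {
        errno = f->error_code;  /* errno-based callers keep working */
        return -1;
    }
    n = pwrite(f->fd, buf, len, f->offset);
    if (n < 0)
        f->error_code = errno;
    else
        f->offset += n;
    return n;
}

/* Sequential read layered on a positioned read. */
ssize_t
seq_read(SeqFile *f, void *buf, size_t len)
{
    ssize_t n;

    if (f->error_code != 0)
    {
        errno = f->error_code;
        return -1;
    }
    n = pread(f->fd, buf, len, f->offset);
    if (n < 0)
        f->error_code = errno;
    else if (n == 0 && len > 0)
        f->eof = true;
    else
        f->offset += n;
    return n;
}

off_t seq_tell(const SeqFile *f) { return f->offset; }
void  seq_seek(SeqFile *f, off_t pos) { f->offset = pos; f->eof = false; }
bool  seq_error(const SeqFile *f) { return f->error_code != 0; }
bool  seq_eof(const SeqFile *f) { return f->eof; }

/* Round trip through a temporary file; returns 0 on success. */
int
seq_demo(void)
{
    char    path[] = "/tmp/seqdemoXXXXXX";
    char    buf[8] = {0};
    SeqFile f;
    bool    ok;
    int     fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file lives until the fd is closed */
    seq_init(&f, fd);
    ok = seq_write(&f, "hello", 5) == 5 && seq_tell(&f) == 5;
    seq_seek(&f, 0);
    ok = ok && seq_read(&f, buf, 7) == 5 && strcmp(buf, "hello") == 0;
    ok = ok && seq_read(&f, buf, 7) == 0 && seq_eof(&f) && !seq_error(&f);
    close(fd);
    return ok ? 0 : -1;
}
```

In this sketch the error state is sticky, so a caller can issue several reads or writes and check seq_error() once at the end, much as one would with ferror() on a stdio stream. Whether the patch behaves the same way is an assumption on my part.
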

CFbot shows a few compilation warnings/errors at [1]:
[15:54:06.825] ../src/backend/storage/file/fd.c:2420:11: warning: unused variable 'save_errno' [-Wunused-variable]
[15:54:06.825] int ret, save_errno;
[15:54:06.825] ^
[15:54:06.825] ../src/backend/storage/file/fd.c:4026:29: error: use of undeclared identifier 'MAXIMUM_VFD'
[15:54:06.825] Assert(file >= 0 && file < MAXIMUM_VFD);
[15:54:06.825] ^
[15:54:06.825] 1 warning and 1 error generated.

[1] - https://cirrus-ci.com/task/6552527404007424

Regards,
Vignesh