I'm starting a new thread to discuss a topic that is related to Transparent Data Encryption (TDE), but could be useful even without it. The problem has been addressed in some form in the Cybertec TDE fork, and I can post the code here if it helps. However, after reading [1] (and the posts upthread), I've got another idea, so let's try to discuss that first.

It makes sense to me to implement the buffering first (i.e. writing/reading a certain amount of data at a time) and to make the related functions aware of encryption later: as long as we use a block cipher, we need to read/write suitably sized chunks anyway, rather than individual bytes or arbitrary amounts of data. (In theory, someone might need encryption but reject buffering, but I'm not sure this is a realistic use case.)

For the buffering, I imagine a "file stream" object that the user creates on top of a file descriptor, such as

    FileStream *FileStreamCreate(File file, int buffer_size)

or

    FileStream *FileStreamCreateFD(int fd, int buffer_size)

and uses functions like

    int FileStreamWrite(FileStream *stream, char *buffer, int amount)

and

    int FileStreamRead(FileStream *stream, char *buffer, int amount)

to write and read data respectively.

Besides functions to close the streams explicitly (e.g. FileStreamClose() / FileStreamFDClose()), we'd need to ensure that a stream is closed automatically wherever the underlying file is. For example, if OpenTemporaryFile() was used to obtain the file descriptor, the user expects the file to be closed and deleted at the transaction boundary, so the corresponding stream should be freed automatically as well.

To avoid code duplication, buffile.c should use these streams internally, as it also performs buffering. (Here we'd also need functions to change the reading/writing position.)

Once we implement the encryption, we might need to add an argument to the FileStreamCreate...() functions that helps to generate a unique IV, but the ...Read() / ...Write() functions would stay intact. And possibly one more argument to specify the kind of cipher, in case we support more than one.

I think that's enough to start the discussion. Thanks in advance for any feedback.

[1] https://www.postgresql.org/message-id/CA%2BTgmoYGjN_f%3DFCErX49bzjhNG%2BGoctY%2Ba%2BXhNRWCVvDY8U74w%40mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
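
To make the buffering part of the proposal above more concrete, here is a minimal sketch of the FD-based variant. It is only an illustration under stated assumptions, not the Cybertec code and not a proposed patch: it uses plain POSIX read()/write() instead of fd.c, the struct layout and FileStreamFlush() are hypothetical, FileStreamWrite() takes a const pointer (the prototype above does not), and a stream is assumed to be used either for writing or for reading, never both interleaved. Seeking, automatic cleanup at transaction boundaries, EINTR handling and encryption are all left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout; the proposal names only the functions. */
typedef struct FileStream
{
    int   fd;           /* underlying descriptor, owned by the caller */
    int   buffer_size;  /* capacity of buf */
    int   used;         /* valid bytes in buf */
    int   pos;          /* next byte to hand out when reading */
    int   dirty;        /* nonzero if buf holds data not yet written */
    char *buf;
} FileStream;

FileStream *FileStreamCreateFD(int fd, int buffer_size)
{
    assert(buffer_size > 0);
    FileStream *s = malloc(sizeof(FileStream));
    if (s == NULL)
        return NULL;
    s->buf = malloc((size_t) buffer_size);
    if (s->buf == NULL)
    {
        free(s);
        return NULL;
    }
    s->fd = fd;
    s->buffer_size = buffer_size;
    s->used = s->pos = s->dirty = 0;
    return s;
}

/* Assumed helper: push pending bytes to the file. Returns 0 or -1. */
int FileStreamFlush(FileStream *s)
{
    int off = 0;

    while (s->dirty && off < s->used)
    {
        ssize_t n = write(s->fd, s->buf + off, (size_t) (s->used - off));
        if (n < 0)
            return -1;          /* sketch: no retry, no state recovery */
        off += (int) n;
    }
    if (s->dirty)
        s->used = s->dirty = 0;
    return 0;
}

/* Copy into the buffer; the file is touched only when the buffer fills. */
int FileStreamWrite(FileStream *s, const char *buffer, int amount)
{
    int done = 0;

    while (done < amount)
    {
        int room = s->buffer_size - s->used;
        int n = (amount - done < room) ? amount - done : room;

        memcpy(s->buf + s->used, buffer + done, (size_t) n);
        s->used += n;
        s->dirty = 1;
        done += n;
        if (s->used == s->buffer_size && FileStreamFlush(s) < 0)
            return -1;
    }
    return done;
}

/* Serve from the buffer, refilling one chunk at a time; short count at EOF. */
int FileStreamRead(FileStream *s, char *buffer, int amount)
{
    int done = 0;

    while (done < amount)
    {
        if (s->pos == s->used)
        {
            ssize_t got = read(s->fd, s->buf, (size_t) s->buffer_size);
            if (got < 0)
                return -1;
            if (got == 0)
                break;          /* end of file */
            s->used = (int) got;
            s->pos = 0;
        }
        int avail = s->used - s->pos;
        int n = (amount - done < avail) ? amount - done : avail;

        memcpy(buffer + done, s->buf + s->pos, (size_t) n);
        s->pos += n;
        done += n;
    }
    return done;
}

/* Flush and free the stream; the descriptor itself stays open. */
int FileStreamFDClose(FileStream *s)
{
    int rc = FileStreamFlush(s);

    free(s->buf);
    free(s);
    return rc;
}
```

In this sketch the file is only ever accessed in chunks of at most buffer_size bytes, which is the property a block cipher would rely on later. Choosing buffer_size as a multiple of the cipher block size would presumably let an encryption step slot into the flush and refill paths without changing the Read/Write callers, but that part is speculation here.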