On 3/15/25 11:40, Alexander Korotkov wrote: > On Sun, Jan 5, 2025 at 1:43 AM Filip Janus <fja...@redhat.com> wrote: >> >> I apologize for multiple messages, but I found a small bug in the previous >> version. >> >> -Filip- > > Great, thank you for your work. > > I think the patches could use a pgindent run. > > I don't see a reason why the temp file compression method should be > different from the wal compression methods, which we already have > in-tree. Perhaps it would be nice to have a 0001 patch, which would > abstract the compression methods we now have for wal into a separate > file containing GUC option values and functions for > compress/decompress. Then, 0002 would apply this to temporary file > compression. >
Not sure I understand the design you're proposing ... AFAIK the WAL compression is not compressing the file data directly, it's compressing backup blocks one by one, which then get written to WAL as one piece of a record. So it's dealing with individual blocks, not files, and we already have API to compress blocks (well, it's pretty much the APIs for each compression method). You're proposing abstracting that into a separate file - what would be in that file? How would you abstract this to make it also useful for file compression? I can imagine a function CompressBufffer(method, dst, src, ...) wrapping the various compression methods, unifying the error handling, etc. I can imagine that, but that API is also limiting - e.g. how would that work with stream compression, which seems irrelevant for WAL, but might be very useful for tempfile compression. IIRC this is mostly why we didn't try to do such generic API for pg_dump compression, there's a local pg_dump-specific abstraction. FWIW looking at the patch, I still don't quite understand why it needs to correct the offset like this: + if (!file->compress) + file->curOffset -= (file->nbytes - file->pos); + else + if (nbytesOriginal - file->pos != 0) + /* curOffset must be corrected also if compression is + * enabled, nbytes was changed by compression but we + * have to use the original value of nbytes + */ + file->curOffset-=bytestowrite; It's not something introduced by the compression patch - the first part is what we used to do before. But I find it a bit confusing - isn't it mixing the correction of "logical file position" adjustment we did before, and also the adjustment possibly needed due to compression? In fact, isn't it going to fail if the code gets multiple loops in while (wpos < file->nbytes) { ... } because bytestowrite will be the value from the last loop? I haven't tried, but I guess writing wide tuples (more than 8k) might fail. regards -- Tomas Vondra