Hi Thomas, Thank you for your quick response.
On Thu, Dec 17, 2020 at 3:05 PM Thomas Munro <thomas.mu...@gmail.com> wrote: > Hi Andy, > > On Thu, Dec 17, 2020 at 7:29 PM Andy Fan <zhihui.fan1...@gmail.com> wrote: > > I spent one day studying the patch and I want to talk about one question > for now. > > What is the purpose of calling smgrimmedsync to evict a DIRTY sr (what > will happen > > if we remove it and the SR_SYNCING and SR_JUST_DIRTIED flags)? > > The underlying theory of the patch is that after fsync() succeeds, the > new size is permanent, but before that it could change at any time > because of asynchronous activities in the kernel*. Therefore, if you > want to evict the size from shared memory, you have to fsync() the > file first. I used smgrimmedsync() to do that. > > *I suspect that it can really only shrink with network file systems > such as NFS and GlusterFS, where the kernel has to take liberties to > avoid network round trips for every file extension, but I don't know > the details on all 9 of our supported OSes and standards aren't very > helpful for understanding buffered I/O. > Let me try to understand your point. Suppose process 1 extends a file to 2 blocks from 1 block, and fsync is not called, then a). the lseek *may* still return 1 based on the comments in the ReadBuffer_common ("because of buggy Linux kernels that sometimes return an seek SEEK_END result that doesn't account for a recent write"). b). When this patch is introduced, we would get 2 blocks for sure if we can get the size from cache. c). After user reads the 2 blocks from cache and then the cache is evicted, and user reads the blocks again, it is possible to get 1 again. Do we need to consider case a) in this patch? and Is case c). the situation you want to avoid and do we have to introduce fsync to archive this? Basically I think case a) should not happen by practice so we don't need to handle case c) and finally we don't need the fsync/SR_SYNCING/SR_JUST_DIRTIED flag. I'm not opposed to adding them, I am just interested in the background in case you are also willing to discuss it. I would suggest the below change for smgrnblock_fast, FYI @@ -182,7 +182,11 @@ smgrnblocks_fast(SMgrRelation reln, ForkNumber forknum) /* * The generation doesn't match, the shared relation must have been * evicted since we got a pointer to it. We'll need to do more work. + * and invalid the fast path for next time. */ + reln->smgr_shared = NULL; } -- Best Regards Andy Fan