Hi Thomas,

Thank you for your quick response.

On Thu, Dec 17, 2020 at 3:05 PM Thomas Munro <thomas.mu...@gmail.com> wrote:

> Hi Andy,
>
> On Thu, Dec 17, 2020 at 7:29 PM Andy Fan <zhihui.fan1...@gmail.com> wrote:
> > I spent one day studying the patch and I want to talk about one question
> for now.
> > What is the purpose of calling smgrimmedsync to evict a DIRTY sr (what
> will happen
> > if we remove it and the SR_SYNCING and SR_JUST_DIRTIED flags)?
>
> The underlying theory of the patch is that after fsync() succeeds, the
> new size is permanent, but before that it could change at any time
> because of asynchronous activities in the kernel*.  Therefore, if you
> want to evict the size from shared memory, you have to fsync() the
> file first.  I used smgrimmedsync() to do that.
>
> *I suspect that it can really only shrink with network file systems
> such as NFS and GlusterFS, where the kernel has to take liberties to
> avoid network round trips for every file extension, but I don't know
> the details on all 9 of our supported OSes and standards aren't very
> helpful for understanding buffered I/O.
>

Let me try to understand your point.  Suppose process 1 extends a file to
2 blocks from 1 block, and fsync is not called, then a). the lseek *may*
still
return 1 based on the comments in the ReadBuffer_common ("because
of buggy Linux kernels that sometimes return an seek SEEK_END result
that doesn't account for a recent write").  b). When this patch is
introduced,
we would get 2 blocks for sure if we can get the size from cache. c). After
user reads the 2 blocks from cache and then the cache is evicted, and user
reads the blocks again, it is possible to get 1 again.

Do we need to consider case a) in this patch? and Is case c). the situation
you
want to avoid and do we have to introduce fsync to archive this? Basically I
think case a) should not happen by practice so we don't need to handle case
c) and finally we don't need the fsync/SR_SYNCING/SR_JUST_DIRTIED flag.
I'm not opposed to adding them, I am just interested in the background in
case
you are also willing to discuss it.

I would suggest the below change for smgrnblock_fast,  FYI

@@ -182,7 +182,11 @@ smgrnblocks_fast(SMgrRelation reln, ForkNumber forknum)
                /*
                 * The generation doesn't match, the shared relation must
have been
                 * evicted since we got a pointer to it.  We'll need to do
more work.
+                * and invalid the fast path for next time.
                 */
+               reln->smgr_shared = NULL;
        }

-- 
Best Regards
Andy Fan

Reply via email to