On Mon, Dec 28, 2020 at 5:24 PM Andy Fan <zhihui.fan1...@gmail.com> wrote: > lseek(..., SEEK_END) = 9670656 > write(...) = 8192 > lseek(..., SEEK_END) = 9678848 > fsync(...) = -1 > lseek(..., SEEK_END) = 9670656 > > I got 2 information from above. a) before the fsync, the lseek(fd, 0, > SEEK_END) > returns a correct value, however after the fsync, it returns a wrong value. > b). > the fsync failed(return values is -1) in the above case. I'm more confused > because of point a). Since the fsync can't correct some wrong value, what > is the point to call fsync in this patch (I agree that it is good to fix this > reliability problem within this patch, but I'm doubtful if fsync can help in > this case). Am I missing something obviously?
The point of the fsync() call isn't to correct the value, it's to find out if it is safe to drop the SR object from shared memory because the operating system has promised that it won't forget about the new size. If that fails, we'll report an ERROR or PANIC (depending on data_sync_retry), but not drop the SR. By the way, you don't need fsync(fd) for the size to change, as shown by the other experiment in that other thread that just did sleep(60). It might also happen asynchronously. fsync(fd) forces it, but also tells you about errors. > I read your patch with details some days ago and feel it is in pretty good > shape. > and I think you are clear about the existing issues very well. like a). > smgr_alloc_sr use a > FIFO design. b). SR_PARTITION_LOCK doesn't use a partition lock really. c). > and Yeah, it probably should have its own bank of locks (I wish they were easier to add, via lwlocknames.txt or similar). > looks like you have some concern about the concurrency issue, which I didn't > find > anything wrong. d). how to handle the DIRTY sr during evict. I planned to > recheck So yeah, mostly this discussion has been about not trusting lseek() and using our own tracking of relation size instead. But there is a separate question about how "fresh" the value needs to be. The question is: under what circumstances could the unlocked read of nblocks from shared memory give you a value that isn't fresh enough for your snapshot/scan? This involves thinking about implicit memory barriers. The idea of RECENTLY_DIRTIED is similar to what we do for buffers: while you're trying to "clean" it (in this case by calling fsync()) someone else can extend the relation again, and in that case we don't know whether the new extension was included in the fsync() or not, so we can't clear the DIRTY flag when it completes. > item c) before this reply, but looks like the time is not permitted. May I > know what > Is your plan about this patch? Aside from details, bugs etc discussed in this thread, one major piece remains incomplete: something in the background to "clean" SRs so that foreground processes rarely have to block.