On Sat, Feb 26, 2022 at 9:17 PM Melanie Plageman <melanieplage...@gmail.com> wrote:
>
> On Fri, Feb 25, 2022 at 11:17 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Sat, Feb 26, 2022 at 3:01 AM Melanie Plageman
> > <melanieplage...@gmail.com> wrote:
> > >
> > > Since _hash_alloc_buckets() WAL-logs the last page of the
> > > splitpoint, is it safe to skip the smgrimmedsync()? What if the last
> > > page of the splitpoint doesn't end up having any tuples added to it
> > > during the index build and the redo pointer is moved past the WAL for
> > > this page and then later there is a crash sometime before this page
> > > makes it to permanent storage. Does it matter that this page is lost? If
> > > not, then why bother WAL-logging it?
> > >
> >
> > I think we don't care if the page is lost before we update the
> > meta-page in the caller because we will try to reallocate in that
> > case. But we do care after meta page update (having the updated
> > information about this extension via different masks) in which case we
> > won't lose this last page because it would have registered the sync
> > request for it via sgmrextend before meta page update.
>
> and could it happen that during smgrextend() for the last page, a
> checkpoint starts and finishes between FileWrite() and
> register_dirty_segment(), then index build finishes, and then a crash
> occurs before another checkpoint completes the pending fsync for that
> last page?
>

Yeah, this seems possible. The problem would then be that the index's
idea of EOF and smgr's idea of EOF could differ, which could cause
trouble when we try to get a new page via _hash_getnewbuf(). If this
theory turns out to be true, we would probably get an error, either
because the disk is full or because the index requests a block that is
beyond EOF as determined by RelationGetNumberOfBlocksInFork() in
_hash_getnewbuf().

Can we try to reproduce this scenario with the help of a debugger to
see if we are missing something?

--
With Regards,
Amit Kapila.