On Thu, Nov 5, 2020 at 1:59 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > > At Thu, 5 Nov 2020 11:07:21 +0530, Amit Kapila <amit.kapil...@gmail.com> > wrote in > > On Thu, Nov 5, 2020 at 8:26 AM k.jami...@fujitsu.com > > <k.jami...@fujitsu.com> wrote: > > > > > Few comments on patches: > > > > > ====================== > > > > > v29-0002-Add-bool-param-in-smgrnblocks-for-cached-blocks > > > > > ---------------------------------------------------------------------- > > > > > ------------- > > > > > 1. > > > > > -smgrnblocks(SMgrRelation reln, ForkNumber forknum) > > > > > +smgrnblocks(SMgrRelation reln, ForkNumber forknum, bool *accurate) > > > > > { > > > > > BlockNumber result; > > > > > > > > > > /* > > > > > * For now, we only use cached values in recovery due to lack of a > > > > > shared > > > > > - * invalidation mechanism for changes in file size. > > > > > + * invalidation mechanism for changes in file size. The cached > > > > > + values > > > > > + * could be smaller than the actual number of existing buffers of > > > > > the file. > > > > > + * This is caused by lseek of buggy Linux kernels that might not have > > > > > + * accounted for the recent write. > > > > > */ > > > > > if (InRecovery && reln->smgr_cached_nblocks[forknum] != > > > > > InvalidBlockNumber) > > > > > + { > > > > > + if (accurate != NULL) > > > > > + *accurate = true; > > > > > + > > > > > > > > > > I don't understand this comment. Few emails back, I think we have > > > > > discussed that cached value can't be less than the number of buffers > > > > > during recovery. If that happens to be true then we have some problem. > > > > > If you want to explain 'accurate' variable then you can do the same > > > > > atop of function. Would it be better to name this variable as > > > > > 'cached'? > > > > > > > > (I agree that the comment needs to be fixed.) > > > > > > > > FWIW I don't think 'cached' suggests the characteristics of the > > > > returned value > > > > on its interface. It was introduced to reduce fseek() calls, and after > > > > that we > > > > have found that it can be regarded as the authoritative source of the > > > > file size. > > > > The "accurate" means that it is guaranteed that we don't have a buffer > > > > for the > > > > file blocks further than that number. I don't come up with a more > > > > proper > > > > word than "accurate" but also I don't think "cached" is proper here. > > > > > > > Sure but that is not the guarantee this API gives. It has to be > > guaranteed by the logic else-where, so not sure if it is a good idea > > to try to reflect the same here. The comments in the caller where we > > use this should explain why it is safe to use this value. > > Isn't it already guaranteed by the bugmgr code that we don't have > buffers for nonexistent file blocks? What is needed here is, yeah, > the returned value from smgrblocks is "reliable". If "reliable" is > still not proper, I give up and agree to "cached". >
I still feel 'cached' is a better name. > > > > I am not sure if the patch should cover this or should be a separate > > > thread altogether since > > > a number of functions also rely on the smgrnblocks(). But I'll take it > > > into consideration. > > > > > > > > > > > v29-0003-Optimize-DropRelFileNodeBuffers-during-recovery > > > > > ---------------------------------------------------------------------- > > > > > ------------ > > > > > 2. > > > > > + /* Check that it is in the buffer pool. If not, do nothing. */ > > > > > + LWLockAcquire(bufPartitionLock, LW_SHARED); buf_id = > > > > > + BufTableLookup(&bufTag, bufHash); LWLockRelease(bufPartitionLock); > > > > > + > > > > > + if (buf_id < 0) > > > > > + continue; > > > > > + > > > > > + bufHdr = GetBufferDescriptor(buf_id); > > > > > + > > > > > + buf_state = LockBufHdr(bufHdr); > > > > > + > > > > > + if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) && > > > > > > > > > > I think a pre-check for RelFileNode might be better before LockBufHdr > > > > > for the reasons mentioned in this function few lines down. > > > > > > > > The equivalent check is already done by BufTableLookup(). The last line > > > > in > > > > the above is not a precheck but the final check. > > > > > > > Which check in that API you are talking about? Are you telling because > > we are trying to use a hash value corresponding to rnode.node to find > > the block then I don't think it is equivalent because there is a > > difference in actual values. But even if we want to rely on that, a > > comment is required but I guess we can do the check as well because it > > shouldn't be a costly pre-check. > > I think the only problematic case is that BufTableLookup wrongly > misses buffers actually to be dropped. (And the case of too-many > false-positives, not critical though.) If omission is the case, we > cannot adopt this optimization at all. And if the false-positive is > the case, maybe we need to adopt more restrictive prechecking, but > RelFileNodeEquals is *not* more restrictive than BufTableLookup in the > first place. > > What case do you think is problematic when considering > BufTableLookup() as the prechecking? > I was slightly worried about false-positives but on again thinking about it, I think we don't need any additional pre-check here. -- With Regards, Amit Kapila.