On Thu, Jun 13, 2019 at 6:30 PM Jamison, Kirk <[email protected]> wrote:
>
> On Wednesday, June 12, 2019 4:29 PM (GMT+9), Masahiko Sawada wrote:
> > On Wed, Jun 12, 2019 at 12:25 PM Tsunakawa, Takayuki
> > <[email protected]> wrote:
> > >
> > > From: Tomas Vondra [mailto:[email protected]]
> > > > Years ago I've implemented an optimization for many DROP TABLE
> > > > commands in a single transaction - instead of scanning buffers for
> > > > each relation, the code now accumulates a small number of relations
> > > > into an array, and then does a bsearch for each buffer.
> > > >
> > > > Would something like that be applicable/useful here? That is, if we
> > > > do multiple TRUNCATE commands in a single transaction, can we
> > > > optimize it like this?
> > >
> > > Unfortunately not. VACUUM and autovacuum handles each table in a
> > > different
> > transaction.
> >
> > We do RelationTruncate() also when we truncate heaps that are created in the
> > current transactions or has a new relfilenodes in the current transaction.
> > So I think there is a room for optimization Thomas suggested, although I'm
> > not sure it's a popular use case.
>
> I couldn't think of a use case too.
>
> > I've not look at this patch deeply but in DropRelFileNodeBuffer I think we
> > can get the min value of all firstDelBlock and use it as the lower bound of
> > block number that we're interested in. That way we can skip checking the
> > array
> > during scanning the buffer pool.
>
> I'll take note of this suggestion.
> Could you help me expound more on this idea, skipping the internal loop by
> comparing the min and buffer descriptor (bufHdr)?
>
Yes. For example,
BlockNumber minBlock = InvalidBlockNumber;
(snip)
/* Get lower bound block number we're interested in */
for (i = 0; i < nforks; i++)
{
if (!BlockNumberIsValid(minBlock) ||
minBlock > firstDelBlock[i])
minBlock = firstDelBlock[i];
}
for (i = 0; i < NBuffers; i++)
{
(snip)
buf_state = LockBufHdr(bufHdr);
/* check with the lower bound and skip the loop */
if (bufHdr->tag.blockNum < minBlock)
{
UnlockBufHdr(bufHdr, buf_state);
continue;
}
for (k = 0; k < nforks; k++)
{
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
bufHdr->tag.forkNum == forkNum[k] &&
bufHdr->tag.blockNum >= firstDelBlock[k])
But since we acquire the buffer header lock after all and the number
of the internal loops is small (at most 3 for now) the benefit will
not be big.
> In the current patch, I've implemented the following in
> DropRelFileNodeBuffers:
> for (i = 0; i < NBuffers; i++)
> {
> ...
> buf_state = LockBufHdr(bufHdr);
> for (k = 0; k < nforks; k++)
> {
> if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node)
> &&
> bufHdr->tag.forkNum == forkNum[k] &&
> bufHdr->tag.blockNum >= firstDelBlock[k])
> {
> InvalidateBuffer(bufHdr); /* releases
> spinlock */
> break;
> }
>
> > Don't we use each elements of nblocks for each fork? That is, each fork uses
> > an element at its fork number in the nblocks array and sets
> > InvalidBlockNumber
> > for invalid slots, instead of passing the valid number of elements. That way
> > the following code that exist at many places,
> >
> > blocks[nforks] = visibilitymap_mark_truncate(rel, nblocks);
> > if (BlockNumberIsValid(blocks[nforks]))
> > {
> > forks[nforks] = VISIBILITYMAP_FORKNUM;
> > nforks++;
> > }
> >
> > would become
> >
> > blocks[VISIBILITYMAP_FORKNUM] = visibilitymap_mark_truncate(rel,
> > nblocks);
>
> In the patch, we want to truncate all forks' blocks simultaneously, so
> we optimize the invalidation of buffers and reduce the number of loops
> using those values.
> The suggestion above would have to remove the forks array and its
> forksize (nforks), is it correct? But I think we’d need the fork array
> and nforks to execute the truncation all at once.
I meant that each forks can use the its forknumber'th element of
firstDelBlock[]. For example, if firstDelBlock = {1000,
InvalidBlockNumber, 20, InvalidBlockNumber}, we can invalid buffers
pertaining both greater than block number 1000 of main and greater
than block number 20 of vm. Since firstDelBlock[FSM_FORKNUM] ==
InvalidBlockNumber we don't invalid buffers of fsm.
As Tsunakawa-san mentioned, since your approach would reduce the loop
count your idea might be better than mine which always takes 4 loop
counts.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center