On Thu, Jul 12, 2018 at 8:32 AM Thomas Munro <thomas.mu...@enterprisedb.com>
wrote:

> On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
> >> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
> >> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
> >> >> as "reads" when in fact they are not reads at all:
> >> >>
> >> >> 1.  Relation extension, which in fact writes a zero-filled block.
> >> >> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.
> >
> > I checked the patch and I agree with change 1). Regarding change 2):
> > whether or not the contents of the page are zeroed, isn't it still a
> > read? If the page exists in the buffer pool, we count it as a hit
> > irrespective of the mode. Am I missing something?
>
> Further down in the function you can see that there is no read()
> system call for the RBM_ZERO_* modes:
>
>                 if (mode == RBM_ZERO_AND_LOCK ||
>                     mode == RBM_ZERO_AND_CLEANUP_LOCK)
>                         MemSet((char *) bufBlock, 0, BLCKSZ);
>                 else
>                 {
>                         ...
>                         smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
>                         ...
>                 }
>

Thanks for the details; I see your point. However, reads performed in the
RBM_ZERO_ON_ERROR case still need to be counted, since that mode does call
smgrread(). Excluding the other RBM_ZERO_* modes is fine.
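
As a minimal sketch of the accounting being discussed (not the patch
itself): a miss counts as a read only when smgrread() would actually be
called. The enum below just mirrors the names of the ReadBufferMode values;
BufferCounts, miss_performs_read() and count_buffer_access() are
hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the names of PostgreSQL's ReadBufferMode values; redeclared
 * here only so that the sketch compiles on its own. */
typedef enum ReadBufferMode
{
    RBM_NORMAL,
    RBM_ZERO_AND_LOCK,
    RBM_ZERO_AND_CLEANUP_LOCK,
    RBM_ZERO_ON_ERROR,
    RBM_NORMAL_NO_LOG
} ReadBufferMode;

/* Hypothetical stand-in for the hit/read counters shown by EXPLAIN (BUFFERS). */
typedef struct BufferCounts
{
    long hits;
    long reads;
} BufferCounts;

/* Hypothetical helper: would a cache miss in this mode call smgrread()? */
static bool
miss_performs_read(ReadBufferMode mode, bool is_extend)
{
    /* Relation extension writes a zero-filled block; nothing is read. */
    if (is_extend)
        return false;

    /* These two modes only zero the buffer in memory. */
    if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
        return false;

    /* RBM_ZERO_ON_ERROR reads first and zeroes only if the page is bad. */
    return true;
}

/* Hypothetical helper: update the counters for one buffer access. */
static void
count_buffer_access(BufferCounts *counts, bool found_in_pool,
                    ReadBufferMode mode, bool is_extend)
{
    assert(counts != NULL);

    if (found_in_pool)
        counts->hits++;         /* a hit, irrespective of the mode */
    else if (miss_performs_read(mode, is_extend))
        counts->reads++;        /* a real read from storage */
    /* otherwise: neither a hit nor a read */
}
```

With this rule, a miss in RBM_ZERO_ON_ERROR mode is still counted as a read,
while a miss in RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK mode is not.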


> I suppose someone might argue that even when it's not a hit and it's
> not a read, we might still want to count this buffer interaction in
> some other way.  Perhaps there should be a separate counter?  It may
> technically be a kind of cache miss, but it's nowhere near as
> expensive as a synchronous system call like read() so I didn't propose
> that.
>

Yes, I agree that we may need a new counter for buffers that are merely
allocated (neither read nor written). But at present that counter's value
would probably be very small, so people may not be interested in it.
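
For illustration only, such a counter could sit next to the existing ones
roughly as sketched below. Nothing here exists in PostgreSQL:
BufferAccessKind, BufferAccessStats, record_access() and
syscall_free_misses() are hypothetical names, and the four categories are
just one possible way to split up the accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a single buffer access. */
typedef enum BufferAccessKind
{
    ACCESS_HIT,         /* page was already in the buffer pool */
    ACCESS_READ,        /* miss that read the page from storage */
    ACCESS_EXTEND,      /* miss that wrote a new zero-filled block */
    ACCESS_ZEROED,      /* miss that only zeroed the buffer in memory */
    ACCESS_NUM_KINDS
} BufferAccessKind;

/* Hypothetical per-query statistics: one counter per kind of access. */
typedef struct BufferAccessStats
{
    long counts[ACCESS_NUM_KINDS];
} BufferAccessStats;

/* Bump the counter that matches the kind of access. */
static void
record_access(BufferAccessStats *stats, BufferAccessKind kind)
{
    assert(stats != NULL);
    assert((int) kind >= 0 && (int) kind < ACCESS_NUM_KINDS);
    stats->counts[kind]++;
}

/* Misses that needed neither a read nor a write. */
static long
syscall_free_misses(const BufferAccessStats *stats)
{
    assert(stats != NULL);
    return stats->counts[ACCESS_ZEROED];
}
```

Keeping the zeroed-only case in its own slot would leave the meaning of the
existing hit and read figures unchanged.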


> Some more on my motivation:  In our zheap prototype, when the system
> is working well and we have enough space, we constantly allocate
> zeroed buffer pages at the insert point (= head) of an undo log and
> drop pages at the discard point (= tail) in the background;
> effectively a few pages just go round and round via the freelist and
> no read() or write() syscalls happen.  That's something I'm very happy
> about and it's one of our claimed advantages over the traditional heap
> (which tends to read and dirty more pages), but EXPLAIN (BUFFERS)
> hides this virtuous behaviour when comparing with the traditional
> heap: it falsely and slanderously reports that zheap is reading undo
> pages when it is not.  Of course I don't intend to litigate zheap
> design in this thread, I just figured that since this accounting is
> wrong on principle and affects current PostgreSQL too (at least in
> theory) I would propose this little patch independently.  It's subtle
> enough that I wouldn't bother to back-patch it though.
>

OK. Maybe it would be better to implement the buffer-allocation counter
along with zheap, so that the buffer statistics are more accurate?

Regards,
Haribabu Kommi
Fujitsu Australia
