Thanks.  We are on Greenplum GP 4.2 (Postgres 8.2).  As per GP suggestions,
we have 6 primary/6 mirror instances on each server. The server has 64 G
RAM, and shared_buffers is at ...125 MB :-).  I suppose the idea is for the
OS buffer cache to do the legwork.

But still...performance is at least "not bad".  If all HASH JOIN queries
touch shared_buffers in some way, I find it non-intuitive that we can have
concurrent hash queries involving big tables (100M+ joined with say 100K),
all apparently using the 125MB shared_buffers in some way, and yet giving
reasonable performance.

Basically what is the anatomy of a hash join involving large tables?
Disk->Shared_Buffers->Hash Join areas? something like that?


On Sun, Dec 29, 2013 at 9:18 AM, Michael Paquier
<michael.paqu...@gmail.com>wrote:

> On Sun, Dec 29, 2013 at 9:05 PM, Shiv Sharma <shiv.sharma.1...@gmail.com>
> wrote:
> > I am puzzled about the extent to which shared_bufferes is used for
> different
> > queries.  Do _all_ queries "touch" shared buffers at some point of their
> > execution?
> >
> > Many of our warehouse queries are  seq_scan followed by HASH.   I know
> > work_mem is assigned for HASH joins:  but does this mean that these
> queries
> > never touch shared buffers at *all* during their execution?  Perhaps they
> > are read into shared_buffers and then passed into work_mem HASH areas???
> >
> > What about updates on big tables? What about inserts  on big tables? What
> > about append-inserts?
> >
> > I think I could get these answers from Explain Analyze Buffers but I am
> on
> > 8.2 :-(
> >
> > Please tell me which queries use/touch shared_buffers in general terms,
> or
> > please point me to documentation.
> All your queries that interacts with relations.
>
> shared_buffers is used for data caching across all the backends of the
> server, to put it simply pages of the relation involved. Such data can
> be relation data, like data of a table you defined yourself, index
> data, or some system catalog data, containing definitions of the
> database objects. So simply everything that is a relation and contains
> physical data might be in shared buffers. Views for example do not
> enter in this category.
>
> You could for example use pg_buffercache to have a look at what
> contains the shared buffers:
> http://www.postgresql.org/docs/devel/static/pgbuffercache.html
>
> Regards,
> --
> Michael
>

Reply via email to