Thanks. We are on Greenplum GP 4.2 (Postgres 8.2). As per GP suggestions, we have 6 primary/6 mirror instances on each server. The server has 64 G RAM, and shared_buffers is at ...125 MB :-). I suppose the idea is for the OS buffer cache to do the legwork.
But still...performance is at least "not bad". If all HASH JOIN queries touch shared_buffers in some way, I find it non-intuitive that we can have concurrent hash queries involving big tables (100M+ joined with say 100K), all apparently using the 125MB shared_buffers in some way, and yet giving reasonable performance. Basically what is the anatomy of a hash join involving large tables? Disk->Shared_Buffers->Hash Join areas? something like that? On Sun, Dec 29, 2013 at 9:18 AM, Michael Paquier <michael.paqu...@gmail.com>wrote: > On Sun, Dec 29, 2013 at 9:05 PM, Shiv Sharma <shiv.sharma.1...@gmail.com> > wrote: > > I am puzzled about the extent to which shared_bufferes is used for > different > > queries. Do _all_ queries "touch" shared buffers at some point of their > > execution? > > > > Many of our warehouse queries are seq_scan followed by HASH. I know > > work_mem is assigned for HASH joins: but does this mean that these > queries > > never touch shared buffers at *all* during their execution? Perhaps they > > are read into shared_buffers and then passed into work_mem HASH areas??? > > > > What about updates on big tables? What about inserts on big tables? What > > about append-inserts? > > > > I think I could get these answers from Explain Analyze Buffers but I am > on > > 8.2 :-( > > > > Please tell me which queries use/touch shared_buffers in general terms, > or > > please point me to documentation. > All your queries that interacts with relations. > > shared_buffers is used for data caching across all the backends of the > server, to put it simply pages of the relation involved. Such data can > be relation data, like data of a table you defined yourself, index > data, or some system catalog data, containing definitions of the > database objects. So simply everything that is a relation and contains > physical data might be in shared buffers. Views for example do not > enter in this category. > > You could for example use pg_buffercache to have a look at what > contains the shared buffers: > http://www.postgresql.org/docs/devel/static/pgbuffercache.html > > Regards, > -- > Michael >