On 11/12/14, 1:54 AM, David Rowley wrote:
> On Tue, Nov 11, 2014 at 9:29 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> This plan type is widely used in reporting queries, so will hit the
>> mainline of BI applications and many Mat View creations.
>> This will allow SELECT count(*) FROM foo to go faster also.
> We'd also need to add some infrastructure to merge aggregate states together
> for this to work properly. This means it could also work for avg() and
> stddev(), etc. For max() and min() the merge functions would likely just be
> the same as the transition functions.
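To make the merge idea concrete, here is a toy sketch (hypothetical Python, not PostgreSQL code) of transition, merge, and final functions for avg(), plus a max() merge that is just the transition function again. Two "workers" each aggregate half the rows and their partial states are merged:

```python
# Hypothetical sketch of merging partial aggregate states, as would be
# needed for parallel aggregation. Names are illustrative only.

def avg_trans(state, value):
    # transition: state is a (sum, count) pair
    s, n = state
    return (s + value, n + 1)

def avg_merge(a, b):
    # merge two partial (sum, count) states from different workers
    return (a[0] + b[0], a[1] + b[1])

def avg_final(state):
    s, n = state
    return s / n if n else None

def max_merge(a, b):
    # for max()/min() the merge is the same as the transition function
    return a if a >= b else b

# two workers each aggregate half of the rows
left = right = (0, 0)
for v in [1, 2, 3]:
    left = avg_trans(left, v)
for v in [4, 5, 6]:
    right = avg_trans(right, v)

print(avg_final(avg_merge(left, right)))  # 3.5
print(max_merge(3, 6))                    # 6
```

The point of the shape: avg() and stddev() need a real merge step because their final value cannot be computed from two finished averages, only from the partial (sum, count) states.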
Sanity check: what % of a large aggregate query fed by a seqscan is actually
spent in the aggregate functions? Even if you look strictly at CPU cost, isn't
there more code involved in getting data to the aggregate function than in the
aggregation itself, except maybe for numeric?
In other words, I suspect that just having a dirt-simple parallel SeqScan could
be a win for CPU. It should certainly be a win IO-wise; in my experience we're
not very good at maxing out IO systems.
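The "dirt-simple parallel SeqScan" shape can be sketched like this (hypothetical Python, not PostgreSQL code; threads stand in for worker processes just to show the partition-then-merge structure, since real parallelism would come from separate backends):

```python
# Toy sketch of a parallel scan feeding count(*): each worker scans its
# own slice of the table and the partial counts are summed at the end.
from concurrent.futures import ThreadPoolExecutor

# fake heap: 1000 live rows plus a few "dead" tuples (None)
table = list(range(1000)) + [None] * 10

def scan_chunk(rows):
    # per-worker scan: count the visible rows in this chunk
    return sum(1 for r in rows if r is not None)

# partition the table across 4 workers
chunks = [table[i::4] for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as ex:
    partial_counts = list(ex.map(scan_chunk, chunks))

print(sum(partial_counts))  # 1000
```

For count(*) the merge step is trivially a sum, which is why a plain parallel scan alone could already be a win before any aggregate-level infrastructure exists.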
(I was curious and came up with the list below for just the page-level stuff
(ignoring IO). I don't see much code involved in per-tuple work, but I also
never came across detoasting code, so I suspect I'm missing something...)
ExecScanFetch, heapgettup_pagemode, ReadBuffer, BufferAlloc,
heap_page_prune_opt, LWLockAcquire... then you can finally do per-tuple work:
HeapTupleSatisfiesVisibility.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com