On Fri, 2020-03-27 at 02:31 +0100, Tomas Vondra wrote:
> On Thu, Mar 26, 2020 at 05:56:56PM +0800, Richard Guo wrote:
> > If nbatches is some number between 1.0 and 2.0, we would have a negative
> > depth. As a result, we may have a negative cost for hash aggregation
> > plan node, as descr
On Thu, Mar 26, 2020 at 05:56:56PM +0800, Richard Guo wrote:
Hello,
When calculating the disk costs of hash aggregation that spills to disk,
there is something wrong with how we determine depth:
depth = ceil( log(nbatches - 1) / log(num_partitions) );
If nbatches is some number between 1.0 and 2.0, we would have a negative
depth. As a result,
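A minimal guard illustrating the problem, using the variables from the
formula above (a sketch only, assuming the surrounding cost code and
<math.h>; not necessarily the committed fix):

    /* sketch: 1.0 < nbatches <= 2.0 must not yield a negative depth */
    if (nbatches > 2.0)
        depth = ceil(log(nbatches - 1) / log(num_partitions));
    else
        depth = 1.0;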
On Fri, Mar 20, 2020 at 1:20 PM Pengzhou Tang wrote:
> Hi,
>
> I happen to notice that "set enable_sort to false" cannot guarantee that
> the planner will use hashagg in test groupingsets.sql, so the comparison
> between the sortagg and hashagg results that follows has no meaning.
>
>
Please forget my comme
Hi,
I happen to notice that "set enable_sort to false" cannot guarantee that
the planner will use hashagg in test groupingsets.sql, so the comparison
between the sortagg and hashagg results that follows has no meaning.
Thanks,
Pengzhou
On Thu, Mar 19, 2020 at 7:36 AM Jeff Davis wrote:
>
> Committed.
>
> Th
On Sun, Mar 15, 2020 at 04:05:37PM -0700, Jeff Davis wrote:
> > + if (from_tape)
> > + partition_mem += HASHAGG_READ_BUFFER_SIZE;
> > + partition_mem = npartitions * HASHAGG_WRITE_BUFFER_SIZE;
> >
> > => That looks wrong; should it say += ?
>
> Good catch! Fixed.
> +++ b/src/backend/
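Presumably the fixed hunk accumulates both buffer sizes, along these lines
(a sketch inferred from the quoted diff and Justin's catch, not the actual
committed code):

    Size    partition_mem = 0;

    if (from_tape)
        partition_mem += HASHAGG_READ_BUFFER_SIZE;   /* input read buffer */
    /* plus one write buffer per output partition */
    partition_mem += npartitions * HASHAGG_WRITE_BUFFER_SIZE;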
On Wed, Mar 18, 2020 at 04:35:57PM -0700, Jeff Davis wrote:
> Committed.
\o/
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Committed.
There's some future work that would be nice (some of these are just
ideas and may not be worth it):
* Refactor MemoryContextMemAllocated() to be a part of
MemoryContextStats(), but allow it to avoid walking through the blocks
and freelists.
* Improve the choice of the initial number
On Thu, 2020-03-12 at 16:01 -0500, Justin Pryzby wrote:
> I don't understand what's meant by "the chosen plan".
> Should it say, "at execution ..." instead of "execution time"?
I removed that wording; hopefully it's more clear without it?
> Either remove "plan types" for consistency with
> enabl
On Wed, Mar 11, 2020 at 11:55:35PM -0700, Jeff Davis wrote:
> * tweaked EXPLAIN output some more
> Unless I (or someone else) finds something significant, this is close
> to commit.
Thanks for working on this; I finally made a pass over the patch.
+++ b/doc/src/sgml/config.sgml
+ enable_gr
On Wed, 2020-02-26 at 19:14 -0800, Jeff Davis wrote:
> Rebased on your change. This simplified the JIT and interpretation
> code quite a bit.
Attached another version.
* tweaked EXPLAIN output some more
* rebased and cleaned up
* Added back the enable_hashagg_disk flag (defaulting to on). I'
On Mon, 2020-02-24 at 15:29 -0800, Andres Freund wrote:
> On 2020-02-22 11:02:16 -0800, Jeff Davis wrote:
> > On Sat, 2020-02-22 at 10:00 -0800, Andres Freund wrote:
> > > Both patches, or just 0013? Seems the earlier one might make the
> > > addition of the opcodes you add less verbose?
> >
> > J
On 2020-02-22 11:02:16 -0800, Jeff Davis wrote:
> On Sat, 2020-02-22 at 10:00 -0800, Andres Freund wrote:
> > Both patches, or just 0013? Seems the earlier one might make the
> > addition of the opcodes you add less verbose?
>
> Just 0013, thank you. 0008 looks like it will simplify things.
Pushed
On Thu, Feb 20, 2020 at 04:56:38PM -0800, Jeff Davis wrote:
On Wed, 2020-02-19 at 20:16 +0100, Tomas Vondra wrote:
1) explain.c currently does this:
I wonder if we could show something for plain explain (without analyze).
At least the initial estimate of partitions, etc. I know not showing
thos
On Sat, 2020-02-22 at 10:00 -0800, Andres Freund wrote:
> Both patches, or just 0013? Seems the earlier one might make the
> addition of the opcodes you add less verbose?
Just 0013, thank you. 0008 looks like it will simplify things.
Regards,
Jeff Davis
Hi,
On 2020-02-22 09:55:26 -0800, Jeff Davis wrote:
> On Fri, 2020-02-21 at 12:22 -0800, Andres Freund wrote:
> > I'd also like to apply something like 0013 from that thread, I find the
> > whole curperagg, select_current_set, curaggcontext logic confusing as
> > hell. I'd so far planned to pu
On Fri, 2020-02-21 at 12:22 -0800, Andres Freund wrote:
> I'd also like to apply something like 0013 from that thread, I find the
> whole curperagg, select_current_set, curaggcontext logic confusing as
> hell. I'd so far planned to put this on the backburner until this patch
> has been committe
Hi,
On 2020-02-19 20:16:36 +0100, Tomas Vondra wrote:
> 3) I wonder if we need to invent new opcodes? Wouldn't it be simpler to
> just add a new flag to the agg_* structs instead? I haven't tried hacking
> this, so maybe it's a silly idea.
New opcodes don't really cost that much - it's a jump tab
On Wed, 2020-02-19 at 20:16 +0100, Tomas Vondra wrote:
> 1) explain.c currently does this:
>
> I wonder if we could show something for plain explain (without analyze).
> At least the initial estimate of partitions, etc. I know not showing
> those details until after execution is what e.g. sort d
On Wed, Feb 19, 2020 at 08:16:36PM +0100, Tomas Vondra wrote:
> 5) Assert(nbuckets > 0);
> ...
> This however quickly fails on this assert in BuildTupleHashTableExt (see
> backtrace1.txt):
>
> Assert(nbuckets > 0);
>
> The value is computed in hash_choose_num_buckets, and there seem to be no
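If the root cause is hash_choose_num_buckets returning zero, the minimal
guard would be something like this (a sketch, not necessarily the actual
fix):

    nbuckets = Max(nbuckets, 1);   /* never size the table below one bucket */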
On Wed, Feb 19, 2020 at 08:16:36PM +0100, Tomas Vondra wrote:
> 4) lookup_hash_entries says
>
> /* check to see if we need to spill the tuple for this grouping set */
>
> But that seems bogus, because AFAIK we can't spill tuples for grouping
> sets. So maybe this should say just "grouping"?
As
Hi,
I've started reviewing the 20200218 version of the patch. In general it
seems fine, but I have a couple of minor comments and two crashes.
1) explain.c currently does this:
I wonder if we could show something for plain explain (without analyze).
At least the initial estimate of partitions, et
On Tue, Feb 18, 2020 at 10:57 AM Tomas Vondra wrote:
> Hi,
>
> I wanted to take a look at this thread and do a review, but it's not
> very clear to me if the recent patches posted here are independent or
> how exactly they fit together. I see
>
> 1) hashagg-20200212-1.patch (2020/02/13 by Jeff)
>
Hi,
I wanted to take a look at this thread and do a review, but it's not
very clear to me if the recent patches posted here are independent or
how exactly they fit together. I see
1) hashagg-20200212-1.patch (2020/02/13 by Jeff)
2) refactor.patch (2020/02/13 by Jeff)
3) v1-0001-aggregated-unag
On Wed, 2020-02-12 at 21:51 -0800, Jeff Davis wrote:
> On Mon, 2020-02-10 at 15:57 -0800, Jeff Davis wrote:
> > Attaching latest version (combined logtape changes along with main
> > HashAgg patch).
>
> I ran a matrix of small performance tests to look for regressions.
I ran some more tests, this
On Wed, Jan 8, 2020 at 2:38 AM Heikki Linnakangas wrote:
> This makes the assumption that all Aggrefs or GroupingFuncs are at the
> top of the TargetEntry. That's not true, e.g.:
>
> select 0+sum(a) from foo group by b;
>
> I think find_aggregated_cols() and find_unaggregated_cols() should be
> m
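(In that query the Aggref is not the top node of its TargetEntry; it sits
under the OpExpr for "0 +", so a check that inspects only the top of each
TargetEntry would miss it.)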
On Wed, 2020-02-12 at 21:51 -0800, Jeff Davis wrote:
> The new patch is basically just rebased -- a few other very minor
> changes.
I extracted out some minor refactoring of nodeAgg.c that I can commit
separately. That will make the main patch a little easier to review.
Attached.
* split build_ha
On Thu, Feb 6, 2020 at 12:01 AM Heikki Linnakangas wrote:
> It wasn't strictly required for what I was hacking on then. IIRC it
> would have saved some memory during sorting, but Peter G felt that it
> wasn't worth the trouble, because he made some other changes around the
> same time, which made
On 05/02/2020 21:56, Jeff Davis wrote:
On Tue, 2020-02-04 at 18:10 +0200, Heikki Linnakangas wrote:
I'd love to change the LogicalTape API so that you could allocate and
free tapes more freely. I wrote a patch to do that, as part of replacing
tuplesort.c's polyphase algorithm with a simpler one
On Fri, 2020-01-24 at 17:01 -0800, Jeff Davis wrote:
> New patch attached.
Three minor independent refactoring patches:
1. Add new entry points for the tuple hash table:
TupleHashTableHash()
LookupTupleHashEntryHash()
which are useful for saving and reusing hash values to avoid
recomputing.
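The intended usage is roughly the following (a sketch; the real signatures
live in execGrouping.c):

    /* compute the hash once, keep it, and reuse it for later lookups */
    uint32          hash = TupleHashTableHash(hashtable, slot);
    TupleHashEntry  entry = LookupTupleHashEntryHash(hashtable, slot,
                                                     &isnew, hash);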
On Wed, 2020-02-05 at 11:56 -0800, Jeff Davis wrote:
> Regarding the API, I'd like to change it, but I'm running into some
> performance challenges when adding a layer of indirection. If I apply
> the very simple attached patch, which simply makes a separate
> allocation for the tapes array, it see
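The indirection in question is roughly the difference between these two
layouts (a simplified sketch, not the actual logtape.c definitions):

    /* inline: the tapes live in the same allocation as the set */
    struct LogicalTapeSet
    {
        /* ... bookkeeping fields ... */
        LogicalTape tapes[FLEXIBLE_ARRAY_MEMBER];
    };

    /* separate allocation: resizable, but each access pays a pointer hop */
    struct LogicalTapeSet
    {
        /* ... bookkeeping fields ... */
        LogicalTape *tapes;     /* separately palloc'd array */
    };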
On Tue, 2020-02-04 at 18:10 +0200, Heikki Linnakangas wrote:
> I'd love to change the LogicalTape API so that you could allocate and
> free tapes more freely. I wrote a patch to do that, as part of replacing
> tuplesort.c's polyphase algorithm with a simpler one (see [1]), but I
> never go
On Wed, Feb 5, 2020 at 10:37 AM Jeff Davis wrote:
> > LogicalTapeSetExtend() seems to work in a way that assumes that the
> > tape is frozen. It would be good to document that assumption, and
> > possibly enforce it by way of an assertion. The same remark applies to
> > any other assumptions y
On Tue, 2020-02-04 at 15:08 -0800, Peter Geoghegan wrote:
> Have you tested this against tuplesort.c, particularly parallel CREATE
> INDEX? It would be worth trying to measure any performance impact. Note
> that most parallel CREATE INDEX tuplesorts will do a merge within each
> worker, and one
On Tue, 2020-02-04 at 18:42 +0800, Adam Lee wrote:
> The minheap looks good, have you tested the performance and aggregate
> validation?
Not sure exactly what you mean, but I tested the min heap with both
Sort and HashAgg and it performs well.
> About the "Cap the size of the LogicalTapeSet's fre
On Wed, Feb 5, 2020 at 12:08 PM Peter Geoghegan wrote:
> Parallel CREATE INDEX is currently accidentally disabled on the master
> branch. That should be fixed in the next couple of days. You can
> temporarily revert 74618e77 if you want to get it back for testing
> purposes today.
(Fixed -- sorry
On Mon, Feb 3, 2020 at 6:24 PM Jeff Davis wrote:
> And now I'm attaching another version of the main Hash Aggregation
> patch to be applied on top of the logtape.c patch.
Have you tested this against tuplesort.c, particularly parallel CREATE
INDEX? It would be worth trying to measure any performa
On 03/02/2020 20:29, Jeff Davis wrote:
1. Use a minheap for the freelist. The original design used an array
that had to be sorted between a read (which frees a block) and a write
(which needs to sort the array to consume the lowest block number). The
comments said:
* sorted. This is an effic
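The replacement is a standard array-backed binary min-heap keyed on block
number; its core looks something like this (an illustrative sketch, not
the patch itself):

    /* restore the heap invariant after removing the smallest block number */
    static void
    sift_down(long *heap, int n, int i)
    {
        for (;;)
        {
            int     l = 2 * i + 1;
            int     r = l + 1;
            int     smallest = i;
            long    tmp;

            if (l < n && heap[l] < heap[smallest])
                smallest = l;
            if (r < n && heap[r] < heap[smallest])
                smallest = r;
            if (smallest == i)
                break;
            tmp = heap[i];
            heap[i] = heap[smallest];
            heap[smallest] = tmp;
            i = smallest;
        }
    }

Freeing a block then becomes an O(log n) push and allocating the lowest
free block an O(log n) pop, instead of re-sorting the whole array between
a read and a write.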
On Mon, Feb 03, 2020 at 06:24:14PM -0800, Jeff Davis wrote:
> On Mon, 2020-02-03 at 10:29 -0800, Jeff Davis wrote:
> > I ended up converting the freelist to a min heap.
> >
> > Attached is a patch which makes three changes to better support
> > HashAgg:
>
> And now I'm attaching another version o
On Wed, 2020-01-29 at 14:48 -0800, Jeff Davis wrote:
> 2) Use a different structure more capable of handling a large fraction
> of free space. A compressed bitmap might make sense, but that seems
> like overkill to waste effort tracking a lot of space that is unlikely
> to ever be used.
I ended up converting the freelist to a min heap.
On Fri, 2020-01-24 at 17:16 -0800, Peter Geoghegan wrote:
> That sounds weird. Might be pathological in some sense.
>
> I have a wild guess for you. Maybe this has something to do with the
> "test for presorted input" added by commit a3f0b3d68f9. That can
> perform very badly when the input is alm
On Fri, Jan 24, 2020 at 5:01 PM Jeff Davis wrote:
> Unfortunately, I'm seeing some bad behavior (at least in some cases)
> with logtape.c, where it's spending a lot of time qsorting the list of
> free blocks. Adam, did you also see this during your perf tests? It
> seems to be worst with lower wor
On 28/12/2019 01:35, Jeff Davis wrote:
I've attached a new patch that adds some basic costing for disk during
hashagg.
This patch (hashagg-20191227.patch) doesn't compile:
nodeAgg.c:3379:7: error: ‘hashagg_mem_overflow’ undeclared (first use in
this function)
if (hashagg_mem_overflow)
Hi, Jeff
I tried to use the logical tape APIs for hash agg spilling, based on
your 1220 version.
Turns out it doesn't make much of a performance difference with the
default 8K block size (might be my patch's problem), but it saves a lot
of disk space (not I/O) because I force the respilling
On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote:
> So I think we're not costing the batching properly / at all.
Hi,
I've attached a new patch that adds some basic costing for disk during
hashagg.
The accuracy is unfortunately not great, especially at smaller work_mem
sizes and smaller entr
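Roughly speaking, each level of recursive partitioning rewrites and then
rereads the spilled input once, so the disk term should have a shape like

    spill_cost ~ 2 * input_pages * depth * seq_page_cost

(a sketch of the shape of the cost, not the patch's exact formula).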
On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote:
> So I think we're not costing the batching properly / at all.
Thank you for all of the testing! I think the results are good: even
for cases where HashAgg is the wrong choice, it's not too bad. You're
right that costing is not done, and when
On Tue, 2019-12-10 at 13:34 -0800, Adam Lee wrote:
> Melanie and I tried this and have a patch that passes installcheck. The
> way we verify it is by composing a wide table with long unnecessary text
> columns, then checking the size it writes on every iteration.
>
> Please check out the attachment, it's b
On Sat, Dec 14, 2019 at 06:32:25PM +0100, Tomas Vondra wrote:
> I've done a bit more testing on this, after resolving a couple of minor
> conflicts due to recent commits (rebased version attached).
>
> In particular, I've made a comparison with different dataset sizes,
> group sizes, GUC settings
On Wed, Nov 27, 2019 at 02:58:04PM -0800, Jeff Davis wrote:
On Wed, 2019-08-28 at 12:52 -0700, Taylor Vesely wrote:
Right now the patch always initializes 32 spill partitions. Have you
given any thought into how to intelligently pick an optimal number of
partitions yet?
Attached a new patch th
On Thu, Dec 12, 2019 at 06:10:50PM -0800, Jeff Davis wrote:
On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
13) As for this:

    /* make sure that we don't exhaust the hash bits */
    if (partition_bits + input_bits >= 32)
        partition_bits = 32 - input_bits;

We already ran int
On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
> 13) As for this:
>
> /* make sure that we don't exhaust the hash bits */
> if (partition_bits + input_bits >= 32)
> partition_bits = 32 - input_bits;
>
> We already ran into this issue (exhausting bits in a hash value) in
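For example, with a 32-bit hash value and input_bits = 28 already consumed
by earlier levels, only four bits remain, so the clamp caps that level at
2^4 = 16 partitions.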
On Wed, Dec 04, 2019 at 10:57:51PM -0800, Jeff Davis wrote:
> > About the `TODO: project needed attributes only` in your patch, when
> > would the input tuple contain columns not needed? It seems like anything
> > you can project has to be in the group or aggregates.
>
> If you have a table li
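A plausible illustration, since the reply is cut off here: with a table
foo(i int, x text) and "SELECT i, count(*) FROM foo GROUP BY i", the scan
below the agg typically returns whole physical tuples, so x shows up in
the input even though neither the grouping key nor any aggregate needs it.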
On 2019-12-05 12:55:51 -0800, Jeff Davis wrote:
> On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
> > And it's not clear to me why we should remove part of the comment
> > before TupleHashTableHash.
>
> It looks like 5dfc1981 changed the signature of TupleHashTableHash
> without updatin
On Thu, Dec 05, 2019 at 12:55:51PM -0800, Jeff Davis wrote:
On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
And it's not clear to me why we should remove part of the comment
before TupleHashTableHash.
It looks like 5dfc1981 changed the signature of TupleHashTableHash
without updating th
On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
> And it's not clear to me why we should remove part of the comment
> before TupleHashTableHash.
It looks like 5dfc1981 changed the signature of TupleHashTableHash
without updating the comment, so it doesn't really make sense any more.
I jus
On Wed, 2019-12-04 at 17:24 -0800, Melanie Plageman wrote:
>
> It looks like the parameter input_tuples passed to hash_spill_init()
> in lookup_hash_entries() is the number of groups estimated by planner.
> However, when reloading a spill file, if we run out of memory and
> re-spill, hash_spill
On Wed, 2019-12-04 at 19:50 -0800, Adam Lee wrote:
> On Wed, Dec 04, 2019 at 06:55:43PM -0800, Jeff Davis wrote:
> >
> > Thanks very much for a great review! I've attached a new patch.
>
> Hi,
>
> About the `TODO: project needed attributes only` in your patch, when
> would the input tuple contai
On Wed, Dec 04, 2019 at 06:55:43PM -0800, Jeff Davis wrote:
>
> Thanks very much for a great review! I've attached a new patch.
Hi,
About the `TODO: project needed attributes only` in your patch, when
would the input tuple contain columns not needed? It seems like anything
you can project has to
Thanks very much for a great review! I've attached a new patch.
There are some significant changes in the new version also:
In the non-spilling path, removed the extra nullcheck branch in the
compiled evaltrans expression. When the first tuple is spilled, the
branch becomes necessary, so I rec
On Thu, Nov 28, 2019 at 9:47 AM Tomas Vondra wrote:
> On Wed, Nov 27, 2019 at 02:58:04PM -0800, Jeff Davis wrote:
> > On Wed, 2019-08-28 at 12:52 -0700, Taylor Vesely wrote:
> > > Right now the patch always initializes 32 spill partitions. Have you
> > > given any thought into how to intelligen
On Wed, 2019-08-28 at 12:52 -0700, Taylor Vesely wrote:
> Right now the patch always initializes 32 spill partitions. Have you
> given any thought into how to intelligently pick an optimal number of
> partitions yet?
Attached a new patch that addresses this.
1. Divide hash table memory used by
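The preview cuts the list off; presumably it continues in this spirit (a
hypothetical reconstruction, not Jeff's actual wording): divide the hash
table memory used by the number of groups to get the average memory per
group, estimate the memory the remaining groups would need, and choose
enough partitions that each partition's share is likely to fit in
work_mem.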
On Wed, 2019-08-28 at 12:52 -0700, Taylor Vesely wrote:
> I started to review this patch yesterday with Melanie Plageman, so we
> rebased this patch over the current master. The main conflicts were
> due to a simplehash patch that has been committed separately[1]. I've
> attached the rebased patch.
I started to review this patch yesterday with Melanie Plageman, so we
rebased this patch over the current master. The main conflicts were
due to a simplehash patch that has been committed separately[1]. I've
attached the rebased patch.
I was playing with the code, and if one of the table's most co
On Fri, Aug 02, 2019 at 08:11:19AM -0700, Jeff Davis wrote:
On Fri, 2019-08-02 at 14:44 +0800, Adam Lee wrote:
I'm late to the party.
You are welcome to join any time!
These two approaches both spill the input tuples; what if the skewed
groups are not encountered before the hash table fills
On Fri, 2019-08-02 at 14:44 +0800, Adam Lee wrote:
> I'm late to the party.
You are welcome to join any time!
> These two approaches both spill the input tuples; what if the skewed
> groups are not encountered before the hash table fills up? The spill
> files' size and disk I/O could be downsides
> High-level approaches:
>
> 1. When the in-memory hash table fills, keep existing entries in the
> hash table, and spill the raw tuples for all new groups in a
> partitioned fashion. When all input tuples are read, finalize groups
> in memory and emit. Now that the in-memory hash table is cleared
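Spelled out, the control flow of approach 1 is: advance the transition
state when a tuple's group is already in the hash table; insert a new
group when there is still room under the memory limit; otherwise write the
raw tuple to a spill partition chosen by its hash. Once the input is
exhausted, finalize and emit the in-memory groups, reset the table, and
repeat the same process on each spill partition in turn.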
On Thu, Jul 11, 2019 at 06:06:33PM -0700, Jeff Davis wrote:
On Thu, 2019-07-11 at 17:55 +0200, Tomas Vondra wrote:
Makes sense. I haven't thought about how the hybrid approach would be
implemented very much, so I can't quite judge how complicated it would be
to extend "approach 1" later. But if
On Thu, 2019-07-11 at 17:55 +0200, Tomas Vondra wrote:
> Makes sense. I haven't thought about how the hybrid approach would be
> implemented very much, so I can't quite judge how complicated it would
> be to extend "approach 1" later. But if you think it's a sensible first
> step, I trust you.
On Wed, Jul 03, 2019 at 07:03:06PM -0700, Jeff Davis wrote:
On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote:
What does "partitioned hash strategy" do? It's probably explained in
one
of the historical discussions, but I'm not sure which one. I assume
it
simply hashes the group keys and uses
On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote:
> What does "partitioned hash strategy" do? It's probably explained in one
> of the historical discussions, but I'm not sure which one. I assume it
> simply hashes the group keys and uses that to partition the data, and
> then passing it
On Mon, 2019-07-01 at 12:13 -0700, Jeff Davis wrote:
> This is for design review. I have a patch (WIP) for Approach 1, and if
> this discussion starts to converge on that approach I will polish and
> post it.
WIP patch attached (based on 9a81c9fa); targeting September CF.
Not intended for detai
Hi Jeff,
On Mon, Jul 01, 2019 at 12:13:53PM -0700, Jeff Davis wrote:
This is for design review. I have a patch (WIP) for Approach 1, and if
this discussion starts to converge on that approach I will polish and
post it.
Thanks for working on this.
Let's start at the beginning: why do we have