Re: hashagg slowdown due to spill changes

2020-07-27 Thread Peter Geoghegan
On Sun, Jul 26, 2020 at 4:17 PM Jeff Davis wrote: > I saw less of an improvement than you or Andres (perhaps just more > noise). But given that both you and Andres are reporting a measurable > improvement, then I went ahead and committed it. Thank you. Thanks! -- Peter Geoghegan

Re: hashagg slowdown due to spill changes

2020-07-26 Thread Jeff Davis
On Sat, 2020-07-25 at 15:08 -0700, Peter Geoghegan wrote: > I find that Andres original "SELECT cat, count(*) FROM > fewgroups_many_rows GROUP BY 1;" test case is noticeably improved by > the patch. Without the patch, v13 takes ~11.46 seconds. With the > patch, it takes only ~10.64 seconds. I saw

Re: hashagg slowdown due to spill changes

2020-07-25 Thread Peter Geoghegan
On Sat, Jul 25, 2020 at 12:41 PM Peter Geoghegan wrote: > I have added a new open item for this separate > LookupTupleHashEntryHash()/lookup_hash_entry() pipeline-stall issue. Attached is a rebased version of Andres' now-bitrot 2020-06-12 patch ("aggspeed.diff"). I find that Andres original "SEL

Re: hashagg slowdown due to spill changes

2020-07-25 Thread Peter Geoghegan
On Fri, Jul 24, 2020 at 4:51 PM Andres Freund wrote: > This is still not resolved. We're right now slower than 12. It's > effectively not possible to do performance comparisons right now. This > was nearly two months ago. I have added a new open item for this separate LookupTupleHashEntryHash()/l

Re: hashagg slowdown due to spill changes

2020-07-24 Thread Andres Freund
Hi, On 2020-06-12 14:37:15 -0700, Andres Freund wrote: > On 2020-06-11 11:14:02 -0700, Jeff Davis wrote: > > On Thu, 2020-06-11 at 10:45 -0700, Andres Freund wrote: > > > Did you run any performance tests? > > > > Yes, I reproduced your ~12% regression from V12, and this patch nearly > > eliminate

Re: hashagg slowdown due to spill changes

2020-06-25 Thread Tomas Vondra
On Wed, Jun 24, 2020 at 05:26:07PM -0700, Melanie Plageman wrote: On Tue, Jun 23, 2020 at 10:06 AM Andres Freund wrote: Hi, On 2020-06-23 09:23:57 -0700, Melanie Plageman wrote: > On Mon, Jun 22, 2020 at 9:02 PM Andres Freund wrote: > > It's not this patch's fault, but none, really none, of

Re: hashagg slowdown due to spill changes

2020-06-24 Thread Melanie Plageman
On Tue, Jun 23, 2020 at 10:06 AM Andres Freund wrote: > Hi, > > On 2020-06-23 09:23:57 -0700, Melanie Plageman wrote: > > On Mon, Jun 22, 2020 at 9:02 PM Andres Freund > wrote: > > > It's not this patch's fault, but none, really none, of this stuff > should > > > be in the executor. > > > > > >

Re: hashagg slowdown due to spill changes

2020-06-23 Thread Andres Freund
Hi, On 2020-06-23 09:23:57 -0700, Melanie Plageman wrote: > On Mon, Jun 22, 2020 at 9:02 PM Andres Freund wrote: > > It's not this patch's fault, but none, really none, of this stuff should > > be in the executor. > > > > > Were you thinking it could be done in grouping_planner() and then the > b

Re: hashagg slowdown due to spill changes

2020-06-23 Thread Melanie Plageman
On Mon, Jun 22, 2020 at 9:02 PM Andres Freund wrote: > > > /* > > - * find_unaggregated_cols > > - * Construct a bitmapset of the column numbers of un-aggregated Vars > > - * appearing in our targetlist and qual (HAVING clause) > > + * Walk tlist and qual to find referenced colnos, divid

Re: hashagg slowdown due to spill changes

2020-06-22 Thread Andres Freund
Hi, I think it'd be good to get the changes that aren't related to projection merged. As far as I can tell there's performance regressions both because of the things I'd listed upthread, and due to the projection issue. That's not obvious because because we first won performance and then lost it

Re: hashagg slowdown due to spill changes

2020-06-16 Thread Tomas Vondra
On Mon, Jun 15, 2020 at 07:38:45PM -0700, Jeff Davis wrote: On Mon, 2020-06-15 at 11:19 -0400, Robert Haas wrote: On Mon, Jun 15, 2020 at 9:34 AM Tomas Vondra wrote: > But just reverting 4cad2534d will make this much worse, I think, as > illustrated by the benchmarks I did in [1]. I share this

Re: hashagg slowdown due to spill changes

2020-06-15 Thread Jeff Davis
On Mon, 2020-06-15 at 11:19 -0400, Robert Haas wrote: > On Mon, Jun 15, 2020 at 9:34 AM Tomas Vondra > wrote: > > But just reverting 4cad2534d will make this much worse, I think, as > > illustrated by the benchmarks I did in [1]. > > I share this concern, although I do not know what we should do

Re: hashagg slowdown due to spill changes

2020-06-15 Thread Tom Lane
Robert Haas writes: > On Mon, Jun 15, 2020 at 9:34 AM Tomas Vondra > wrote: >> But just reverting 4cad2534d will make this much worse, I think, as >> illustrated by the benchmarks I did in [1]. > I share this concern, although I do not know what we should do about it. Well, it's only June. Let

Re: hashagg slowdown due to spill changes

2020-06-15 Thread Robert Haas
On Mon, Jun 15, 2020 at 9:34 AM Tomas Vondra wrote: > But just reverting 4cad2534d will make this much worse, I think, as > illustrated by the benchmarks I did in [1]. I share this concern, although I do not know what we should do about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.c

Re: hashagg slowdown due to spill changes

2020-06-15 Thread Tomas Vondra
On Sun, Jun 14, 2020 at 11:09:55PM -0700, Jeff Davis wrote: On Sun, 2020-06-14 at 11:14 -0700, Andres Freund wrote: I'm somewhat inclined to think that we should revert 4cad2534da6 and then look at how precisely to tackle this in 14. I'm fine with that. I don't see how we could just revert

Re: hashagg slowdown due to spill changes

2020-06-14 Thread Jeff Davis
On Sun, 2020-06-14 at 11:14 -0700, Andres Freund wrote: > I'm somewhat inclined to think that we should revert 4cad2534da6 and > then look at how precisely to tackle this in 14. I'm fine with that. > It'd probably make sense to request small tlists when the number of > estimated groups is large,

Re: hashagg slowdown due to spill changes

2020-06-14 Thread Andres Freund
Hi, On 2020-06-12 15:29:08 -0700, Jeff Davis wrote: > On Fri, 2020-06-12 at 14:37 -0700, Andres Freund wrote: > > I don't see why it's ok to force an additional projection in the very > > common case of hashaggs over a few rows. So I think we need to > > rethink > > 4cad2534da6. > > One possibili

Re: hashagg slowdown due to spill changes

2020-06-13 Thread Tomas Vondra
On Sat, Jun 13, 2020 at 11:48:09AM -0700, Jeff Davis wrote: On Fri, 2020-06-12 at 17:12 -0700, Andres Freund wrote: Do you think we should tackle this for 13? To me 4cad2534da seems like a somewhat independent improvement to spillable hashaggs. We've gone back and forth on this issue a few tim

Re: hashagg slowdown due to spill changes

2020-06-13 Thread Jeff Davis
On Fri, 2020-06-12 at 17:12 -0700, Andres Freund wrote: > Do you think we should tackle this for 13? To me 4cad2534da seems > like a > somewhat independent improvement to spillable hashaggs. We've gone back and forth on this issue a few times, so let's try to get some agreement before we revert 4c

Re: hashagg slowdown due to spill changes

2020-06-12 Thread Andres Freund
Hi, On 2020-06-13 01:06:25 +0200, Tomas Vondra wrote: > I agree, we should revert 4cad2534da and only project tuples when we > actually need to spill them. There are cases where projecting helps for non-spilling aggregates too, but only for the representative tuple. It doesn't help in the case at

Re: hashagg slowdown due to spill changes

2020-06-12 Thread Tomas Vondra
On Fri, Jun 12, 2020 at 03:29:08PM -0700, Jeff Davis wrote: On Fri, 2020-06-12 at 14:37 -0700, Andres Freund wrote: I don't see why it's ok to force an additional projection in the very common case of hashaggs over a few rows. So I think we need to rethink 4cad2534da6. One possibility is to pr

Re: hashagg slowdown due to spill changes

2020-06-12 Thread Jeff Davis
On Fri, 2020-06-12 at 14:37 -0700, Andres Freund wrote: > I don't see why it's ok to force an additional projection in the very > common case of hashaggs over a few rows. So I think we need to > rethink > 4cad2534da6. One possibility is to project only spilled tuples, more similar to Melanie's pat

Re: hashagg slowdown due to spill changes

2020-06-12 Thread Andres Freund
Hi, On 2020-06-11 11:14:02 -0700, Jeff Davis wrote: > On Thu, 2020-06-11 at 10:45 -0700, Andres Freund wrote: > > Did you run any performance tests? > > Yes, I reproduced your ~12% regression from V12, and this patch nearly > eliminated it for me. I spent a fair bit of time looking at the differe

Re: hashagg slowdown due to spill changes

2020-06-11 Thread Jeff Davis
On Thu, 2020-06-11 at 10:45 -0700, Andres Freund wrote: > Did you run any performance tests? Yes, I reproduced your ~12% regression from V12, and this patch nearly eliminated it for me. Regards, Jeff Davis

Re: hashagg slowdown due to spill changes

2020-06-11 Thread Andres Freund
Hi, On June 10, 2020 6:15:39 PM PDT, Jeff Davis wrote: >On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote: >> Hi, >> >> While preparing my pgcon talk I noticed that our hash-agg performance >> degraded noticeably. Looks to me like it's due to the spilling- >> hashagg >> changes. > >Attache

Re: hashagg slowdown due to spill changes

2020-06-10 Thread Jeff Davis
On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote: > Hi, > > While preparing my pgcon talk I noticed that our hash-agg performance > degraded noticeably. Looks to me like it's due to the spilling- > hashagg > changes. Attached a proposed fix. (Might require some minor cleanup). The only awk

Re: hashagg slowdown due to spill changes

2020-06-08 Thread Andres Freund
Hi, On 2020-06-08 13:41:29 -0700, Jeff Davis wrote: > On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote: > > Before there was basically one call from nodeAgg.c to execGrouping.c > > for > > each tuple and hash table. Now it's a lot more complicated: > > 1) nodeAgg.c: prepare_hash_slot() > > 2

Re: hashagg slowdown due to spill changes

2020-06-08 Thread Jeff Davis
On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote: > Why isn't the flow more like this: > 1) prepare_hash_slot() > 2) if (aggstate->hash_spill_mode) goto 3; else goto 4 > 3) entry = LookupTupleHashEntry(&hash); if (!entry) > hashagg_spill_tuple(); > 4) InsertTupleHashEntry(&hash, &isnew); if (

Re: hashagg slowdown due to spill changes

2020-06-08 Thread Jeff Davis
On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote: > Before there was basically one call from nodeAgg.c to execGrouping.c > for > each tuple and hash table. Now it's a lot more complicated: > 1) nodeAgg.c: prepare_hash_slot() > 2) execGrouping.c: TupleHashTableHash() > 3) nodeAgg.c: lookup_has

Re: hashagg slowdown due to spill changes

2020-06-08 Thread Alvaro Herrera
On 2020-Jun-05, Andres Freund wrote: > While preparing my pgcon talk I noticed that our hash-agg performance > degraded noticeably. Looks to me like it's due to the spilling-hashagg > changes. Jeff, what are your thoughts on this? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ Po

hashagg slowdown due to spill changes

2020-06-05 Thread Andres Freund
Hi, While preparing my pgcon talk I noticed that our hash-agg performance degraded noticeably. Looks to me like it's due to the spilling-hashagg changes. Sample benchmark: config: -c huge_pages=on -c shared_buffers=32GB -c jit=0 -c max_parallel_workers_per_gather=0 (largely just to reduce varia