Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

Kevin Grittner Wed, 01 Oct 2014 17:22:20 -0700

Tomas Vondra <t...@fuzzy.cz> wrote:
> On 12.9.2014 23:22, Robert Haas wrote:


>> My first thought is to revert to NTUP_PER_BUCKET=1, but it's
>> certainly arguable. Either method, though, figures to be better than
>> doing nothing, so let's do something.
>
> OK, but can we commit the remaining part first? Because it'll certainly
> interact with this proposed part, and it's easier to tweak when the code
> is already there. Instead of rebasing over and over.

The patch applied and built without problem, and pass `make
check-world`.  The only thing that caught my eye was the addition
of debug code using printf() instead of logging at a DEBUG level.
Is there any reason for that?

I still need to work through the (rather voluminous) email threads
and the code to be totally comfortable with all the issues, but
if Robert and Heikki are comfortable with it, I'm not gonna argue.

Preliminary benchmarks look outstanding!  I tried to control
carefully to ensure consistent results (by dropping, recreating,
vacuuming, analyzing, and checkpointing before each test), but
there were surprising outliers in the unpatched version.  It turned
out that it was because slight differences in the random samples
caused different numbers of buckets for both unpatched and patched,
but the patched version dealt with the difference gracefully while
the unpatched version's performance fluctuated randomly.

My thinking is that we should get this much committed and then
discuss further optimizations.  I tried to throw something at it
that would be something of a "worst case" because with the new code
it would start with one batch and go to two.

Five runs per test, alternating between unpatched and patched.

First I tried with the default work_mem size of 4MB:

Planning time / Execution time (ms)

unpatched, work_mem = 4MB

0.694 / 10392.930
0.327 / 10388.787
0.412 / 10533.036
0.650 / 17186.719
0.338 / 10707.423

Fast: Buckets: 2048  Batches: 1  Memory Usage: 3516kB
Slow: Buckets: 1024  Batches: 1  Memory Usage: 3516kB

patched, work_mem = 4MB

0.764 / 5021.792
0.655 / 4991.887
0.436 / 5068.777
0.410 / 5057.902
0.444 / 5021.543

1st & 5th:  Buckets: 131072 (originally 1024)  Batches: 2 (originally 1)  
Memory Usage: 3073kB
all others: Buckets: 131072 (originally 2048)  Batches: 2 (originally 1)  
Memory Usage: 3073kB

Then, just to see how both dealt with extra memory I did it again with 1GB:

unpatched, work_mem = 1GB

0.328 / 10407.836
0.319 / 10465.348
0.324 / 16606.948
0.569 / 10408.671
0.542 / 16996.209

Memory usage same as before.  Guess which runs started with 1024 buckets.  ;-)

0.556 / 3974.741
0.325 / 4012.613
0.334 / 3941.734
0.834 / 3894.391
0.603 / 3959.440

2nd & 4th:  Buckets: 131072 (originally 2048)  Batches: 1 (originally 1)  
Memory Usage: 4540kB
all others: Buckets: 131072 (originally 1024)  Batches: 1 (originally 1)  
Memory Usage: 4540kB

My only concern from the benchmarks is that it seemed like there
was a statistically significant increase in planning time:

unpatched plan time average: 0.450 ms
patched plan time average:   0.536 ms

That *might* just be noise, but it seems likely to be real.  For
the improvement in run time, I'd put up with an extra 86us in
planning time per hash join; but if there's any way to shave some
of that off, all the better.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

Reply via email to