On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
> On 29/04/2020 05:03, Melanie Plageman wrote: > > I've attached a patch which should address some of the previous feedback > > about code complexity. Two of my co-workers and I wrote what is > > essentially a new prototype of the idea. It uses the main state machine > > to route emitting unmatched tuples instead of introducing a separate > > state. The logic for falling back is also more developed. > > I haven't looked at the patch in detail, but thanks for the commit > message; it describes very well what this is all about. It would be nice > to copy that explanation to the top comment in nodeHashJoin.c in some > form. I think we're missing a high level explanation of how the batching > works even before this new patch, and that commit message does a good > job at it. > > Thanks for taking a look, Heikki! I made a few edits to the message and threw it into a draft patch (on top of master, of course). I didn't want to junk up peoples' inboxes, so I didn't start a separate thread, but, it will be pretty hard to collaboratively edit the comment/ever register it for a commitfest if it is wedged into this thread. What do you think? -- Melanie Plageman
From 1deb1d777693ffcb73c96130ac51b282cd968577 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Thu, 30 Apr 2020 07:16:28 -0700 Subject: [PATCH v1] Describe hybrid hash join implementation This is just a draft to spark conversation on what a good comment might be like in this file on how the hybrid hash join algorithm is implemented in Postgres. I'm pretty sure this is the accepted term for this algorithm https://en.wikipedia.org/wiki/Hash_join#Hybrid_hash_join --- src/backend/executor/nodeHashjoin.c | 36 +++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c index cc8edacdd0..86bfdaef7f 100644 --- a/src/backend/executor/nodeHashjoin.c +++ b/src/backend/executor/nodeHashjoin.c @@ -10,6 +10,42 @@ * IDENTIFICATION * src/backend/executor/nodeHashjoin.c * + * HYBRID HASH JOIN + * + * If the inner side tuples of a hash join do not fit in memory, the hash join + * can be executed in multiple batches. + * + * If the statistics on the inner side relation are accurate, planner chooses a + * multi-batch strategy and estimates the number of batches. + * + * The query executor measures the real size of the hashtable and increases the + * number of batches if the hashtable grows too large. + * + * The number of batches is always a power of two, so an increase in the number + * of batches doubles it. + * + * Serial hash join measures batch size lazily -- waiting until it is loading a + * batch to determine if it will fit in memory. While inserting tuples into the + * hashtable, serial hash join will, if that tuple were to exceed work_mem, + * dump out the hashtable and reassign them either to other batch files or the + * current batch resident in the hashtable. + * + * Parallel hash join, on the other hand, completes all changes to the number + * of batches during the build phase. If it increases the number of batches, it + * dumps out all the tuples from all batches and reassigns them to entirely new + * batch files. Then it checks every batch to ensure it will fit in the space + * budget for the query. + * + * In both parallel and serial hash join, the executor currently makes a best + * effort. If a particular batch will not fit in memory, it tries doubling the + * number of batches. If after a batch increase, there is a batch which + * retained all or none of its tuples, the executor disables growth in the + * number of batches globally. After growth is disabled, all batches that would + * have previously triggered an increase in the number of batches instead + * exceed the space allowed. + * + * TODO: should we discuss that tuples can only spill forward? + * * PARALLELISM * * Hash joins can participate in parallel query execution in several ways. A -- 2.20.1