On Wed, Apr 29, 2020 at 4:39 PM Melanie Plageman <melanieplage...@gmail.com> wrote:
>
> In addition to many assorted TODOs in the code, there are a few major
> projects left:
> - Batch 0 falling back
> - Stripe barrier deadlock
> - Performance improvements and testing
>
Batch 0 never spills. That behavior is an artifact of the existing design, which, as an optimization, special-cases batch 0 to fill the initial hash table. This means batch 0 can skip the loading phase and doesn't need a batch file. However, in the pathological case where all tuples hash to batch 0, there is no way to redistribute those tuples to other batches, so the existing hash join implementation simply allows work_mem to be exceeded for batch 0.

The adaptive hash join approach offers another way to deal with a batch that exceeds work_mem: if increasing the number of batches does not help, the batch can be split into stripes that each fit within work_mem. Doing this requires spilling the excess tuples to batch files. The following patch adds the logic to create a batch 0 file for serial hash join so that, even in the pathological case, we do not need to exceed work_mem.
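To illustrate the idea in isolation (this is only a rough standalone sketch, not the executor code from the patch; WORK_MEM_BYTES, Batch0State, and batch0_insert are made-up names), the build side of batch 0 can keep inserting into memory until the budget is exhausted and then divert further tuples to a batch 0 spill file, whose contents would later be joined as additional stripes:

/*
 * Sketch: build batch 0 of a hash join, but once the in-memory budget
 * ("work_mem") is used up, spill the remaining batch-0 tuples to a
 * temporary batch file instead of letting the in-memory table grow.
 * The spilled tuples form extra stripes that a later pass would load
 * and join one at a time.  All names here are illustrative only.
 */
#include <stdio.h>
#include <stdlib.h>

#define WORK_MEM_BYTES 1024     /* toy memory budget for batch 0 */

typedef struct Tuple
{
    int     key;
    char    payload[32];
} Tuple;

typedef struct Batch0State
{
    Tuple  *in_memory;          /* stand-in for the in-memory hash table */
    int     n_in_memory;
    size_t  mem_used;
    FILE   *batch0_file;        /* spill file for excess batch-0 tuples */
    long    n_spilled;
} Batch0State;

/* Insert a batch-0 tuple, spilling to the batch 0 file once over budget. */
static void
batch0_insert(Batch0State *state, const Tuple *tup)
{
    if (state->mem_used + sizeof(Tuple) <= WORK_MEM_BYTES)
    {
        /* still under work_mem: keep the tuple in the in-memory stripe */
        state->in_memory[state->n_in_memory++] = *tup;
        state->mem_used += sizeof(Tuple);
    }
    else
    {
        /* over budget: append the tuple to the batch 0 spill file */
        if (state->batch0_file == NULL)
            state->batch0_file = tmpfile();
        fwrite(tup, sizeof(Tuple), 1, state->batch0_file);
        state->n_spilled++;
    }
}

int
main(void)
{
    Batch0State state = {0};

    state.in_memory = malloc(1000 * sizeof(Tuple));

    /* Pathological input: every tuple hashes to batch 0. */
    for (int i = 0; i < 1000; i++)
    {
        Tuple t = { .key = i };

        batch0_insert(&state, &t);
    }

    printf("in memory: %d tuples, spilled to batch 0 file: %ld tuples\n",
           state.n_in_memory, state.n_spilled);

    /* A later pass would rewind batch0_file and join its stripes in turn. */
    if (state.batch0_file)
        fclose(state.batch0_file);
    free(state.in_memory);
    return 0;
}

The point of the sketch is only the insert-path decision: batch 0 stays bounded by work_mem, and anything beyond the budget goes to its own batch file rather than into memory.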
Thanks,
David

v6-0002-Implement-fallback-of-batch-0-for-serial-adaptive.patch