On Fri, Oct 4, 2019 at 8:36 AM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Wed, Oct 2, 2019 at 10:22 AM James Coleman <jtc...@gmail.com> wrote:
> > In all cases I've been starting with:
> >
> > set enable_hashjoin = off;
> > set enable_nestloop = off;
> > set max_parallel_workers_per_gather = 4;
> > set min_parallel_index_scan_size = 0;
> > set min_parallel_table_scan_size = 0;
> > set parallel_setup_cost = 0;
> > set parallel_tuple_cost = 0;
> >
> > I've also tried various combinations of random_page_cost,
> > cpu_index_tuple_cost, cpu_tuple_cost.
> >
> > Interestingly I've noticed plans joining two relations that look like:
> >
> > Limit
> >   -> Merge Join
> >        Merge Cond: (t1.pk = t2.pk)
> >        -> Gather Merge
> >             Workers Planned: 4
> >             -> Parallel Index Scan using t_pkey on t t1
> >        -> Gather Merge
> >             Workers Planned: 4
> >             -> Parallel Index Scan using t_pkey on t t2
> >
> > Where I would have expected a Gather Merge above a parallelized merge
> > join. Is that reasonable to expect?
>
> Well, you told the planner that parallel_setup_cost = 0, so starting
> workers is free. And you told the planner that parallel_tuple_cost =
> 0, so shipping tuples from the worker to the leader is also free. So
> it is unclear why it should prefer a single Gather Merge over two
> Gather Merges: after all, the Gather Merge is free!
>
> If you use give those things some positive cost, even if it's smaller
> than the default, you'll probably get a saner-looking plan choice.
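
As a rough sketch of that suggestion, the two settings could be made small but nonzero while leaving the rest of the setup unchanged. The specific values below are illustrative assumptions, not numbers anyone in this thread tested; PostgreSQL's defaults are parallel_setup_cost = 1000 and parallel_tuple_cost = 0.1.

```sql
-- Illustrative values only (assumed, not taken from the thread): small but
-- positive, and below the defaults of 1000 and 0.1 respectively.
set parallel_setup_cost = 10;
set parallel_tuple_cost = 0.001;
```

With nonzero costs, each additional Gather Merge is no longer free, so the planner has a cost-based reason to prefer a single Gather Merge above a parallel merge join, provided such a path is otherwise competitive.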

That makes sense.

Right now, trying to get a separate test for this feels a bit like a
distraction.

Given that there doesn't seem to be an obvious way to reproduce the issue
currently, but we do have a reproduction example along with incremental
sort, what is the path forward for this? Is it reasonable to try to
commit it anyway, knowing that it's a "correct" change and has been
demonstrated elsewhere?

James