On Sat, Jul 20, 2019 at 11:25 AM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: ... > >My current line of investigation is whether we need to do anything in > >the parallel portion of create_ordered_paths(). I noticed that the > >first-pass patch adding generate_useful_gather_paths() modified that > >section but wasn't actually adding any new gather-merge paths (just > >bare incremental sort paths). That seems pretty clearly just a > >prototype miss, so I modified the prototype to build gather-merge > >paths instead (as a side note that change seems to fix an oddity I was > >seeing where plans would include a parallel index scan node even > >though they weren't parallel plans). While the resulting plan for > >something like: > > > > Yes, that seems to be a bug. The block above it clealy has a gather > merge nodes, so this one should too. > > >explain analyze select * from t where t.a in (1,2,3,4,5,6) order by > >t.a, t.b limit 50; > > > >changes cost (to be cheaper) ever so slightly with the gather-merge > >addition to create_ordered_paths(), the plan itself is otherwise > >identical (including row estimates): > > > >Limit > > -> Gather Merge > > -> Incremental Sort > > -> Parallel Index Scan > > > >(Note: I'm forcing parallel plans here with: set > >max_parallel_workers_per_gather=4; set min_parallel_table_scan_size=0; > >set parallel_tuple_cost=0; set parallel_setup_cost=0; set > >min_parallel_index_scan_size=0;) > > > >I can't seem to come up with a case where adding these gather-merge > >paths in create_ordered_paths() isn't entirely duplicative of paths > >already created by generate_useful_gather_paths() as called from > >apply_scanjoin_target_to_paths() -- which I _think_ makes sense given > >that both apply_scanjoin_target_to_paths() and create_ordered_paths() > >are called by grouping_planner(). > > > >Can you think of a case I'm missing here that would make it valuable > >to generate new parallel plans in create_ordered_paths()? > > > > Good question. Not sure. I think such path would need to do something on > a relation that is neither a join nor a scan - in which case the path > should not be created by apply_scanjoin_target_to_paths(). > > So for example a query like this: > > SELECT > a, b, sum(expensive_function(c)) > FROM > t > GROUP BY a,b > ORDER BY a,sum(...) LIMIT 10; > > should be able to produce a plan like this: > > -> Limit > -> Gather Merge > -> Incremental Sort (pathkeys: a, sum) > -> Group Aggregate > a, b, sum(expensive_function(c)) > -> Index Scan (pathkeys: a, b) > > or something like that, maybe. I haven't tried this, though. The other > question is whether such queries are useful in practice ...
Hmm, when I step through on that example input_rel->partial_pathlist != NIL is false, so we don't even attempt to consider any extra parallel paths in create_ordered_paths(). Nonetheless we get a parallel plan, but with a different shape: Limit -> Incremental Sort Sort Key: a, (sum(expensive_function(c))) Presorted Key: a -> Finalize GroupAggregate Group Key: a, b -> Gather Merge Workers Planned: 4 -> Partial GroupAggregate Group Key: a, b -> Sort Sort Key: a, b -> Parallel Seq Scan on t (or if I disable seqscan then the sort+seq scan above becomes inc sort + index scan) To be honest, I don't think I understand how you would get a plan like that anyway since the index here only has "a" and so can't provide (pathkeys: a, b). Perhaps there's something I'm still missing though. James Coleman