On Thu, Jan 12, 2017 at 8:50 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Dec 4, 2016 at 7:36 PM, Haribabu Kommi <kommi.harib...@gmail.com> > wrote: > > On Thu, Nov 24, 2016 at 11:12 PM, Rushabh Lathia < > rushabh.lat...@gmail.com> > > wrote: > >> PFA latest patch with fix as well as few cosmetic changes. > > > > Moved to next CF with "needs review" status. > > I spent quite a bit of time on this patch over the last couple of > days. I was hoping to commit it, but I think it's not quite ready for > that yet and I hit a few other issues along the way. Meanwhile, > here's an updated version with the following changes: > > * Adjusted cost_gather_merge because we don't need to worry about less > than 1 worker. > * Don't charge double maintenance cost of the heap per 34ca0905. This > was pointed out previous and Rushabh said it was fixed, but it wasn't > fixed in v5. > * cost_gather_merge claimed to charge a slightly higher IPC cost > because we have to block, but didn't. Fix it so it does. > * Move several hunks to more appropriate places in the file, near > related code or in a more logical position relative to surrounding > code. > * Fixed copyright dates for the new files. One said 2015, one said 2016. > * Removed unnecessary code from create_gather_merge_plan that tried to > handle an empty list of pathkeys (shouldn't happen). > * Make create_gather_merge_plan more consistent with > create_merge_append_plan. Remove make_gather_merge for the same > reason. > * Changed generate_gather_paths to generate gather merge paths. In > the previous coding, only the upper planner nodes ever tried to > generate gather merge nodes, but that seems unnecessarily limiting, > since it could be useful to generate a gathered path with pathkeys at > any point in the tree where we'd generate a gathered path with no > pathkeys. > * Rewrote generate_ordered_paths() logic to consider only the one > potentially-useful path not now covered by the new code in > generate_gather_paths(). > * Reverted changes in generate_distinct_paths(). I think we should > add something here but the existing logic definitely isn't right > considering the change to generate_gather_paths(). > * Assorted cosmetic cleanup in nodeGatherMerge.c. > * Documented the new GUC enable_gathermerge. > * Improved comments. Dropped one that seemed unnecessary. > * Fixed parts of the patch to be more pgindent-clean. > > Thanks Robert for hacking into this. > Testing this against the TPC-H queries at 10GB with > max_parallel_workers_per_gather = 4, seq_page_cost = 0.1, > random_page_cost = 0.1, work_mem = 64MB initially produced somewhat > demoralizing results. Only Q17, Q4, and Q8 picked Gather Merge, and > of those only Q17 got faster. Investigating this led to me realizing > that join costing for parallel joins is all messed up: see > https://www.postgresql.org/message-id/CA+TgmoYt2pyk2CTyvYCtFySXN= > jsorgh8_mjttlowu5qkjo...@mail.gmail.com > > With that patch applied, in my testing, Gather Merge got picked for > Q3, Q4, Q5, Q6, Q7, Q8, Q10, and Q17, but a lot of those queries get a > little slower instead of a little faster. Here are the timings -- > these are with EXPLAIN ANALYZE, so take them with a grain of salt -- > first number is without Gather Merge, second is with Gather Merge: > > Q3 16943.938 ms -> 18645.957 ms > Q4 3155.350 ms -> 4179.431 ms > Q5 13611.484 ms -> 13831.946 ms > Q6 9264.942 ms -> 8734.899 ms > Q7 9759.026 ms -> 10007.307 ms > Q8 2473.899 ms -> 2459.225 ms > Q10 13814.950 ms -> 12255.618 ms > Q17 49552.298 ms -> 46633.632 ms > > This is strange, I will re-run the test again and post the results soon. > I haven't really had time to dig into these results yet, so I'm not > sure how "real" these numbers are and how much is run-to-run jitter, > EXPLAIN ANALYZE distortion, or whatever. I think this overall concept > is good, because there should be cases where it's substantially > cheaper to preserve the order while gathering tuples from workers than > to re-sort afterwards. But this particular set of results is a bit > lackluster. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Rushabh Lathia