On 25 March 2015 at 01:11, Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hi, thanks for the new patch. > > I made an additional shrink from your last one. Do you have a > look on the attached? > > Thanks, for looking again. I'm not too sure I like the idea of relying on join removals to mark the is_unique_join property. By explicitly doing it in mark_unique_joins we have the nice side effect of not having to re-check a relations unique properties after removing another relation, providing the relation was already marked as unique on the first pass. > At Sun, 22 Mar 2015 19:42:21 +1300, David Rowley <dgrowle...@gmail.com> > wrote in < > caaphdvrkwmmtwkxfn4uazyza9jql1c7uwbjbtuwfr69rqlv...@mail.gmail.com> > > On 20 March 2015 at 21:11, David Rowley <dgrowle...@gmail.com> wrote: > > > > > > I can continue working on your patch if you like? Or are you planning > to > > > go further with it? > > It's fine that you continue to work on this. > > # Sorry for the hardly baked patch which had left many things alone:( > > > I've been working on this more over the weekend and I've re-factored > things > > to allow LEFT JOINs to be properly marked as unique. > > I've also made changes to re-add support for detecting the uniqueness of > > sub-queries. > > I don't see the point of calling mark_unique_joins for every > iteration on join_info_list in remove_useless_joins. > > I've fixed this in the attached. I must have forgotten to put the test for LEFT JOINs here as I was still thinking that I might make a change to the code that converts unique semi joins to inner joins so that it just checks is_unique_join instead of calling relation_has_unique_index_for(). > The loop already iteraltes on whole the join_info_list so > mark_unique_join as an individual function is needless. > > Finally, simply marking uniqueness of join in join_is_removable > seems to be enough, inhibiting early bailing out by the checks on > attribute usage and placeholder let it work as expected. > > Reducing changes to this extent, I can easily see what is added > to planning computations. It consists of mainly two factors. > > - Unique-join chekcs for every candidate inner joins in > add_paths_to_joinrel. > > - Uniqueness check of mergejoinable clause in join-removability > check for every left join, some of which would be skipped > by other checks before. > > > > Also, I've added modified the costing for hash and nested loop joins to > > reduce the cost for unique inner joins to cost the join the same as it > does > > for SEMI joins. This has tipped the scales on a few plans in the > regression > > tests. > > I've forgotten it, but quite important. > > I've fixed quite a fundamental bug in my previous costing change. The fix for this makes it so I have to pass the unique_inner bool down to the costing functions. This also changes how the JoinPaths are marked as unique_inner. This is now done when the JoinPath is created, instead of updating it afterwards like it was done previously. > Also, please see attached unijoin_analysis.patch. This just adds some code > > which spouts out notices when join nodes are initialised which states if > > the join is unique or not. Running the regression tests with this patch > in > > places gives: > > > > Unique Inner: Yes == 753 hits > > Unique Inner: No == 1430 hits > > > > So it seems we can increase the speed of about 1 third of joins by about > > 10%. > > A quick scan of the "No"s seems to show quite a few cases which do not > look > > that real world like. e.g cartesian join. > > I don't have an idea how many queries in the reality hit this but > I suppose it's not a few. > > > It would be great if someone could run some PostgreSQL application with > > these 2 patches applied, and then grep the logs for the Unique Inner > > results... Just to get a better idea of how many joins in a real world > case > > will benefit from this patch. > > Wow. I think the second patch should be DEBUGx, not NOTICE:) > > I didn't give that much thought, but you're probably right. It was just a quick test and demo to try to raise some more interest in this. :-) Regards David Rowley
unijoin_2015-03-28_d1923fb.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers