On Wed, Aug 27, 2014 at 1:40 AM, Heikki Linnakangas <hlinnakan...@vmware.com > wrote:
> On 08/26/2014 03:28 PM, David Rowley wrote: > >> Any ideas or feedback on this would be welcome >> > > Before someone spends time reviewing this patch, are you sure this is > worth the effort? It seems like very narrow use case to me. I understand > removing LEFT and INNER joins, but the case for SEMI and ANTI joins seems a > lot thinner. Unnecessary LEFT and INNER joins can easily creep into a query > when views are used, for example, but I can't imagine that happening for a > SEMI or ANTI join. Maybe I'm lacking imagination. If someone has run into a > query in the wild that would benefit from this, please raise your hand. > > I agree that the use case for removals of SEMI and ANTI join are a lot thinner than LEFT and INNER joins. My longer term goal here is to add join removal support for INNER joins. In order to do this I need the foreign key infrastructure which is included in this patch. I held back from just going ahead and writing the INNER JOIN removal patch as I didn't want to waste the extra effort in doing that if someone was to find a show stopper problem with using foreign keys the way I am with this patch. I was kind of hoping someone would be able to look at this patch a bit more and confirm to me that it's safe to do this or not before I go ahead and write the inner join version. > If I understood correctly, you're planning to work on INNER join removal > too. How much of the code in this patch is also required for INNER join > removal, and how much is specific to SEMI and ANTI joins? > > Apart from the extra lines of code in remove_useless_joins(), there's 3 functions added here which won't be needed at all for INNER JOINs; semiorantijoin_is_removable(), convert_semijoin_to_isnotnull_quals() and convert_antijoin_to_isnull_quals(). Not including the regression tests, this is 396 lines with comments and 220 lines without. All of these functions are static and in analyzejoin.c. The benchmarks I posted a few weeks back show that the overhead of performing the semi/anti join removal checks is quite low. I measured an extra 400 or so nanoseconds for a successful removal on my i5 laptop. Or just 15 nanoseconds on the earliest fast path for a non-removal. This accounted for between 0.008% and 0.2% of planning time for the queries I tested. Regards David Rowley