Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-18 Thread Tom Lane
David Geier writes: > On 11/18/22 14:00, Tomas Vondra wrote: >> Seems fine. I wonder if we could/could introduce a new constant for 0, >> similar to ATTSTATSSLOT_NUMBERS/ATTSTATSSLOT_VALUES, instead of using a >> magic constant. Say, ATTSTATSSLOT_NONE or ATTSTATSSLOT_CHECK. > Good idea. I called

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-18 Thread David Geier
On 11/18/22 14:00, Tomas Vondra wrote: Seems fine. I wonder if we could/could introduce a new constant for 0, similar to ATTSTATSSLOT_NUMBERS/ATTSTATSSLOT_VALUES, instead of using a magic constant. Say, ATTSTATSSLOT_NONE or ATTSTATSSLOT_CHECK. Good idea. I called it ATTSTATSSLOT_EXISTS. New patc

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-18 Thread Tom Lane
Tomas Vondra writes: > On 11/18/22 09:54, David Geier wrote: >> I couldn't come up with any reasonable way of writing an automated test >> for that. > I don't think you can write a test for this, because there is no change > to behavior that can be observed by the user. Yeah, and the delta in pe

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-18 Thread Tomas Vondra
On 11/18/22 09:54, David Geier wrote: > Thanks everyone for the great feedback and suggestions. > >> >>> Yes, it is.  Using zero flag would short-cut get_attstatsslot() to just >>> return whether the slot type exists without loading it.  Do you think we >>> need to emphasize this use case in the c

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-18 Thread David Geier
Thanks everyone for the great feedback and suggestions. Yes, it is. Using zero flag would short-cut get_attstatsslot() to just return whether the slot type exists without loading it. Do you think we need to emphasize this use case in the comments for 'flags'? Perhaps, it's not really obviou

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-17 Thread Tom Lane
Richard Guo writes: > Yes, it is. Using zero flag would short-cut get_attstatsslot() to just > return whether the slot type exists without loading it. Do you think we > need to emphasize this use case in the comments for 'flags'? Perhaps, it's not really obvious now. > I wonder whether we need

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-17 Thread Richard Guo
On Fri, Nov 18, 2022 at 9:36 AM Tom Lane wrote: > Actually, looking at get_attstatslot, I realize it was already designed > to do that -- just pass zero for flags. So we could do it as attached. Yes, it is. Using zero flag would short-cut get_attstatsslot() to just return whether the slot typ

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-17 Thread Tom Lane
I wrote: > Tomas Vondra writes: >> Or perhaps what if we have a function that quickly determines if the >> attribute has MCV, without loading it? I'd bet the expensive part of >> get_attstatslot() is the deconstruct_array(). >> We could have a function that only does the first small loop over slot

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-17 Thread Tom Lane
Tomas Vondra writes: > Or perhaps what if we have a function that quickly determines if the > attribute has MCV, without loading it? I'd bet the expensive part of > get_attstatslot() is the deconstruct_array(). > We could have a function that only does the first small loop over slots, > and return

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-17 Thread Tomas Vondra
On 11/14/22 10:19, David Geier wrote: > Hi Tom, >> There won't *be* any MCV stats for a column that ANALYZE perceives to >> be unique, so I'm not quite sure where the claimed savings comes from. > > We save if one join attribute is unique while the other isn't. In that > case stored MCV stats ar

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-14 Thread David Geier
Hi Tom, There won't *be* any MCV stats for a column that ANALYZE perceives to be unique, so I'm not quite sure where the claimed savings comes from. We save if one join attribute is unique while the other isn't. In that case stored MCV stats are read for the non-unique attribute but then never

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-11 Thread Tom Lane
David Geier writes: > eqjoinsel() can be optimized by not reading MCV stats if at least one of > the two join attributes is unique. There won't *be* any MCV stats for a column that ANALYZE perceives to be unique, so I'm not quite sure where the claimed savings comes from. > With this change we

Optimize join selectivity estimation by not reading MCV stats for unique join attributes

2022-11-11 Thread David Geier
Hi hackers, eqjoinsel() can be optimized by not reading MCV stats if at least one of the two join attributes is unique. As primary keys are implicitly unique this situation can occur frequently. For unique columns no MCV stats are stored and eqjoinsel_inner() and eqjoinsel_semi(), called from