Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2025-02-19 Thread Alexander Korotkov
On Tue, Feb 18, 2025 at 2:52 PM Andrei Lepikhov wrote:
> On 17/2/2025 02:06, Alexander Korotkov wrote:
> > On Thu, Nov 28, 2024 at 4:39 AM Andrei Lepikhov wrote:
> >> Here we also could count number of scanned NULLs separately in
> >> vardata_extra and use it in upper GROUP-BY estimation.

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2025-02-19 Thread Alexander Korotkov
Hi, Vlada.
On Tue, Feb 18, 2025 at 6:56 PM Vlada Pogozhelskaya wrote:
> Following the discussion on improving statistics estimation by considering
> GROUP BY as a unique constraint, I’ve prepared a patch that integrates GROUP
> BY into cardinality estimation in a similar way to DISTINCT.
>
> Th

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2025-02-18 Thread Vlada Pogozhelskaya
Hi all, Following the discussion on improving statistics estimation by considering GROUP BY as a unique constraint, I’ve prepared a patch that integrates GROUP BY into cardinality estimation in a similar way to DISTINCT. This patch ensures that when a subquery contains a GROUP BY clause, the

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2025-02-18 Thread Andrei Lepikhov
On 17/2/2025 02:06, Alexander Korotkov wrote:
> On Thu, Nov 28, 2024 at 4:39 AM Andrei Lepikhov wrote:
>> Here we also could count number of scanned NULLs separately in
>> vardata_extra and use it in upper GROUP-BY estimation.
> What could be the type of vardata_extra? And what information could it st

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2025-02-16 Thread Alexander Korotkov
On Thu, Nov 28, 2024 at 4:39 AM Andrei Lepikhov wrote:
> Thanks for taking a look!
>
> On 11/25/24 23:45, Heikki Linnakangas wrote:
> > On 24/09/2024 08:08, Andrei Lepikhov wrote:
> >> + * proves the var is unique for this query. However, we'd better still
> >> + * believe the null-frac

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2024-11-27 Thread Andrei Lepikhov
Thanks for taking a look!
On 11/25/24 23:45, Heikki Linnakangas wrote:
> On 24/09/2024 08:08, Andrei Lepikhov wrote:
>> + * proves the var is unique for this query.  However, we'd better still
>> + * believe the null-fraction statistic.
>>   */
>>  if (vardata->isunique)
>>      stadistinct =
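As a rough illustration of the comment quoted in this message, the C sketch below shows one way an estimator might treat a column that is proven unique (for example, because it is the grouping column of a subquery) while still honouring the null fraction. It relies on the documented pg_statistic convention that a negative stadistinct is a negated fraction of the row count. The struct `VarStats`, the function `estimate_ndistinct`, and the fallback constant are assumptions made for illustration only; this is neither the patch under discussion nor PostgreSQL's actual code.

```c
#include <assert.h>

/* Hypothetical stand-in for per-variable statistics (not a PostgreSQL type). */
typedef struct VarStats
{
    int    isunique;    /* nonzero if the variable is proven unique */
    double stadistinct; /* >0: absolute count; <0: negated fraction of rows; 0: unknown */
    double stanullfrac; /* fraction of rows that are NULL, in [0, 1] */
} VarStats;

/* Assumed fallback used when nothing is known about the column. */
#define ASSUMED_DEFAULT_NDISTINCT 200.0

/* Estimate the number of distinct non-NULL values among ntuples rows. */
double
estimate_ndistinct(const VarStats *v, double ntuples)
{
    double result;

    assert(v->stanullfrac >= 0.0 && v->stanullfrac <= 1.0);

    if (v->isunique)
        /* Every non-NULL row carries its own value; ignore stadistinct. */
        result = (1.0 - v->stanullfrac) * ntuples;
    else if (v->stadistinct > 0.0)
        /* Positive value: an absolute number of distinct values. */
        result = v->stadistinct;
    else if (v->stadistinct < 0.0)
        /* Negative value: a negated multiplier of the row count. */
        result = -v->stadistinct * ntuples;
    else
        result = ASSUMED_DEFAULT_NDISTINCT;

    /* Keep the estimate within [1, ntuples]. */
    if (result > ntuples)
        result = ntuples;
    if (result < 1.0)
        result = 1.0;
    return result;
}
```

With 1000 input rows and a null fraction of 0.25, a column flagged as unique yields an estimate of 750 distinct values in this sketch, regardless of what the stored stadistinct says.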

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2024-11-25 Thread Heikki Linnakangas
On 24/09/2024 08:08, Andrei Lepikhov wrote: On 19/9/2024 09:55, Andrei Lepikhov wrote: This wrong prediction makes things much worse if the query has more upper query blocks. His question was: Why not consider the grouping column unique in the upper query block? It could improve estimations. Af

Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

2024-09-23 Thread Andrei Lepikhov
On 19/9/2024 09:55, Andrei Lepikhov wrote: This wrong prediction makes things much worse if the query has more upper query blocks. His question was: Why not consider the grouping column unique in the upper query block? It could improve estimations. After a thorough investigation, I discovered th