On Mon, Aug 9, 2021 at 11:05 AM Bruce Momjian <br...@momjian.us> wrote:

>
> >         selectivity = (1 - null_frac1) * (1 - null_frac2) *
> >                       min(1/num_distinct1, 1/num_distinct2)
> >                     = (1 - 0) * (1 - 0) / max(10000, 10000)
> >                     = 0.0001
>
> Nice, can you provide a patch please?
>
>
Change the line:

selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1,
1/num_distinct2)

to be:

selectivity = (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1,
num_distinct2)

The wording already talks about "divide by max".
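
As a quick sanity check, here is a minimal Python sketch showing that the two forms give the same result for the numbers in the quoted example. The function names are made up for illustration and are not taken from the PostgreSQL source.

```python
import math


def selectivity_min_form(null_frac1, null_frac2, num_distinct1, num_distinct2):
    # Current wording: multiply by min(1/num_distinct1, 1/num_distinct2).
    return (1 - null_frac1) * (1 - null_frac2) * min(1 / num_distinct1,
                                                     1 / num_distinct2)


def selectivity_max_form(null_frac1, null_frac2, num_distinct1, num_distinct2):
    # Proposed wording: divide by max(num_distinct1, num_distinct2).
    return (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1,
                                                     num_distinct2)


# Values from the quoted example: no NULLs, 10000 distinct values per side.
old = selectivity_min_form(0.0, 0.0, 10000, 10000)
new = selectivity_max_form(0.0, 0.0, 10000, 10000)
print(old, new)

# The two forms also agree when the distinct counts differ.
assert math.isclose(selectivity_min_form(0.1, 0.2, 50, 400),
                    selectivity_max_form(0.1, 0.2, 50, 400))
```

For positive distinct counts, min(1/a, 1/b) is the same as 1/max(a, b), so this only changes how the formula reads, not what it computes.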

Though:

"so we use an algorithm that relies only on the number of distinct values
for both relations together with their null fractions:"

maybe add a parenthetical note:

"so we use an algorithm that relies only on the number of distinct values
(the row count estimate for the whole table, not the -1 in the column
statistics) for both relations together with their null fractions:"
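
To illustrate what the parenthetical is getting at, here is a rough Python sketch. It assumes the pg_stats convention that a negative n_distinct is the negated ratio of distinct values to rows (so -1 means every row is distinct). The helper name is hypothetical, and this is not how the planner code is actually written.

```python
def effective_num_distinct(n_distinct, row_estimate):
    # Hypothetical helper: turn a pg_stats-style n_distinct into a count.
    # Negative values are treated as a fraction of the table's row estimate.
    if n_distinct < 0:
        return -n_distinct * row_estimate
    # Non-negative values are already an absolute count of distinct values.
    return n_distinct


# A unique column (n_distinct = -1) in a table estimated at 10000 rows.
print(effective_num_distinct(-1, 10000))
```

With that conversion, a unique column in a 10000-row table feeds 10000 into the formula rather than -1.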

Just note I haven't tried to absorb that whole page, let alone the
implementation, and am not all that familiar with this part of PostgreSQL.
It seems right, though, in isolation.

David J.
