Thanks Tom.
Since you mentioned the planner not knowing about the correlation between the
columns, I’m curious, why doesn’t making a multivariate statistic make a
difference?
CREATE STATISTICS col_a_col_b_stats (dependencies) ON col_a, col_b FROM
test_table;
ANALYZE test_table;
And the result
Thanks so much for your help, Tom.
Sorry, I didn’t quite understand the answer — I have a few follow-up questions.
Sorry, I'm new to Postgres so I am a bit ignorant here and would appreciate
any tips on the query planner you could give.
1) Why is the query currently picking the poorly perform
On Thu, 3 Apr 2025 at 18:07, Tom Lane wrote:
> A simple-minded approach could be to just be pessimistic, and
> increase our estimate of how many rows would need to be scanned as a
> consequence of noticing that the columns have significant correlation.
> The shape of that penalty function would be
David Rowley writes:
> On Thu, 3 Apr 2025 at 16:24, Manikandan Swaminathan
> wrote:
>> why doesn’t making a multivariate statistic make a difference?
> Extended statistics won't help you here. "dependencies" just estimates
> functional dependencies between the columns mentioned in the ON
> claus
On Thu, 3 Apr 2025 at 16:24, Manikandan Swaminathan
wrote:
> Since you mentioned the planner not knowing about the correlation between the
> columns, I’m curious, why doesn’t making a multivariate statistic make a
> difference?
>
>
> CREATE STATISTICS col_a_col_b_stats (dependencies) ON col_a, c
Manikandan Swaminathan writes:
> 1) Why is the query currently picking the poorly performing index?
Because the planner thinks that one will be cheaper, as you can see by
comparing the cost estimates in EXPLAIN. It's wrong, but this is a
hard problem to estimate well. Especially when the behavi
Manikandan Swaminathan writes:
> 4. When running the following query, I would expect the index "idx_col_b_a"
> to be used: select min(col_b) from test_table where col_a > 4996.
> I have a range-based filter on col_a, and am aggregating the result with
> min(col_b). Both columns are covered by "id