On 2/10/25 16:56, Tomas Vondra wrote:
On 2/10/25 10:09, Andrei Lepikhov wrote:
On 8/2/2025 20:50, Tomas Vondra wrote:
<StatisticsInfo LastUpdate="2024-09-09T21:55:04.43"
ModificationCount="0" SamplingPercent="17.9892"
Statistics="[_WA_Sys_00000001_0A55DF1D]" Table="[_InfoRgX]"
Schema="[dbo]" Database="[DB]"></StatisticsInfo>

The main profit here - you see all the stats involved in estimations
(and their state), even if final plan doesn't contain estimated stuff at
all.

OK, that seems very underwhelming. I still think we should show which
clauses were estimated using which statistics object.
To understand how it may work, I employed the EXPLAIN extensibility introduced in PG 18 to show the use of plain statistics [1]. It looks like the following:

EXPLAIN (COSTS OFF, STAT ON)
SELECT * FROM sc_a WHERE x=1 AND y LIKE 'a';

 Seq Scan on sc_a
   Filter: ((y ~~ 'a'::text) AND (x = 1))
 Statistics:
   "sc_a.y: 1 times, stats: { MCV: 10 values, Correlation,
                ndistinct: 10.0000, nullfrac: 0.0000, width: 5 }
   "sc_a.x: 1 times, stats: { Histogram: 0 values, Correlation,
                 ndistinct: -1.0000, nullfrac: 0.0000, width: 4 }

As you can see, stat usage is summarised at the end of the EXPLAIN. It contains information about the column, how many times it was used and the parameters of statistic slots. Of course, being an extension it is constrained a lot, but even there is the profit:
1. You may see types of statistics exist on the column
2. Precision of the histogram or MCV (statistic_target) on a specific table - some users forget to increase it on large (or partitioned) tables 3. You have basic stat like nullfrac, ndistinct without the necessity to teach personnel how to gather it on a production instance safely.

Also, using it in real cases, I realised that it would be highly profitable to specify which statistic type was used to estimate this specific clause.

Of course, extended statistics have their own specifics, which may require another output format. Just consider this example too.

[1] https://github.com/danolivo/pg_index_stats

--
regards, Andrei Lepikhov


Reply via email to