Re: Show a human-readable n_distinct in pg_stats view

Laurenz Albe Fri, 15 Mar 2019 03:12:10 -0700

Maxence Ahlouche wrote:
> It seems to me that since the pg_stats view is supposed to be
> human-readable, it would make sense to show a human-readable version
> of n_distinct.
> Currently, when the stats collector estimates that the number of
> distinct values is more than 10% of the total row count, what is
> stored in pg_statistic.stadistinct is -1 * n_distinct / totalrows, the
> rationale being that if new rows are inserted in the table, they are
> likely to introduce new values, and storing that value allows the
> stadistinct not to get stale too fast.
> 
> You can find attached a simple WIP patch to show the proper n_distinct
> value in pg_stats.
> 
> * Is this desired?
> * Would it make sense to add a column in the pg_stats view to display
> the information "lost", that is the fact that postgres will assume
> that inserting new rows means a higher n_distinct?
> * Am I right to assume that totalrows in the code
> (src/backend/commands/analyze.c:2170) actually corresponds to
> n_live_tup? That's what I gathered from glancing at the code, but I
> might be wrong.
> * Should the catalog version be changed for this kind of change?
> * Should I add this patch to the commitfest?
> 
> If this patch is actually desired, I'll update the documentation as well.
> I'm guessing this patch would break scripts relying on the pg_stats
> view, but I do not know how much we want to avoid that, since they
> should rely on the base tables rather than on the views.


This may make things easier for those who are confused by a negative
entry, but it will obfuscate matters for those who are not.

I don't think that is a win, particularly since the semantics are
explained in great detail in the documentation of "pg_stats".

So I am -1 on that one.

Yours,
Laurenz Albe

Re: Show a human-readable n_distinct in pg_stats view

Reply via email to