I wrote:
> ... So my preference is to align the two
> definitions of STATISTIC_KIND_MCELEM by adding a null-element frequency
> to tsvector's usage (where it'll always be zero) and getting rid of the
> average distinct element count here.

Actually, there's a way we can do this without code changes in the
tsvector stuff.  Since the number of MCELEM stanumber items that provide
frequencies of stavalue items is obviously equal to the length of
stavalues, we could define stanumbers as containing those matching
entries, then two min/max entries, then an *optional* entry for the
frequency of null elements (with the frequency presumed to be zero if
omitted).  This would be unambiguous given access to stavalues.  I'm not
sure, though, whether making the null frequency optional would introduce
complexity elsewhere that outweighs the benefit of not having to touch
the tsvector code.
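
To make the proposed layout concrete, here is a rough sketch of how a
reader of the stats could split stanumbers using the length of
stavalues.  It is Python for illustration only, not backend code; the
function and type names are hypothetical, and the min-before-max
ordering of the two extra entries is an assumption.

```python
from typing import List, NamedTuple, Sequence


class McelemNumbers(NamedTuple):
    """Decoded view of an MCELEM stanumbers array (hypothetical type)."""
    frequencies: List[float]   # one frequency per stavalues entry
    min_freq: float            # assumed to come first of the two extras
    max_freq: float
    null_elem_freq: float      # presumed 0.0 when the entry is omitted


def decode_mcelem_stanumbers(stavalues: Sequence[object],
                             stanumbers: Sequence[float]) -> McelemNumbers:
    """Split stanumbers into per-element frequencies, min/max, and the
    optional null-element frequency, using len(stavalues) to disambiguate."""
    n = len(stavalues)
    extra = len(stanumbers) - n
    # Exactly two extras (min/max) or three (min/max/null frequency) are valid.
    if extra not in (2, 3):
        raise ValueError(
            f"expected {n + 2} or {n + 3} stanumbers, got {len(stanumbers)}")
    null_freq = stanumbers[n + 2] if extra == 3 else 0.0
    return McelemNumbers(list(stanumbers[:n]),
                         stanumbers[n], stanumbers[n + 1], null_freq)
```

With two stavalues entries, a four-item stanumbers array (the existing
tsvector shape) decodes with a null-element frequency of zero, while a
five-item array supplies that frequency explicitly.  Any other length
is rejected, which is where the extra checking would have to live.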

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)