On Wed, Feb 13, 2013 at 5:55 PM, Alexander Korotkov <aekorot...@gmail.com>wrote:
> On Wed, Feb 13, 2013 at 5:28 PM, Heikki Linnakangas < > hlinnakan...@vmware.com> wrote: > >> On 04.01.2013 10:42, Alexander Korotkov wrote: >> >>> /* >>> * Calculate selectivity of "&&" operator using histograms of range >>> lower bounds >>> * and histogram of range lengths. >>> */ >>> static double >>> calc_hist_selectivity_overlap(**TypeCacheEntry *typcache, RangeBound >>> *lower, >>> RangeBound *upper, RangeBound >>> *hist_lower, int hist_nvalues, >>> Datum >>> *length_hist_values, int length_hist_nvalues) >>> >> >> We already have code to estimate &&, based on the lower and upper bound >> histograms: >> >> case OID_RANGE_OVERLAP_OP: >>> case OID_RANGE_CONTAINS_ELEM_OP: >>> /* >>> * A && B <=> NOT (A << B OR A >> B). >>> * >>> * "range @> elem" is equivalent to "range && >>> [elem,elem]". The >>> * caller already constructed the singular range >>> from the element >>> * constant, so just treat it the same as &&. >>> */ >>> hist_selec = >>> calc_hist_selectivity_scalar(**typcache, >>> &const_lower, hist_upper, >>> >>> nhist, false); >>> hist_selec += >>> (1.0 - >>> calc_hist_selectivity_scalar(**typcache, >>> &const_upper, hist_lower, >>> >>> nhist, true)); >>> hist_selec = 1.0 - hist_selec; >>> break; >>> >> >> I don't think the method based on lower bound and length histograms is >> any better. In fact, my gut feeling is that it's less accurate. I'd suggest >> dropping that part of the patch. >> > > Right. This estimation has an accuracy of histogram, while estimation > based on lower bound and length histograms rely on additional assumption > about independence of lower bound and length histogram. We can sum A << B > and A >> B probabilities because they are mutually exclusive. It's pretty > evident but I would like to mention it in the comments, because typical > assumption about events in statistics calculation is their independence. > These changes were made in attached patch. ------ With best regards, Alexander Korotkov.
range_stat-0.11.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers