On Mon, May 16, 2022 at 12:09:41AM +0200, Tomas Vondra wrote:
> I think it's an interesting idea. In principle it allows deducing the
> multi-column MCV for arbitrary combination of columns, not determined in
> advance. We'd have the MCV with HLL instead of frequencies for columns
> A, B and C:
> 
> (a1, hll(a1))
> (a2, hll(a2))
> (...)
> (aK, hll(aK))
>
> (b1, hll(b1))
> (b2, hll(b2))
> (...)
> (bL, hll(bL))
> 
> (c1, hll(c1))
> (c2, hll(c2))
> (...)
> (cM, hll(cM))
> 
> and from this we'd be able to build MCV for any combination of those
> three columns.

Sorry, but I am lost here.  I read about HLL here:

        https://towardsdatascience.com/hyperloglog-a-simple-but-powerful-algorithm-for-data-scientists-aed50fe47869

However, I don't see how they can be combined for multiple columns.
Above, I know A, B, and C are columns, but what are a1, a2, etc.?

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson


