On Mon, May 16, 2022 at 12:09:41AM +0200, Tomas Vondra wrote:
> I think it's an interesting idea. In principle it allows deducing the
> multi-column MCV for arbitrary combination of columns, not determined in
> advance. We'd have the MCV with HLL instead of frequencies for columns
> A, B and C:
>
> (a1, hll(a1))
> (a2, hll(a2))
> (...)
> (aK, hll(aK))
>
> (b1, hll(b1))
> (b2, hll(b2))
> (...)
> (bL, hll(bL))
>
> (c1, hll(c1))
> (c2, hll(c2))
> (...)
> (cM, hll(cM))
>
> and from this we'd be able to build MCV for any combination of those
> three columns.

Sorry, but I am lost here. I read about HLL here:

	https://towardsdatascience.com/hyperloglog-a-simple-but-powerful-algorithm-for-data-scientists-aed50fe47869

However, I don't see how they can be combined for multiple columns.
Above, I know A, B, and C are columns, but what are a1, a2, etc.?

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson