Hi Gabor,
> My quick question: given that the order of the items provided to
> datasketches:hll_sketch is not deterministic, is it normal behaviour
> that I get a different estimate for the same dataset each time I run
> my query?
> I'm trying to figure out if this is due t
Hey,
I'm a contributor to Apache Impala (a fast, distributed SQL query engine
for big data) and recently started working on pulling in HLL sketching
from DataSketches. I managed to put a PoC together in which Impala runs a
count(distinct) estimate on a column of a table; in the background it
uses Da