Thanks! We're investigating. We'll let you know if we have further questions.
jon

On Thu, Aug 13, 2020, 11:40 PM Marko Mušnjak <marko.musn...@gmail.com> wrote:

> Hi Jon,
> The first sketch is the one where I see the jump. The exact count without
> the first sketch is 24765.
>
> The result for lgK=12 without the first sketch is 11% off, lgK=5 is within
> 2%.
>
> Thanks,
> Marko
>
> On Fri, 14 Aug 2020 at 00:24, Jon Malkin <jon.mal...@gmail.com> wrote:
>
>> Hi Marko,
>>
>> Could you please let us know two more things:
>> 1) Which is the one particular sketch that causes the estimate to jump?
>> 2) What is the exact unique count of the others without that sketch?
>>
>> It sort of seems like the first sketch, but it's hard to know for sure
>> since we don't know the true leave-one-out exact counts.
>>
>> Thanks,
>> jon
>>
>> On Thu, Aug 13, 2020 at 8:41 AM Marko Mušnjak <marko.musn...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Could someone help me understand a behavior I see when trying to union
>>> some HLL sketches?
>>>
>>> I have 14 HLL sketches, and I know the exact unique counts for each of
>>> them. All the individual sketches give estimates within 2% of the exact
>>> counts.
>>>
>>> When I try to create a union, using the default lgMaxK parameter results
>>> in a total estimate that is way off (25% larger than the exact count).
>>>
>>> However, reducing the lgMaxK parameter in the union to 4 or 5 gives
>>> results that are within 2.5% of the exact counts.
>>>
>>> Also, one particular sketch seems to cause the final estimate to jump -
>>> not adding that sketch to the union keeps the result close to the exact
>>> count.
>>>
>>> Am I just seeing a very bad random error, or is there anything I'm doing
>>> wrong with the unions?
>>>
>>> Running on Java, using version 1.3.0. Just in case, the sketches are in
>>> the linked gist (hex encoded, one per line):
>>> https://gist.github.com/mmusnjak/c00a72b3dfbc52e780c2980acfd98351
>>> and the exact counts:
>>> https://gist.github.com/mmusnjak/dcbff67101be6cfc28ba01e63e41f73c
>>>
>>> Thank you!
>>> Marko Musnjak