Hi Alex,
You are right. I assumed that quantile sketches would support intersection,
as Theta sketches do, but they don't.
If they did, based on transaction ID, that would have been cool. So many
traditional data mining & predictive analytics algorithms could be
reimplemented with sketches.
Thanks for your insi
This seems to be a convoluted way of computing the sum of all values. This
is an additive metric, easy to compute exactly, no sketches needed.
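To illustrate the point about additivity, here is a minimal sketch in Python. The keys, values, and function names are hypothetical and are not taken from the code discussed in this thread. It assumes the data arrives as (key, value) pairs split across partitions.

```python
from collections import Counter

def partial_sums(rows):
    """Exact sum of values per key for one partition of (key, value) pairs."""
    totals = Counter()
    for key, value in rows:
        totals[key] += value
    return totals

def merge(a, b):
    """Merge two partial results; sums are additive, so this stays exact."""
    merged = Counter(a)
    merged.update(b)  # Counter.update adds to existing totals, it does not overwrite
    return merged

# Hypothetical partitions, for illustration only.
part1 = [("country=US", 10), ("country=IN", 5)]
part2 = [("country=US", 7), ("device=mobile", 3)]

total = merge(partial_sums(part1), partial_sums(part2))
```

Because the partial totals merge by plain addition, the merged result carries no approximation error, which is why no sketch is needed for this metric.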
On Sun, May 22, 2022 at 5:56 AM vijay rajan
wrote:
> Hi folks (And Lee),
>
> I think I have found what I was looking for in quantile sketches though I
>
Hi folks (And Lee),
I think I have found what I was looking for in quantile sketches though I
am not able to formulate error bounds for the same.
I should have raised a pull request, but I am going to write the code here.
The code below estimates the volume of the quantile sketch based on the
exampl
Thanks Will. Please find my reply in-line below.
But, to stay in line with my original question about a sketch for additive
metrics: the point is that I can use such a sketch for on-the-fly
aggregation by storing one sketch per "dimension=value" pair, without
having to go back to the table for aggregation.
Hi Will,
Thanks for your response. I will send my clarifications in a day or two.
Please do look at my detailed explanation, the datasets, and the results
that I have shared. You should then understand what I am trying to do.
Essentially, an event_Id is a uuid for an event. A click stream will hav
OK, this is interesting. I've got some concerns and questions that I've put
inline below.
Will
Will Lauer
Senior Principal Architect, Audience & Advertising Reporting
Data Platforms & Systems Engineering
M 508 561 6427
Champaign Office
1908 S. First St
Champaign, IL 61822
On Mon, May
Thanks Lee. Please find my answers inline below in blue. I think you will
find my use case very interesting. My next endeavor would be to make a
decision tree with entropy / gini impurity measures with sketches. I am
amazed at some of the results I have gotten. You may find this quite
interesting.
Vijay,
Sorry about the delay in getting back to you.
There is some critical information missing from your description and that
is the domain of what you are sketching.
I presume that it is User-IDs, otherwise it doesn't make sense.
If this is the case I think the solution can be achieved in a coup