Re: [E] Re: How to get 'sum' for update_theta_sketch on DataSketches Python API

2023-01-17 Thread Jon Malkin
Wow, I just realized I misstated things rather significantly. We very much have tuple sketches in C++, but the Python wrapper for them is a work in progress. I thought I had it ready, but it turns out there are some pretty significant limitations with the wrapper we're using (pybind11) that I now

Re: [E] Re: How to get 'sum' for update_theta_sketch on DataSketches Python API

2023-01-17 Thread Alexander Saydakov via users
Yes, Druid does this on top of the specialized Tuple sketch called ArrayOfDoublesSketch (in Java). Each key in the sketch has an array of floating-point values associated with it. PostAggregator functions can convert these columns into means and variances using org.apache.commons.math3.stat.descrip

Re: How to get 'sum' for update_theta_sketch on DataSketches Python API

2023-01-16 Thread Tomer B
Thanks, yes! (Tuple sketch and not theta, as you said.) I have another question, please. I looked at the tuple sketch documentation at https://datasketches.apache.org/api/java/snapshot/apidocs/org/apache/datasketches/tuple/aninteger/IntegerSummary.Mode.html and I see the possible values of Mode are: Sum, Mi

Re: How to get 'sum' for update_theta_sketch on DataSketches Python API

2022-12-31 Thread Jon Malkin
I believe you're looking at the tuple sketch code in Java, not the theta sketch. We don't yet have tuple support in C++ (on which Python is based). It's planned, but I haven't yet had time to sit down and figure out how to do it -- and specifically how to do so with a reasonable API. jon

How to get 'sum' for update_theta_sketch on DataSketches Python API

2022-12-30 Thread Tomer B
In the Python API I can create an 'update_theta_sketch' and then call get_estimate to get the unique count. But how can I get the sum in Python for update_theta_sketch? I see it's available in the non-Python datasketches here: public static enum Mode --> /** * The aggregation mode is the summation function.
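
For readers following the thread, here is a minimal pure-Python sketch of the idea behind a tuple sketch in "Sum" mode: a theta-style sample of hashed keys where each retained key also carries a running sum. This is a toy under stated assumptions, not the DataSketches implementation or its API. The class name `ToyTupleSketch`, the method `get_sum_estimate`, the SHA-256 based hash, and the parameter `k` are all hypothetical names chosen for illustration.

```python
import hashlib


class ToyTupleSketch:
    """Toy theta-style sketch with a summed value per retained key.

    Illustration only; NOT the DataSketches library implementation.
    """

    def __init__(self, k=1024):
        self.k = k          # nominal number of retained entries (assumed name)
        self.theta = 1.0    # sampling threshold; 1.0 means nothing dropped yet
        self.entries = {}   # hashed key in [0, 1) -> running sum for that key

    @staticmethod
    def _hash(key):
        # Map the key to a pseudo-uniform value in [0, 1).
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, key, value):
        h = self._hash(key)
        if h >= self.theta:
            return  # key falls outside the sampled region; ignore it
        # "Sum" mode: add the value to this key's summary.
        self.entries[h] = self.entries.get(h, 0.0) + value
        if len(self.entries) > self.k:
            # Evict the largest hash and lower theta to it (O(k), fine for a toy).
            largest = max(self.entries)
            del self.entries[largest]
            self.theta = largest

    def get_estimate(self):
        # Estimated number of distinct keys.
        return len(self.entries) / self.theta

    def get_sum_estimate(self):
        # Estimated total of all values, scaled up by the sampling rate.
        return sum(self.entries.values()) / self.theta


sk = ToyTupleSketch(k=1024)
for i in range(100):
    sk.update(f"user{i}", 2.0)
    sk.update(f"user{i}", 1.0)

print(sk.get_estimate())      # 100.0 (fewer than k keys, so the count is exact)
print(sk.get_sum_estimate())  # 300.0
```

While fewer than `k` distinct keys have been seen, both results are exact. Beyond that, both become estimates scaled by `1 / theta`, which is why a plain theta sketch (keys only) cannot answer the sum question by itself.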