Hi, I am new to DataSketches.

I know DataSketches provides an *approximate* calculation of statistics
with *mathematically proven error bounds*.

My question is:
Suppose I am constrained to take a sample of the original data set
before handing it to DataSketches (for example, I cannot take more than
10,000 random rows from a table).
What would be the consequence of this prior sampling for the
"mathematically proven error bounds" of the DataSketches statistics,
relative to the original data set?

Best,

Sergio
