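Sorry, if you presample your data, all bets are off in terms of accuracy.
The sketch's error bounds then hold only relative to the sample you fed it,
not relative to the original data set.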
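
To make that concrete, here is a rough sketch in plain Python. It does not use the DataSketches library at all: an exact distinct count stands in for the sketch, and the table size and key range are made-up numbers. The point is only that even a zero-error computation on a 10,000-row sample can be far from the answer on the full table, so no sketch bound can close that gap.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical table: 500,000 rows with keys drawn from 100,000 possible values.
rows = [random.randrange(100_000) for _ in range(500_000)]

# Presample 10,000 random rows, as in the question.
sample = random.sample(rows, 10_000)

# Exact distinct counts (a stand-in for a distinct-count sketch with zero error).
true_distinct = len(set(rows))
sample_distinct = len(set(sample))

print("distinct keys in full table:", true_distinct)
print("distinct keys in sample:    ", sample_distinct)
```

The sample can never report more than 10,000 distinct keys, whatever the full table contains. Any sketch error would come on top of that sampling error.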

On Wed, Nov 18, 2020 at 10:55 AM Sergio Castro <sergio...@gmail.com> wrote:

> Hi, I am new to DataSketches.
>
> I know DataSketches provides an *approximate* calculation of statistics
> with *mathematically proven error bounds*.
>
> My question is:
> Say that I am constrained to take a sample of the original data set
> before handing it to DataSketches (for example, I cannot take more than
> 10,000 random rows from a table).
> What would be the consequence of this prior sampling for the
> "mathematically proven error bounds" of the DataSketches statistics,
> relative to the original data set?
>
> Best,
>
> Sergio
>
