Hi all,

The new, faster implementation of Percentile has been checked in. It should resolve issue <https://issues.apache.org/jira/browse/MATH-417>.

As discussed here, the implementation is based on a selection algorithm instead of a complete sort. A setData method has also been added to the AbstractUnivariateStatistic base class so that some work can be cached between calls when several percentiles are requested. Both the partitioned data array and the first levels of pivots are cached. For now, I have set the limit to 10 levels of pivots, but this can be changed (the memory cost is an integer array of 2^n - 1 entries for n levels, i.e. 1023 entries for 10 levels).

From the few performance tests I have done, the improvements are really huge and depend on both the array size and the number of percentiles requested. Compared with a single cached sort followed by direct array accesses, the selection-based implementation is still about twice as fast for up to a few hundred percentiles on arrays of a few million elements. For a single percentile, I saw computations up to a few hundred times faster.

Leanne, could you check whether this suits your needs?

Luc
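
For readers who want a feel for the approach, the idea of selecting with cached pivots can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed code: the class name SelectionSketch, its select method, the median-of-three pivot choice and the heap-style layout of the pivot array are all hypothetical. It returns the k-th smallest element only (no percentile interpolation) and assumes the data contain no NaN.

```java
import java.util.Arrays;

/** Hypothetical sketch: k-th smallest element by selection, with cached pivots. */
final class SelectionSketch {

    private final double[] work;     // private copy, progressively partitioned
    private final int[] pivotsHeap;  // cached pivot positions, -1 = not computed yet

    SelectionSketch(double[] data, int levels) {
        if (levels < 0 || levels > 30) {
            throw new IllegalArgumentException("levels must be in [0, 30]");
        }
        work = data.clone();
        pivotsHeap = new int[(1 << levels) - 1];  // 2^n - 1 slots for n levels
        Arrays.fill(pivotsHeap, -1);
    }

    /** Returns the k-th smallest value (0-based) without fully sorting. */
    double select(int k) {
        if (k < 0 || k >= work.length) {
            throw new IllegalArgumentException("k out of range: " + k);
        }
        int begin = 0;
        int end = work.length;  // exclusive bound
        int node = 0;           // index in the implicit binary tree of pivots
        while (end - begin > 1) {
            boolean cacheable = node < pivotsHeap.length;
            int pivot;
            if (cacheable && pivotsHeap[node] >= 0) {
                pivot = pivotsHeap[node];       // reuse work from an earlier call
            } else {
                pivot = partition(begin, end);  // O(end - begin) partitioning pass
                if (cacheable) {
                    pivotsHeap[node] = pivot;
                }
            }
            if (k == pivot) {
                return work[k];
            }
            if (k < pivot) {
                end = pivot;
                if (cacheable) { node = 2 * node + 1; }  // left child
            } else {
                begin = pivot + 1;
                if (cacheable) { node = 2 * node + 2; }  // right child
            }
        }
        return work[k];  // a one-element range is already in its sorted place
    }

    /** Lomuto partition of [begin, end) around a median-of-three pivot. */
    private int partition(int begin, int end) {
        int last = end - 1;
        int p = medianOfThree(begin, begin + (end - begin) / 2, last);
        double value = work[p];
        swap(p, last);
        int store = begin;
        for (int i = begin; i < last; i++) {
            if (work[i] < value) {
                swap(i, store++);
            }
        }
        swap(store, last);
        return store;  // final position of the pivot value
    }

    /** Returns whichever of the indices a, b, c holds the median value. */
    private int medianOfThree(int a, int b, int c) {
        double x = work[a], y = work[b], z = work[c];
        if (x < y) {
            return y < z ? b : (x < z ? c : a);
        }
        return x < z ? a : (y < z ? c : b);
    }

    private void swap(int i, int j) {
        double t = work[i];
        work[i] = work[j];
        work[j] = t;
    }
}
```

In this sketch, a first call such as new SelectionSketch(data, 10).select(k) pays for the partitioning passes, and later calls on the same instance skip every pass whose pivot is already cached. That is the kind of saving one would expect when many percentiles are requested from the same data.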