Hi all, During a recent face to face discussion with some commons-math users from a large project, it appeared one implementation choice for Percentile statistics was a huge performance bottleneck.
When the Percentile.evaluate method is called, the sample array is copied and sorted. In one use case, 100 calls were made for each sample, so the same array was sorted 100 times. In another use case, only one call is made (typically to compute the median) but the user wondered why the array should be completely sorted. In fact, using a selection algorithm instead of a sorting algorithm would be sufficient. I would like to have you opinion about providing a different evaluate method that would process an array provided previously and use only selections to provide the percentile. Consider for example the following input array: [ -6, 8, 4, -5, 0, 2, 7 ] If we first as for the median, a selection algorithm may reorganize the array as: [ -6, 0, -5, 2, 8, 4, 7 ] were the left part is smaller than the central element, the right part is larger and hence the central element 2 is known to be the median. Then we ask for another value, the 25% percentile. Since the object already knows the smaller elements are in the left part, it can use a select algorithm on the 3 elements left part and extract the value -5 without even trying to sort the upper part of the array. What do you think ? Luc --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org