Re: [math] speeding up percentile based statistics

Luc Maisonobe Thu, 23 Sep 2010 11:58:19 -0700

Hi,

Resuming this thread, as I will be soon able to work on this (I hope).
I propose to start by solving
<https://issues.apache.org/jira/browse/MATH-417> using a simple
partition-based selection algorithm and preserving pivot information.
This would help both the single call use-case (we don't sort everything)
and the multiple call case (we don't redo on subsequent calls work
already done in the first calls.


Partition-based algorithms are O(n) in expected time but not in worst
case time. If end users ask for it later, we could improve again by
adding the Musser's introselect that guarantees O(n) behavior even in
worst case.

Another possibility would be to have an additional method that would
compute percentiles for several thresholds at once, but this would
require returning an array of results and I think it would be difficult
to mix with other statistics (mainly the DescriptiveStatistics.apply
method, which is used by the users I talked to).

Would this meet everyone concerns ?
Luc


Le 18/09/2010 21:50, Dimitri Pourbaix a écrit :
> Ted,
> 
>> O(n), not n
>>
>> Expected case is n + n/2 + n/4 ... < 2n
> 
> Yes, that I am already more incline to buy.
> 
> Dim.
> ----------------------------------------------------------------------------
> 
> Dimitri Pourbaix                         *
> Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
> CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
> Universite Libre de Bruxelles            *
> Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
>  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
> http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourb...@astro.ulb.ac.be
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [math] speeding up percentile based statistics

Reply via email to