On Tue, Mar 13, 2018 at 12:47 AM, Gilles <gil...@harfang.homelinux.org> wrote:
>> Where can we find the old code before port into new Commons components?
>
> The code bases are managed by the "git" software; the whole history is
> available:
> https://git1-us-west.apache.org/repos/asf?p=commons-math.git;a=log
>
> [I'd advise to "clone" the repositories on your local computer, and
> use the command line tools.]

I believe you will want to clone the commons-math repositories, but then
develop your own "fork" of the commons-statistics repository. Gilles can
correct me if that is wrong.

>> As you mentioned it will be a good approach to redesign process.
>
> You don't necessarily need to analyze how the code was before
> the port/refactoring; looking at how it is now is sufficient,
> unless you suspect that something is wrong now and might have
> been better before. ;-)

In particular, the statistics library was designed before Java 8.
Java 8, however, has provided both efficient programming strategies for
these statistical methods (in the form of lambdas and streams) and
some built-in methods providing summary statistics functions (see
discussion at http://markmail.org/message/7t2mjaprsuvb3waj).

It probably makes sense, as a design strategy, to separate the function
implementation from the streaming implementation. For example, a 2D
integer array will probably require a different streaming implementation
than a 1D double array, but they can probably both be passed the same
function handle to collect, say, the mean or max value. The role of
Commons might then be to provide a convenient interface, so that the
user can simply call a static method like SummaryStats.mean() and not
have to worry about the implementation.

The other difficulty I see is that quantile and median statistics will
not be as easy to stream as statistics with a closed-form solution like
mean or variance. There may, however, be great algorithms out there for
pulling the median or the 95% quantile out of a stream -- if so they
should be used.

Eric
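
A minimal sketch of the separation suggested above, assuming plain Java 8
streams. Only the name SummaryStats.mean() comes from the message; the
stream(...) helpers and the MEAN/MAX constants are hypothetical names for
illustration, not an existing Commons API. Returning NaN for empty input
is also an assumption.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

// Hypothetical facade: not part of commons-math or commons-statistics.
final class SummaryStats {
    private SummaryStats() {}

    // Function side: each statistic is defined once, on a DoubleStream.
    // An empty stream yields NaN (an assumption made for this sketch).
    static final ToDoubleFunction<DoubleStream> MEAN =
        s -> s.average().orElse(Double.NaN);
    static final ToDoubleFunction<DoubleStream> MAX =
        s -> s.max().orElse(Double.NaN);

    // Streaming side: one adapter per container type.
    static DoubleStream stream(double[] values) {
        return Arrays.stream(values);
    }

    static DoubleStream stream(int[][] values) {
        // Flatten the rows, then widen each int to double.
        return Arrays.stream(values)
                     .flatMapToInt(Arrays::stream)
                     .asDoubleStream();
    }

    // Convenience layer: the caller never sees the stream.
    static double mean(double[] values) { return MEAN.applyAsDouble(stream(values)); }
    static double mean(int[][] values)  { return MEAN.applyAsDouble(stream(values)); }
    static double max(double[] values)  { return MAX.applyAsDouble(stream(values)); }
    static double max(int[][] values)   { return MAX.applyAsDouble(stream(values)); }
}
```

With this layout, supporting a new container type means adding one
stream(...) adapter, and adding a new closed-form statistic means adding
one function. Median and quantiles would not fit this pattern as directly,
for the reason given above.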