Hi.

On Fri, 16 Mar 2018 23:12:38 +0530, Gimhana Nadeeshan wrote:
Hi devs,

Sorry for the delayed reply due to my academics.


If you want to start playing with the code, we could just begin
by having discussions here (on design) and on JIRA (for processing
minor issues) based on the current state of your repository.
[What's the link to look it up?]


Should I create my own repo and start coding there? [Not in the forked repo]

What's the difference?  IOW, someone else should answer. :-}

Actually, it would help me more if someone [@Gilles or @Eric] could
guide me further, for example by giving me some minor issues in the current
implementation to solve, or a new feature to implement, so that we can
gradually go deeper

IMO, the top priority would be to release "Commons Numbers":
  http://commons.apache.org/proper/commons-numbers/

There are some blocking issues on JIRA:
  https://issues.apache.org/jira/projects/NUMBERS

and eventually I can go further on my own.  That way I
can gradually become familiar with the code, and I think it is the most
efficient way to learn the design architecture. [I spent hours trying to
understand the current code base and felt that it was not as efficient as I
had thought.]

Refactoring the package "stat" is not straightforward...
However, to get to that, it would be useful to record your thoughts
as you browse through the code(s): what seems easy to port, what should
be changed/fixed, what you don't understand, and so on.


And is there a proposal format for the ASF?

I don't think so.  This ML is the forum where project directions
are discussed.

If not, what should I
basically mention in the proposal?

This can be a work in progress, I think (see above suggestions).

Best regards,
Gilles


Best Regards,




On 14 March 2018 at 19:07, Gilles <gil...@harfang.homelinux.org> wrote:

Hi.

On Tue, 13 Mar 2018 23:37:24 +0530, Gimhana Nadeeshan wrote:

Hello Devs,

Thanks Gilles and Eric for guidance.

I have cloned the Commons repos and forked the Commons Stat repo. Is it
possible to make pull requests to that repo to be reviewed?


That's certainly possible, but I'm afraid that it will become
quite unwieldy from my side if I have to delete/create branches
for every PR.

If you want to start playing with the code, we could just begin
by having discussions here (on design) and on JIRA (for processing
minor issues) based on the current state of your repository.
[What's the link to look it up?]

Or should I
follow a specific method?


I'll inquire about a more efficient method (than the above)...

By referring to the API docs I got some idea of how the modules are separated.

In the current Commons stat repo there are some classes under the
package distribution. I think those can be refactored using the statistics
functionality built into Java 8. Please correct me if I am wrong.


An example perhaps?

As Eric said, separating the function implementation from the streaming
implementation is a good design idea. (From my point of view, it means method
overloading. Again, correct me if I have misunderstood your point.)


?

And I will share my draft proposal here for your review soon.


OK.

Thanks again for your interest,
Gilles



Best Regards.

On 13 March 2018 at 20:50, Gilles <gil...@harfang.homelinux.org> wrote:

Hello.

On Tue, 13 Mar 2018 09:25:19 +0100, Eric Barnhill wrote:

On Tue, Mar 13, 2018 at 12:47 AM, Gilles <gil...@harfang.homelinux.org>
wrote:



Where can we find the old code, as it was before being ported into the new
Commons components?


The code bases are managed by the "git" software; the whole history is
available:
https://git1-us-west.apache.org/repos/asf?p=commons-math.git;a=log

[I'd advise you to "clone" the repositories on your local computer, and
use the command line tools.]



I believe you will want to clone the commons-math repositories, but then
develop your own "fork" of the commons-statistics repository. Gilles can
correct me if that is wrong.


Actually, I know only my workflow:
 $ git clone ...
 $ git branch ...
 $ git commit ...
 $ git push

:-}

I didn't find it very easy to cooperate with developers who
fork on GitHub and submit PRs.
I've now found the "git" command that creates a branch from
a PR, but it would be so much more comfortable to just switch
directory and do "git pull".

In the context of GSoC, would it be possible to grant some
privilege to non-committers so that they can update a selected
"git" repository?
If not, what is the next easiest way to share a "common space"
(aka "sandbox") from which it would be easy to copy reviewed
bits over to the official source repository?


As you mentioned, it will be a good approach to the redesign process.


You don't necessarily need to analyze how the code was before
the port/refactoring; looking at how it is now is sufficient,
unless you suspect that something is wrong now and might have
been better before. ;-)


In particular, the statistics library was designed before Java 8. Java 8,
however, has provided both efficient programming strategies for these
statistical methods (in the form of lambdas and streams) as well as some
built-in methods providing summary statistics functions (see discussion at
http://markmail.org/message/7t2mjaprsuvb3waj).
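
As a minimal sketch of the built-in summary statistics referred to above:
the JDK's DoubleSummaryStatistics collects count, sum, min, max and average
in a single pass over a DoubleStream. The class name "BuiltInSummary" and
the method "summarize" are illustrative assumptions, not part of any Commons
API.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

// Hypothetical helper class, for illustration only.
final class BuiltInSummary {
    private BuiltInSummary() {}

    // One pass over the data; the JDK accumulates count, sum, min and max.
    static DoubleSummaryStatistics summarize(double[] values) {
        return DoubleStream.of(values).summaryStatistics();
    }
}
```

Calling summarize(data).getAverage() or summarize(data).getMax() then needs
no hand-written loop. Note that the JDK class does not provide variance or
quantiles.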


Very good point, indeed.
IMO, the new component should target Java 8.
Even Java 9 (enforcing modularity with JPMS): if by the time we think of
releasing the code, we still want to avoid "multi-release" JARs, it will be
easy to just remove the "module-info" files (I don't think much
else Java 9 specific would be used by "Commons Statistics").

In fact, given the very slow pace at which new components are being
brought to releasable state, I'd like to ask whether it would be OK
to make "incremental" releases?  That would mean: focus on (maven)
modules that seem close to feature-complete and bug-free, fix the
remaining issues and perform a release with that module added.

It seems that the expectations were set too high (content-wise given
the amount of human resources), so that neither CM can be released
(too many non-fixed issues) nor its "Commons Numbers" spin-off that
contains many modules, some of which are blocked by lack of consensus
or dangling discussions.

It probably makes sense, as a design strategy, to separate the function
implementation from the streaming implementation. For example, a 2D integer
array will probably require a different streaming implementation than a 1D
double array, but they can probably both be passed the same function
handle to collect, say, the mean or max value.

The role of commons might then be to provide a convenient interface, so
that the user can simply call a static method like SummaryStats.mean() and
not have to worry about the implementation.
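
The separation described above could be sketched roughly as follows. This
is only an illustration under stated assumptions: "SummaryStats" takes its
name from the static method mentioned above, but its contents (the MEAN and
MAX function handles and the "apply" overloads) are hypothetical and do not
correspond to any existing Commons class.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

// Hypothetical facade; names and layout are assumptions for illustration.
final class SummaryStats {
    private SummaryStats() {}

    // Function implementations: each statistic is written once,
    // against a DoubleStream, independent of the input layout.
    static final ToDoubleFunction<DoubleStream> MEAN =
        s -> s.average().orElse(Double.NaN);
    static final ToDoubleFunction<DoubleStream> MAX =
        s -> s.max().orElse(Double.NaN);

    // Streaming implementation for a 1D double array.
    static double apply(ToDoubleFunction<DoubleStream> stat, double[] data) {
        return stat.applyAsDouble(Arrays.stream(data));
    }

    // Streaming implementation for a 2D int array:
    // flatten the rows, then widen each int to double.
    static double apply(ToDoubleFunction<DoubleStream> stat, int[][] data) {
        return stat.applyAsDouble(
            Arrays.stream(data).flatMapToInt(Arrays::stream).asDoubleStream());
    }

    // Convenience entry points that hide both parts from the caller.
    static double mean(double[] data) { return apply(MEAN, data); }
    static double mean(int[][] data) { return apply(MEAN, data); }
}
```

With this layout, supporting a new input shape means adding one "apply"
overload, and adding a new statistic means adding one function handle;
neither change touches the other side.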

The other difficulty I see, is that quantile and median statistics will not
be as easy to stream as statistics with a closed-form solution like mean or
variance. There may however be great algorithms out there for pulling the
median or the 95% quantile out of a stream -- if so they should be used.
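
To illustrate the contrast, here is a hedged sketch. Mean and variance can
be kept in constant memory with Welford's online update (a technique chosen
here for illustration; it is not named in this thread), whereas an exact
median, done naively, needs all the values to be retained and sorted. The
class names "RunningVariance" and "ExactMedian" are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical accumulator using Welford's online algorithm: O(1) memory.
final class RunningVariance {
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the current mean

    void accept(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Merges another partial result (needed for parallel streams).
    void combine(RunningVariance other) {
        if (other.n == 0) {
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * n * other.n / total;
        mean += delta * other.n / total;
        n = total;
    }

    double mean() { return n > 0 ? mean : Double.NaN; }

    double sampleVariance() { return n > 1 ? m2 / (n - 1) : Double.NaN; }

    static RunningVariance of(DoubleStream stream) {
        return stream.collect(RunningVariance::new,
                              RunningVariance::accept,
                              RunningVariance::combine);
    }
}

// Naive exact median: must hold and sort every value, O(n) memory.
final class ExactMedian {
    private ExactMedian() {}

    static double of(double[] data) {
        if (data.length == 0) {
            return Double.NaN;
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2;
    }
}
```

A streaming quantile estimator would replace ExactMedian with an
approximate, bounded-memory algorithm; which one to use is left open here,
as it is in the discussion above.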

Eric


Eric,

Would you be the official "mentor" for the GSoC participants that
are interested in helping with the porting of "o.a.c.math4.stat"?

Thank you,
Gilles


