On Wed, Apr 21, 2021 at 4:12 PM Ralph Goers <ralph.go...@dslextreme.com> wrote:
>
> Why are y’all having a long discussion on a Vote thread?

Fair enough. I am +1 (non-binding).

Cheers, Paul.

> > On Apr 20, 2021, at 10:33 PM, Paul King <paul.king.as...@gmail.com> wrote:
> >
> > Hi Avijit Basak,
> >
> > +1 to thanking you for your offer. Just a couple of comments from
> > someone who is only a marginal contributor to the commons project.
> >
> > I would be keen to see a new commons component incorporating various
> > machine learning/data science components. The other main contenders
> > that seem to be reasonably actively developed are Smile[1] and Weka[2]
> > which are licensed under GPL or LGPL. Such a component would be a
> > natural fit for the algorithm you propose. If you look at Apache
> > Spark[3] and Apache Ignite[4], they both have some "machine learning"
> > offerings but they tend to only support algorithms which are either
> > "embarrassingly" parallel or inherently parallel. They tend not to
> > include algorithms that are sequential by nature. Even "embarrassingly"
> > parallel algorithms are often not included since they can typically
> > already be used by Spark, Ignite, Beam, Wayang, or home-grown
> > threads/fibres.
> >
> > There has been previous research into PGA with Hadoop, Spark and
> > Ignite[5][6] but so far, none of that has made it into those
> > distributions as far as I know. I don't know how customisable the
> > Ignite GA algorithm[7] is but it might be worth looking into.
> >
> > With respect to component naming, you either go very broad with "math"
> > or something like "datascience", or potentially too narrow with
> > something like "ml" or "machinelearning". Of the latter two, "ml" is
> > most common when bundled into some other framework. The other
> > alternative is to simply come up with another name but the typical
> > convention within commons is to use a name that describes the purpose.
> > Numerous "ml" libraries also bundle things like regression into them,
> > so there is precedent for such libraries to hold algorithms broadly in
> > the topic space. On the commons math front, I think regression is
> > currently earmarked for statistics but I'm not sure it has made the jump
> > yet. An "ml" home would be equally suitable in my mind.
> >
> > Having said all of that, as others have pointed out, the volunteer
> > space in commons is somewhat lean at the moment. I would be happy to
> > help a little from the ASF side of things but machine learning/data
> > science isn't my principal area of expertise nor a major aspect of my
> > "day job" activities, so it will probably take others with an interest
> > to give this the effort it deserves. But sometimes someone has to get
> > the ball rolling before other interested parties show up.
> >
> > Cheers, Paul
> >
> > [1] https://haifengl.github.io/
> > [2] https://www.cs.waikato.ac.nz/ml/weka/
> > [3] https://spark.apache.org/mllib/
> > [4] https://ignite.apache.org/docs/latest/machine-learning/machine-learning
> > [5] https://hajirajabeen.github.io/publications/SGA.pdf
> > [6] https://dzone.com/articles/genetic-algorithms-with-apache-ignite
> > [7] https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/ml/util/genetic/GeneticAlgorithm.html
> >
> > On Sun, Feb 14, 2021 at 6:06 PM Avijit Basak <avijit.ba...@gmail.com> wrote:
> >>
> >> Hi
> >>
> >>       I would like to mention a few points here. Genetic algorithms have a
> >> vast range of applications in optimization and search problems. Machine
> >> learning is only one of those.
> >>       If we couple the new GA library with any specific domain like ml it
> >> would be meaningless for people working in other domains. They would have to
> >> incorporate the entire ml library which may be completely unrelated to
> >> their project. Coupling it with any technology like Spark might also limit
> >> its usability.
> >>       If a separate component is not approved for this change then we can
> >> incorporate the changes as part of the *commons.math* library.
> >>       The same library can be reused in ml or neural network libraries as
> >> a dependency.
> >>       Kindly share further views on this.
> >>
> >> Thanks & Regards
> >> --Avijit Basak
> >>
> >> On Wed, 10 Feb 2021 at 19:49, Gilles Sadowski <gillese...@gmail.com> wrote:
> >>
> >>> On Wed, 10 Feb 2021 at 13:19, sebb <seb...@gmail.com> wrote:
> >>>>
> >>>> Likewise, commons-ml is too cryptic.
> >>>>
> >>>> Also, the Spark project has a machine-learning library:
> >>>>
> >>>> https://spark.apache.org/mllib/ <https://spark.apache.org/mllib/>
> >>>
> >>> Thanks for the pointer.
> >>>
> >>>>
> >>>> Maybe that would be a better home?
> >>>
> >>> On the face of it, probably.
> >>> [For sure, Avijit should comment on the suggestion.]
> >>>
> >>> On the other hand, "Commons" is the place where one can pick "bare
> >>> bones" implementations, and add the functionality to one's application
> >>> without necessarily complying with an overarching framework.
> >>> [I don't mean that framework compliance is bad; quite the contrary, it is
> >>> hopefully the result of a thorough reflection by experts.  But ... cf. the
> >>> numerous "no-dependency" discussions ...]
> >>>
> >>> Actually, concerning Avijit's proposed contribution, didn't I say:[1]
> >>> ---CUT---
> >>> Thus, I think that we must assess whether the "genetic algorithms"
> >>> functionality has a reasonable future within "Apache Commons" (i.e.
> >>> potential users and contributors) while there exist other libraries that
> >>> seem much more advanced for any serious usage.
> >>> ---CUT---
> >>>
> >>>> I'm also a bit concerned as to whether there are sufficient developers
> >>>> here with knowledge of the ML domain to be able to support the code in
> >>>> the future.
> >>>
> >>> An interesting point; by no means a new one (see e.g. [2]).
> >>>
> >>> Isn't it the same point I've been making about "Commons Math" (CM)?
> >>> There have been no releases because nobody here is able (or willing)
> >>> to support it.
> >>>
> >>> Concerning the support of the purported "machinelearning" component:
> >>> 1. Package
> >>>        org.apache.commons.math4.ml.neuralnet
> >>>    * I've written it entirely and I have applications that depend on
> >>>      it (and I cannot assume that I could easily switch to, or port it
> >>>      to, Spark), so I can reasonably ensure that it would be supported.
> >>> 2. Package
> >>>        org.apache.commons.math4.ml.clustering
> >>>    * Functionality is mentioned in Spark's "mllib" user guide.
> >>>    * When a new feature was last contributed[3], it was noticed[4][5][6]
> >>>      that improvements were needed (but there was no follow-up).
> >>>    * I have an application that depends on it (from CM v3.6.1) but I
> >>>      wouldn't support it if shipped in CM v4.0.
> >>> 3. Package
> >>>        org.apache.commons.math4.genetics
> >>>    * Part of my "end-of-study" project consisted of a GA implementation.
> >>>      I've never used the CM implementation, and I don't deny that there
> >>>      could be perfectly fine uses of it but, just looking at the code, it
> >>>      seems obvious that it cannot compete feature-wise with other
> >>>      libraries out there.
> >>>    * I've suggested long ago that, without anyone supporting it actively
> >>>      (and no known user community), it should be dropped from CM.
> >>>    * Avijit expressed a willingness to improve the functionality:  Is
> >>>      this enough for the PMC to create a new component?  From the
> >>>      experience with the "clustering" package mentioned above, I'd tend
> >>>      to think (unfortunately) that it isn't.  He should first explore
> >>>      whether the Spark community is interested in having the GA
> >>>      functionality moved over there.
> >>>
> >>> Gilles
> >>>
> >>> [1] https://issues.apache.org/jira/browse/MATH-1563
> >>> [2] https://markmail.org/message/26yxj5vhysdsoety
> >>> [3] https://issues.apache.org/jira/projects/MATH/issues/MATH-1509
> >>> [4] https://issues.apache.org/jira/projects/MATH/issues/MATH-1524
> >>> [5] https://issues.apache.org/jira/projects/MATH/issues/MATH-1528
> >>> [6] https://issues.apache.org/jira/projects/MATH/issues/MATH-1526
> >>>
> >>>>
> >>>> On Wed, 10 Feb 2021 at 08:27, Emmanuel Bourg <ebo...@apache.org> wrote:
> >>>>>
> >>>>> -1 for commons-ml for the same reasons.
> >>>>>
> >>>>> What about commons-machine-learning or commons-math-learning? The
> >>>>> latter is as long as commons-configuration.
> >>>>>
> >>>>> Emmanuel Bourg
> >>>>>
> >>>>>
> >>>>> On 2021-02-10 03:27, Ralph Goers wrote:
> >>>>>> -1 on commons-ml as the name. My first thought is such a repo would
> >>>>>> hold stuff related to mailing lists. Then again maybe it contains
> >>>>>> stuff relating to markup languages. Maybe it is Apache’s version of
> >>>>>> the ML Programming Language [1].
> >>>>>>
> >>>>>> However, I wouldn’t be -1 on commons-math-ml, although at best I
> >>>>>> would be +0 since it is still not obvious what it would contain.
> >>>>>>
> >>>>>> Ralph
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>
> >>>
> >>
> >> --
> >> Avijit Basak
> >

