On Sun., Apr. 25, 2021 at 00:32, Paul King <paul.king.as...@gmail.com> wrote:
>
> Thanks Gilles,
>
> I can provide the same sort of stats for a clustering example
> across commons-math (KMeans) vs Apache Ignite, Apache Spark and
> Rheem/Apache Wayang (incubating) if anyone would find that useful. It
> would no doubt lead to similar conclusions.
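(For context, the commons-math side of such a comparison amounts to a
few library calls. Below is a minimal sketch against the CM 3.6.1
"o.a.c.m.ml.clustering" API; the data points are made up for
illustration, not taken from any benchmark.)

import java.util.Arrays;
import java.util.List;

import org.apache.commons.math3.ml.clustering.CentroidCluster;
import org.apache.commons.math3.ml.clustering.DoublePoint;
import org.apache.commons.math3.ml.clustering.KMeansPlusPlusClusterer;

public class KMeansSketch {
    public static void main(String[] args) {
        // Toy 2-D points (placeholders for a real dataset).
        List<DoublePoint> points = Arrays.asList(
            new DoublePoint(new double[] { 1.0, 1.0 }),
            new DoublePoint(new double[] { 1.5, 2.0 }),
            new DoublePoint(new double[] { 8.0, 8.0 }),
            new DoublePoint(new double[] { 8.5, 9.0 }));

        // k-means++ with k = 2; Euclidean distance by default.
        KMeansPlusPlusClusterer<DoublePoint> clusterer =
            new KMeansPlusPlusClusterer<>(2);

        for (CentroidCluster<DoublePoint> c : clusterer.cluster(points)) {
            System.out.println("center: "
                + Arrays.toString(c.getCenter().getPoint())
                + ", size: " + c.getPoints().size());
        }
    }
}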
There were also relatively recent discussions concerning the code in
the "o.a.c.m.ml.clustering" package.[1]  If it is useful as of the old
CM v3.6.1, it can very probably be improved upon in terms of
flexibility[2] and performance through (among other things)
multi-threading (in much the same way as for GA, I guess).

Best regards,
Gilles

[1] https://issues.apache.org/jira/browse/MATH-1515
[2] Fixes and enhancements are already in the CM "master" branch.

>
> Cheers, Paul.
>
> On Sun, Apr 25, 2021 at 8:15 AM Gilles Sadowski <gillese...@gmail.com> wrote:
> >
> > Hello Paul.
> >
> > On Sat., Apr. 24, 2021 at 04:42, Paul King <paul.king.as...@gmail.com> wrote:
> > >
> > > I added some more comments, relevant to whether the proposed
> > > algorithm belongs somewhere in the commons "math" area, back in
> > > the Jira:
> > >
> > > https://issues.apache.org/jira/browse/MATH-1563
> >
> > Thanks for a "real" user's testimony.
> >
> > As the ML is still the official forum for such a discussion, I'm quoting
> > part of your post on JIRA:
> > ---CUT---
> > For linear regression, taking just one example dataset, commons-math
> > is a couple of library calls for a single 2M library and solves the
> > problem in 240ms. Both Ignite and Spark involve "firing up the
> > platform" and the code is more complex for simple scenarios. Spark has
> > a 181M footprint across 210 jars and solves the problem in about 20s.
> > Ignite has an 87M footprint across 85 jars and solves the problem
> > in >40s. But I can also find more complex scenarios which need to
> > scale, where Ignite and Spark really come into their own.
> > ---CUT---
> >
> > A similar rationale was behind my developing/using the SOFM
> > functionality in the "o.a.c.m.ml.neuralnet" package: I needed a
> > proof of concept, and taking the "lightweight" path seemed more
> > effective than experimenting with those platforms.
> > Admittedly, at that time, there were people around who were
> > maintaining the clustering and GA codes; hence the prototyping
> > of a machine-learning library didn't look strange to anyone.
> >
> > Regards,
> > Gilles
> >
> > >>> [...]
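For anyone who wants to reproduce the commons-math side of the
linear-regression comparison quoted above: it is indeed just a couple
of library calls. Here is a minimal sketch against the CM 3.6.1
"SimpleRegression" API; the observations below are placeholders, not
the benchmark dataset from the JIRA post.

import org.apache.commons.math3.stat.regression.SimpleRegression;

public class RegressionSketch {
    public static void main(String[] args) {
        SimpleRegression reg = new SimpleRegression();

        // Placeholder (x, y) observations; substitute real data here.
        reg.addData(1.0, 2.0);
        reg.addData(2.0, 4.1);
        reg.addData(3.0, 5.9);
        reg.addData(4.0, 8.2);

        // Fitted line and goodness of fit.
        System.out.printf("slope=%.3f intercept=%.3f r^2=%.3f%n",
            reg.getSlope(), reg.getIntercept(), reg.getRSquare());
    }
}

For several predictors, "OLSMultipleLinearRegression" in the same
"o.a.c.m.stat.regression" package offers a similarly small API.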