Guys,

First of all, great job on the contribution!

I took a brief look at the ML lib source and have a comment regarding
SparseDistributedMatrixStorage. It looks like it creates a new cache for
every new matrix. This sounds excessive to me, because (at least in my
understanding) creating a new matrix will be a typical operation for many
of the ML algorithms, so it should be fast. Cache creation, on the other
hand, requires a discovery ring message followed by a full exchange cycle,
which is not fast. Moreover, while a new cache is being created, all
operations on other caches are blocked as well (this is because we have
cross-cache transactions and need to properly synchronize updates and
rebalancing).

I would suggest creating a cache per some longer-lived entity and
modifying the storage so that multiple matrices are stored in the same
cache. Ignite data structures (Set, Queue, etc.) are implemented in exactly
this way. In this case, matrix creation will be very fast and will not
block other operations.
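
To illustrate the idea, here is a minimal sketch of a shared store in which
every matrix is just an ID and all cells live in one map under a composite
key. Everything in it is an assumption made for illustration: the names
SharedMatrixStore, CellKey and Matrix are hypothetical, a plain
ConcurrentHashMap stands in for the single long-lived Ignite cache, and the
key layout is not taken from the existing storage code.

```java
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch only: a ConcurrentHashMap stands in for one long-lived shared cache. */
public class SharedMatrixStore {
    /** Hypothetical composite key: matrix ID plus cell coordinates. */
    static final class CellKey {
        final UUID matrixId;
        final int row;
        final int col;

        CellKey(UUID matrixId, int row, int col) {
            this.matrixId = matrixId;
            this.row = row;
            this.col = col;
        }

        @Override public boolean equals(Object o) {
            if (this == o)
                return true;
            if (!(o instanceof CellKey))
                return false;
            CellKey k = (CellKey)o;
            return row == k.row && col == k.col && matrixId.equals(k.matrixId);
        }

        @Override public int hashCode() {
            return Objects.hash(matrixId, row, col);
        }
    }

    /** Lightweight handle: creating one only generates an ID. */
    static final class Matrix {
        private final UUID id = UUID.randomUUID();
        private final ConcurrentMap<CellKey, Double> cells;

        Matrix(ConcurrentMap<CellKey, Double> cells) {
            this.cells = cells;
        }

        void set(int row, int col, double val) {
            CellKey key = new CellKey(id, row, col);
            // Sparse storage: zeros are not stored at all.
            if (val == 0.0)
                cells.remove(key);
            else
                cells.put(key, val);
        }

        double get(int row, int col) {
            Double val = cells.get(new CellKey(id, row, col));
            return val == null ? 0.0 : val;
        }
    }

    /** The single shared map that holds the cells of all matrices. */
    private final ConcurrentMap<CellKey, Double> cells = new ConcurrentHashMap<>();

    /** No new cache (map) is created here, only a new handle. */
    Matrix newMatrix() {
        return new Matrix(cells);
    }

    int storedCells() {
        return cells.size();
    }

    public static void main(String[] args) {
        SharedMatrixStore store = new SharedMatrixStore();
        Matrix a = store.newMatrix();
        Matrix b = store.newMatrix();
        a.set(0, 1, 2.5);
        b.set(0, 1, -1.0);
        System.out.println(a.get(0, 1) + " " + b.get(0, 1) + " " + store.storedCells());
    }
}
```

With a layout like this, two matrices can hold different values at the same
coordinates without interfering, and removing a matrix would mean removing
its keys rather than destroying a cache.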

--AG

2017-04-26 7:35 GMT+03:00 Denis Magda <dma...@apache.org>:

> Yury,
>
> Thanks for driving this. From my side, I would suggest looking at Spark
> MLlib and borrowing the most frequently used algorithms from there. You've
> already mentioned regression and clustering algorithms; however, it’s
> reasonable to support classification and decision trees as well.
> http://spark.apache.org/docs/latest/ml-guide.html
>
> Next, according to my observations, Ignite ML Lib has to support Ruby and
> Python if we wish the lib to be used by researchers and scientists.
>
> Finally, we have to find a better way to integrate the Java 8 based
> Ignite ML with the rest of the platform. Presently, it’s a pain for the
> Ignite build and release processes to treat Ignite ML differently. I
> propose that we come up with a solution by the time of Apache Ignite 2.1.
>
> —
> Denis
>
> > On Apr 21, 2017, at 9:43 AM, Yury Babak <y.ch...@gmail.com> wrote:
> >
> > Guys,
> >
> > Now that the first version of the Ignite ML module has been merged into
> > Ignite 2.0, we want to discuss our next steps.
> >
> > Currently we are thinking about 3 big areas to explore:
> >
> > 1) Regression and clustering algorithms.
> > 2) Deep Learning/Neural Networks stuff.
> > 3) DSL/scripting support.
> >
> > Suggestions/thoughts about these topics (or something else which you
> > think we have missed) are welcome here as well as in IGNITE-5029.
> >
> > Some details about the above topics:
> >
> > * A first draft of ordinary least squares linear regression is in
> > progress (by Artem, IGNITE-5012).
> > * Deep learning/other NN stuff: Artem is currently investigating existing
> > frameworks like TensorFlow/Encog/etc. to find out whether we can integrate
> > with them somehow, or at least to define the scope/ideas for the API of
> > the DL/NN functionality we need.
> > * We are also thinking about using Java 8 Nashorn as a script engine and
> > the possibility of building an R-like DSL (mostly by me).
> >
> > Thanks,
> > Yury Babak.
