JCuda: No, I'm not willing to rely on servers having NVIDIA cards (someone
who is more familiar with server hardware may correct me, in which case
I'll say, "No, because *my* servers don't have NVIDIA cards; someone else
can add it").

Parallelization: Yes. Admittedly, very clever use of Python could probably
solve this problem, depending on how we cut it up (I anticipate cursing
myself for not going this route several times in the weeks to come). The
motivation for Flink over Python is the hope for a more general and
reusable approach.  Neural networks in general are solvable so long as you
have some decent linear algebra backing you up. (However, I'm also toying
with the idea of additionally putting in an evolutionary algorithm approach
as an alternative to back propagation through time.)

The thought guiding this, to borrow a term from American auto racing, is
"there is no replacement for displacement", meaning a reasonably
functional 7 liter engine will be more powerful than a performance-tuned
1.6 liter engine. In this case, an OK implementation in Flink spread over
lots and lots of processors should be more powerful than a local
'sport-tuned' implementation with clever algorithms, GPUs, etc.  (The
arguments against evolutionary algorithms for solving neural networks
normally revolve around efficiency; however, doing several generations on
each node, then reporting the best parameter sets to be 'bred', then
rebroadcasting the parameter sets is a natural fit for distributed
systems. More of an academic exercise, but interesting conceptually. I know
there are some grad students reading this who are itching for thesis
projects; Olcay Akman and I did something similar for an implementation in
R, see my GitHub repo IRENE for a very ugly implementation.)
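
To make the "several generations per node, report the best, breed,
rebroadcast" loop concrete, here is a minimal single-process sketch in
Python. It is only an illustration under stated assumptions: the "nodes" are
plain lists simulated in one process, the fitness function is a toy stand-in
for a network's loss, and every name in it (fitness, evolve_locally,
island_search, and so on) is hypothetical. It is not Flink code and not the
IRENE implementation.

```python
import random

# Hypothetical toy objective standing in for a network's (negated) loss:
# higher is better, with the optimum at TARGET.
TARGET = [1.0, -2.0, 0.5]

def fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

def mutate(params, rng, scale=0.1):
    # Perturb every parameter with small Gaussian noise.
    return [p + rng.gauss(0.0, scale) for p in params]

def breed(a, b, rng):
    # Uniform crossover: each parameter is taken from one parent at random.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def evolve_locally(population, generations, rng):
    """Run several generations on one simulated 'node'."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: len(ranked) // 2]  # elitism: keep the top half
        children = [
            mutate(breed(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(len(ranked) - len(survivors))
        ]
        population = survivors + children
    return population

def island_search(n_nodes=4, pop_size=20, rounds=10, local_generations=5, seed=0):
    rng = random.Random(seed)
    islands = [
        [[rng.uniform(-3.0, 3.0) for _ in TARGET] for _ in range(pop_size)]
        for _ in range(n_nodes)
    ]
    history = []
    for _ in range(rounds):
        # 1) Each node evolves independently (the parallel step).
        islands = [evolve_locally(pop, local_generations, rng) for pop in islands]
        # 2) Each node reports its best parameter set.
        champions = [max(pop, key=fitness) for pop in islands]
        # 3) The reported sets are 'bred' centrally.
        bred = [
            mutate(breed(rng.choice(champions), rng.choice(champions), rng), rng)
            for _ in range(n_nodes)
        ]
        # 4) Champions and offspring are rebroadcast, replacing each node's worst.
        broadcast = champions + bred
        for i, pop in enumerate(islands):
            ranked = sorted(pop, key=fitness, reverse=True)
            islands[i] = ranked[: pop_size - len(broadcast)] + broadcast
        history.append(max(fitness(c) for c in champions))
    best = max((p for pop in islands for p in pop), key=fitness)
    return best, history

if __name__ == "__main__":
    best, history = island_search()
    print("best parameters:", best)
    print("best fitness per round:", history)
```

In a distributed setting, step 1 is the part that would run on each worker,
and steps 2 through 4 are the only points that need communication, which is
why the scheme tolerates slow interconnects reasonably well.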

The motivation for Flink over an alternative big-data platform (see
SPARK-2352) is A) online learning and sequences intuitively seem to be a
better fit for Flink's streaming architecture, B) I don't know much about
the SparkML code base, so there would be an additional learning curve, and
C) I'd have to spend the rest of my life looking over my shoulder to make
sure Slim wasn't going to jump out and get me (we live in the same city, the
fear is real).

tg




Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Fri, Feb 12, 2016 at 8:04 AM, Simone Robutti <
simone.robu...@radicalbit.io> wrote:

> Asking as someone that never did NN on Flink, would you implement it using
> JCuda? And would you implement it with model parallelization? Is there any
> theoretical limit to implementing "model and data parallelism" in Flink? If
> you don't use GPUs and you don't parallelize models and data at the same
> time, what is your motivation to do such a thing on Flink instead of a
> local environment that would probably be more performant to a certain
> degree?
>
> 2016-02-12 14:58 GMT+01:00 Trevor Grant <trevor.d.gr...@gmail.com>:
>
> > Agreed. Our reasoning for contributing straight to Flink was that we plan
> > on doing a lot of weird monkeying around with these things, and were going
> > to have to get our hands dirty with some code eventually anyway.  The LSTM
> > isn't *that* difficult to implement, and it seems easier to write our own
> > than to understand someone else's insanity.
> >
> > The plan is to get a 'basic' version going, then start tweaking the
> > special cases.  We have a use case for bi-directional, but it's not our
> > primary motivation. I have no problem exposing new flavors as we make them.
> >
> > tg
> >
> >
> > On Fri, Feb 12, 2016 at 7:51 AM, Suneel Marthi <suneel.mar...@gmail.com>
> > wrote:
> >
> > > On Fri, Feb 12, 2016 at 8:45 AM, Trevor Grant <trevor.d.gr...@gmail.com>
> > > wrote:
> > >
> > > > Hey all,
> > > >
> > > > I had a post a while ago about needing neural networks.  We
> > > > specifically need a very special type that is good for time
> > > > series/sensors, called LSTM.  We had a talk about the pros/cons of
> > > > using deeplearning4j for this use case and eventually decided it made
> > > > more sense to implement in native Flink for our use case.
> > > >
> > > > So, this is somewhat relevant to what Theodore just said, but different
> > > > enough that I wanted a separate thread.
> > > >
> > > > "Focusing on Flink does well and implement algorithms built around
> > > > inherent advantages..."
> > > >
> > > > One thing that jumps to mind is doing online learning.  The batch
> > > > nature of all of the other 'big boys' means that they are by definition
> > > > always going to be offline modes.
> > > >
> > > > Also, even though LSTMs are somewhat of a corner case in the NN world,
> > > > the streaming nature of Flink (a sequence of data) makes them fairly
> > > > relevant to people who would be using Flink in the first place (? IMHO)
> > > >
> > > > Finally, there should be some positive externalities that come from
> > > > this, such as a back propagation algorithm, which should then be
> > > > reusable for things like HMMs.
> > > >
> > > > So at any rate, the research spike for me started earlier this week; I
> > > > hope to start cutting some Scala code over the weekend or the beginning
> > > > of next week. Also, I'm asking to check out FLINK-2259 because I need
> > > > some sort of functionality like that before I get started, and I could
> > > > use the git practice.
> > > >
> > > > Idk if there is any interest in adding this or if you want to make a
> > > > JIRA for LSTM neural nets (or if I should write one, with appropriate
> > > > papers cited, as seems to be the fashion), or maybe wait and see what I
> > > > end up with?
> > > >
> > > It would be good if we also supported Bidirectional LSTMs.
> > >
> > > http://www.cs.toronto.edu/~graves/asru_2013.pdf
> > >
> > > http://www.cs.toronto.edu/~graves/phd.pdf
> > >
> > >
> > >
> > >
> > > > Also- I'll probably be blowing you up with questions.
> > > >
> > > > Best,
> > > >
> > > > tg
> > > >
> > >
> >
>
