Hi Manu,

I looked into it and I'm also working on the integration. When I made my first attempt, Flink did not properly support interfaces and subclasses. This was relevant because the distributed row-wise partitioned matrices can be indexed by int keys or string keys. By now, this should be fixed.

Another issue is retrieving intermediate results in the driver program. We can work around this problem either by using the RemoteCollectorOutputFormat or by using the convenience methods that the pending PR #210 [1] will introduce. Apart from that, I hope that Flink supports everything necessary to implement the Mahout DSL.

I don't know how quickly you'll find someone to work on the Mahout DSL, but considering that I'm quite familiar with the topic and have already started working on it, it would make sense for me to continue.

With support for the Mahout DSL in place, one could start thinking about combining specialized Flink algorithms with high-level linear algebra pre- and post-processing written in the DSL. Moreover, Alexander told me that you have the Impro 3 course [2], in which you implemented several ML algorithms that could be ported to the latest version of Flink. At the least, that would be a good way to become familiar with the system.

Greets,

Till

[1] https://github.com/apache/flink/pull/210
[2] https://github.com/TU-Berlin-DIMA/IMPRO-3.SS14

On Sat, Jan 31, 2015 at 6:10 PM, mkaul <k...@tu-berlin.de> wrote:

> Hi All,
> At TUB, we are looking at the possibility of hiring a new student programmer and possibly another Masters student to work on integrating Flink with Mahout DSL, to get a declarative language that can then be used to implement other ML algorithms.
> Just wanted to know if someone has already started looking into this topic and if there were any efforts already started in this direction? If yes, what were the main challenges faced? Would be interesting to know.
> If not, I would also be interested to hear some possible design decisions in order to make this work.
>
> Cheers,
> Manu
>
> --
> View this message in context: http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Kicking-off-the-Machine-Learning-Library-tp2995p3592.html
> Sent from the Apache Flink (Incubator) Mailing List archive. mailing list archive at Nabble.com.