I can't speak for MLlib either, but I can say that the model of training in Hadoop M/R or Spark and scoring in production in Storm works very well. My team has also done online learning in Storm (with the Sofia ML library, I think).
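To make the train-offline / score-online split concrete, here is a minimal sketch. It assumes the trained model is a plain logistic regression whose weights and intercept can be dumped as numbers; the JSON layout and the function names (`export_model`, `load_model`, `score`) are hypothetical and are not part of MLlib or Storm. It is written in Python for brevity; in a Storm topology the same arithmetic would sit inside a bolt.

```python
import json
import math


def export_model(weights, intercept):
    """Training side: serialize learned parameters to a JSON string.

    The {"weights": [...], "intercept": ...} layout is an assumption
    made for this sketch, not a format defined by any library.
    """
    return json.dumps({"weights": list(weights), "intercept": intercept})


def load_model(payload):
    """Scoring side: parse the exported parameters back into numbers."""
    model = json.loads(payload)
    return model["weights"], model["intercept"]


def score(weights, intercept, features):
    """Return the logistic-regression probability for one feature vector."""
    margin = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-margin))


# Offline job writes the payload; the online scorer reads it at startup.
payload = export_model([0.5, -0.25], 0.0)
weights, intercept = load_model(payload)
print(score(weights, intercept, [1.0, 2.0]))
```

The point of the design is that the scoring side depends only on the exported numbers, not on the training framework, so the two can be deployed and upgraded independently. Tree ensembles would need a richer export format than this.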
I would be interested in this answer as well.

-Suren

On Thu, Jun 19, 2014 at 7:35 AM, Eustache DIEMERT <eusta...@diemert.fr> wrote:

> Well, yes VW is an appealing option but I only found "experimental"
> integrations so far.
>
> Also, early experiments suggest decision tree ensembles (RF, GBT) perform
> better than generalized linear models on our data. Hence the interest in
> MLlib :)
>
> Any other comments / suggestions welcome :)
>
> E/
>
>
> 2014-06-19 12:37 GMT+02:00 Charles Earl <charles.ce...@gmail.com>:
>
>> While I can't definitively speak to MLlib online learning,
>> I'm sure you're evaluating Vowpal Wabbit, for which there have been some
>> Storm integrations contributed.
>> Also you might look at factorie, http://factorie.cs.understanding.edu,
>> which at least provides an online LDA.
>> C
>>
>>
>> On Thursday, June 19, 2014, Eustache DIEMERT <eusta...@diemert.fr> wrote:
>>
>>> Hi Sparkers,
>>>
>>> We have a Storm cluster and are looking for a decent execution engine for
>>> machine-learned models. What I've seen from MLlib is extremely positive,
>>> but we can't just throw away our Storm-based stack.
>>>
>>> So my question is: is it feasible/recommended to train models in
>>> Spark/MLlib and execute them in another Java environment (Storm in this
>>> case)?
>>>
>>> Thanks for any insights :)
>>>
>>> Eustache
>>>
>>
>>
>> --
>> - Charles
>>
>
>

--
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io <suren.hira...@sociocast.com>
W: www.velos.io