Re: Add a machine learning algorithm to sparkml

2017-10-20 Thread anurag . verma
Manolis, There is also scope for enhancing existing ML algorithms, in particular adding more activation functions such as ReLU/Tanh to the Neural Net/MLP. Another option is adding functionality for deep learning architectures such as CNN or LSTM, which are gaining popularity. This may be more feasible in terms of l

Re: Add a machine learning algorithm to sparkml

2017-10-20 Thread Stephen Boesch
A couple of less obvious facets of getting over the (significant!) hurdle to have an algorithm accepted into mllib (/spark.ml): - the review time can be *very* long; a few to many months is typical even for relatively fast-tracked algorithms - you will likely be asked to provide

Add a machine learning algorithm to sparkml

2017-10-20 Thread Manolis Gemeliaris
Hello everyone, I am an undergraduate student now looking to do my final-year project. Professor Minos Garofalakis suggested that, as a project, I could find a machine learning algorithm that nobody has yet implemented in spark.ml and implement it. As t

Re: HashingTFModel/IDFModel in Structured Streaming

2017-10-20 Thread Joseph Bradley
Hi Davis, We've started tracking these issues under this umbrella: https://issues.apache.org/jira/browse/SPARK-21926 I'm hoping we can fix some of these for 2.3. Thanks, Joseph On Mon, Oct 16, 2017 at 9:23 PM, Davis Varghese wrote: > I have built an ML pipeline model on static Twitter data for

Re: SparkR is now available on CRAN

2017-10-20 Thread Joseph Bradley
Awesome, this is a big step for Spark! On Thu, Oct 12, 2017 at 12:06 PM, Holden Karau wrote: > That's wonderful news! :) Now we have Spark in CRAN, PyPI, and Maven so > the on-ramp should be easy for everyone. Excited to see more SparkR users > joining us :) > > On Thu, Oct 12, 2017 at 11:25 AM,

Re: Graceful node decommission mechanism for Spark

2017-10-20 Thread Juan Rodríguez Hortalá
Hi, Are there any comments or suggestions regarding this proposal? Thanks, Juan On Mon, Oct 16, 2017 at 10:27 AM, Juan Rodríguez Hortalá < juan.rodriguez.hort...@gmail.com> wrote: > Hi all, > > I have a prototype for "Keep track of nodes which are going to be shut > down & avoid scheduling ne