Hi Alexander, all,

I have now uploaded the code (see links below), and look forward to learning about the outcome of your experiments!

Best regards,
Bert
---
https://github.com/apache/spark/pull/1290
https://issues.apache.org/jira/browse/SPARK-2352

> -----Original Message-----
> From: Ulanov, Alexander [mailto:alexander.ula...@hp.com]
> Sent: 01 July 2014 18:17
> To: dev@spark.apache.org
> Subject: RE: Artificial Neural Network in Spark?
>
> Hi Bert,
>
> There is a specific pull request process if you wish to share the
> code:
> https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
>
> I would be glad to benchmark your ANN implementation by running some
> of the experiments that we run with the other ANN toolkits. I am
> also interested in Autoencoder and have plans to implement it for MLLib
> in the near future.
>
> Best regards, Alexander
>
> -----Original Message-----
> From: Bert Greevenbosch [mailto:bert.greevenbo...@huawei.com]
> Sent: Tuesday, July 01, 2014 7:14 AM
> To: dev@spark.apache.org
> Subject: RE: Artificial Neural Network in Spark?
>
> Hi Debasish, Alexander, all,
>
> Indeed I found the OpenDL project through the Powered by Spark page.
> I'll need some time to look into the code, but at first sight it
> looks quite well-developed. I'll contact the author about this too.
>
> My own implementation (in Scala) works for multiple inputs and multiple
> outputs. It implements a single hidden layer with a configurable
> number of nodes.
>
> The implementation is a general ANN implementation. As such, it should
> be usable for an autoencoder too, since that is just an ANN with some
> special input/output constraints.
>
> As said before, the implementation is built upon the linear regression
> model and gradient descent implementation. However, it did require some
> tweaks:
>
> - The linear regression model only supports a single output "label" (as
> Double).
> Since the ANN can have multiple outputs, it ignores the
> "label" attribute, but for training divides the input vector into two
> parts, the first part being the genuine input vector, the second the
> target output vector.
>
> - The concatenation of input and target output vectors is only
> internal; the training function takes as input an RDD with tuples of
> two Vectors, one for the input and one for the target output.
>
> - The GradientDescent optimizer is re-used without modification.
>
> - I have made an even simpler updater than the SimpleUpdater, leaving
> out the division by the square root of the number of iterations. The
> SimpleUpdater can also be used, but I created this simpler one because
> I like to plot the result every now and then, and then continue the
> calculations. For this, I also wrote a training function that takes
> the weights from the previous training session as input.
>
> - I created a ParallelANNModel similar to the LinearRegressionModel.
>
> - I created a new GeneralizedSteepestDescendAlgorithm class similar to
> the GeneralizedLinearAlgorithm class.
>
> - Created some example code to test with 2D (1 input 1 output), 3D (2
> inputs 1 output) and 4D (1 input 3 outputs) functions.
>
> If there is interest, I would be happy to release the code. What would
> be the best way to do this? Is there some kind of review process?
>
> Best regards,
> Bert
>
> > -----Original Message-----
> > From: Debasish Das [mailto:debasish.da...@gmail.com]
> > Sent: 27 June 2014 14:02
> > To: dev@spark.apache.org
> > Subject: Re: Artificial Neural Network in Spark?
> >
> > Look into the Powered by Spark page... I found a project there which
> > used autoencoder functions... It has not been updated for a long time
> > now!
> >
> > On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander
> > <alexander.ula...@hp.com> wrote:
> >
> > > Hi Bert,
> > >
> > > It would be extremely interesting. Do you plan to implement
> > > autoencoder as well? It would be great to have deep learning in
> > > Spark.
> > >
> > > Best regards, Alexander
> > >
> > > On 27.06.2014, at 4:47, "Bert Greevenbosch"
> > > <bert.greevenbo...@huawei.com> wrote:
> > >
> > > > Hello all,
> > > >
> > > > I was wondering whether Spark/mllib supports Artificial Neural
> > > > Networks (ANNs)?
> > > >
> > > > If not, I am currently working on an implementation of it. I
> > > > re-use the code for linear regression and gradient descent as
> > > > much as possible.
> > > >
> > > > Would the community be interested in such an implementation? Or
> > > > maybe somebody is already working on it?
> > > >
> > > > Best regards,
> > > > Bert
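
Note on the 1 July message: two of the tweaks Bert describes can be illustrated with a small sketch. These are (1) packing the input and target output vectors into one internal vector and splitting them again, and (2) an updater that leaves out the division by the square root of the iteration count. The sketch below is plain Python for illustration only, not the Scala code from the pull request and not the Spark API. All function names are hypothetical, and the decaying step size is modelled as step_size / sqrt(iteration), following the description in the thread.

```python
import math


def pack_example(inputs, targets):
    """Concatenate the input vector and the target output vector.

    Hypothetical helper: mirrors the idea of carrying both vectors in a
    single feature vector internally, since the single Double "label"
    cannot hold multiple outputs.
    """
    return list(inputs) + list(targets)


def unpack_example(packed, n_inputs):
    """Split a packed vector back into (inputs, targets)."""
    return packed[:n_inputs], packed[n_inputs:]


def decaying_step_update(weights, gradient, step_size, iteration):
    """Gradient step whose size shrinks as step_size / sqrt(iteration).

    This models the behaviour the thread attributes to SimpleUpdater.
    `iteration` is assumed to start at 1.
    """
    this_step = step_size / math.sqrt(iteration)
    return [w - this_step * g for w, g in zip(weights, gradient)]


def constant_step_update(weights, gradient, step_size):
    """Gradient step with a fixed step size (no sqrt(iteration) division).

    Models the "even simpler" updater described in the thread.
    """
    return [w - step_size * g for w, g in zip(weights, gradient)]


if __name__ == "__main__":
    # A "4D" example as in the thread: 1 input, 3 outputs.
    packed = pack_example([0.5], [1.0, 2.0, 3.0])
    print(packed)
    print(unpack_example(packed, n_inputs=1))

    # Same weights and gradient, at iteration 4 of a run.
    print(decaying_step_update([1.0], [2.0], step_size=0.5, iteration=4))
    print(constant_step_update([1.0], [2.0], step_size=0.5))
```

One plausible reading of why the constant step suits the "plot, then continue" workflow: when training is resumed from saved weights, a decaying schedule would restart at iteration 1 and the step size would jump back up, whereas a constant step behaves the same across sessions.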