RE: Accessing DataFrame inside UserDefinedFunction.

2017-11-05 Thread Anurag Verma
This is expected. You are not accessing the Dataset Dict when calling the UDF countPositiveSimilarity. The dict DataFrame as it existed when the UDF was created is encoded into the UDF. If you change dict later on, the changes will not automatically be picked up by countPositiveSimilarity.
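The behaviour described in this reply can be illustrated with a plain-Python analogy. This is not Spark code and not the poster's code: make_count_positive, scores, and the word lists are hypothetical names, and the explicit copy only stands in for the way the data is encoded into the UDF when it is created.

```python
def make_count_positive(lookup):
    """Build a counting function from a snapshot of `lookup`."""
    # Copy the lookup at creation time. This mimics (by analogy only) the
    # data being encoded into the UDF at the moment the UDF is defined.
    frozen = dict(lookup)

    def count_positive(words):
        # Count the words whose score in the frozen snapshot is positive.
        return sum(1 for w in words if frozen.get(w, 0) > 0)

    return count_positive


scores = {"good": 1, "bad": -1}
count_positive = make_count_positive(scores)

# A later change to the original lookup is not seen by the existing function.
scores["great"] = 1
stale = count_positive(["good", "great", "bad"])

# Rebuilding the function after the change picks up the new entry.
refreshed = make_count_positive(scores)
fresh = refreshed(["good", "great", "bad"])

print(stale, fresh)
```

In this sketch the only way to see the updated data is to create the function again after the change, which mirrors the point made in the reply.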

Re: Add a machine learning algorithm to sparkml

2017-10-20 Thread anurag . verma
Manilos, There is also scope for enhancing existing ML algorithms, in particular for the neural net/MLP: adding more activation functions such as ReLU and tanh. Also adding functionality for deep learning architectures such as CNN or LSTM, which are gaining popularity. This may be more feasible in terms of l

RE: Regularized Logistic regression

2016-10-13 Thread Anurag Verma
Probably your regularization parameter is set too high. Try regParam=0.1 or 0.2. You should also probably increase the number of iterations to something like 500. Additionally, you can specify elasticNetParam (between 0 and 1). -Original Message- From: aditya1702 [mailto:adityavya...@gmail.com
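To make the roles of regParam and elasticNetParam concrete, here is a minimal pure-Python sketch of the elastic-net penalty as the Spark ML documentation describes it: regParam scales the whole penalty, and elasticNetParam mixes the L1 and L2 terms. The function name and the example weights are hypothetical, and this is an illustration of the formula rather than Spark's implementation.

```python
def elastic_net_penalty(weights, reg_param, elastic_net_param):
    """Penalty added to the loss: reg_param * (a * L1 + (1 - a) / 2 * L2^2)."""
    l1 = sum(abs(w) for w in weights)        # L1 norm of the weights
    l2_sq = sum(w * w for w in weights)      # squared L2 norm of the weights
    a = elastic_net_param                    # 0 = pure L2 (ridge), 1 = pure L1 (lasso)
    return reg_param * (a * l1 + (1.0 - a) * 0.5 * l2_sq)


weights = [2.0, -1.0]  # hypothetical coefficients

ridge = elastic_net_penalty(weights, reg_param=0.1, elastic_net_param=0.0)
lasso = elastic_net_penalty(weights, reg_param=0.1, elastic_net_param=1.0)
strong = elastic_net_penalty(weights, reg_param=1.0, elastic_net_param=0.0)

print(ridge, lasso, strong)
```

A larger reg_param makes the same weights more expensive, so the optimizer shrinks them harder; that is why an overly high value can leave the model with coefficients close to zero.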