PySpark MLlib: py4j cannot find trainImplicitALSModel method

2015-07-08 Thread sooraj
. This error seems to be happening completely on the driver as I don't see any error on the Spark web interface. I have tried changing the spark.yarn.am.memory configuration value, but it doesn't help. Any suggestion on how to debug this will be very helpful. Thank you, Sooraj Here i

Re: PySpark MLlib: py4j cannot find trainImplicitALSModel method

2015-07-08 Thread sooraj
parameter. On 8 July 2015 at 12:35, sooraj wrote: > Hi, > > I am using MLlib collaborative filtering API on an implicit preference > data set. From a pySpark notebook, I am iteratively creating the matrix > factorization model with the aim of measuring the RMSE for each combination
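The loop described in the quoted message (build a model for each parameter combination and record its RMSE) could be sketched roughly as follows. This is only an illustration, not the poster's code: `grid_search` and `train_and_predict` are hypothetical names, and the callback stands in for whatever trains the implicit-feedback model (for example via `ALS.trainImplicit` in `pyspark.mllib.recommendation`) and returns (predicted, actual) pairs.

```python
import itertools
import math


def rmse(pairs):
    """Root-mean-square error over (predicted, actual) pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def grid_search(train_and_predict, grid):
    """Evaluate every parameter combination in `grid`.

    `train_and_predict` is a hypothetical callback: it receives the
    parameters as keyword arguments, trains a model, and returns an
    iterable of (predicted, actual) pairs for a held-out set.
    `grid` maps parameter names to lists of candidate values.
    Returns the sorted parameter names and a dict mapping each value
    tuple (in that name order) to its RMSE.
    """
    names = sorted(grid)
    results = {}
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        results[combo] = rmse(train_and_predict(**params))
    return names, results
```

Keeping the training step behind a callback means the loop can be exercised with a stub function before it is pointed at a cluster, which may help separate driver-side problems from problems in the model call itself.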

Re: PySpark MLlib: py4j cannot find trainImplicitALSModel method

2015-07-08 Thread sooraj
a PC) to a remote Spark cluster. Not sure if that is possible. - Sooraj On 8 July 2015 at 15:31, Ashish Dutt wrote: > My apologies for double posting but I missed the web links that I followed > which are 1 > <http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-py

Re: mllib kmeans produce 1 large and many extremely small clusters

2015-08-10 Thread sooraj
Hi, The issue is very likely to be in the data or the transformations you apply, rather than anything to do with the Spark KMeans API as such. I'd start debugging by doing a bit of exploratory analysis of the TF-IDF vectors. That is, for instance, plot the distribution (histogram) of the TF-IDF valu
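A minimal sketch of that kind of exploratory check, in plain Python, is given here. It assumes a small sample of the vectors has already been brought to the driver (for instance with `takeSample` on the RDD) and converted to dicts of nonzero weights keyed by term index; the function name `summarize_tfidf` and that dict layout are illustrative choices, not part of any Spark API.

```python
def summarize_tfidf(vectors, n_bins=10):
    """Summarize a sample of sparse TF-IDF vectors.

    `vectors` is an iterable of dicts mapping term index -> weight
    (an assumed layout). Returns an equal-width histogram of the
    nonzero weights plus the number of nonzero entries per document.
    """
    values = []
    nnz_per_doc = []
    for vec in vectors:
        nonzero = [w for w in vec.values() if w != 0.0]
        nnz_per_doc.append(len(nonzero))
        values.extend(nonzero)

    empty_docs = sum(1 for n in nnz_per_doc if n == 0)
    if not values:
        return {"histogram": [], "nnz_per_doc": nnz_per_doc,
                "empty_docs": empty_docs}

    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every weight is identical.
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for w in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((w - lo) / width), n_bins - 1)
        counts[idx] += 1

    histogram = [(lo + i * width, lo + (i + 1) * width, c)
                 for i, c in enumerate(counts)]
    return {"histogram": histogram, "nnz_per_doc": nnz_per_doc,
            "empty_docs": empty_docs}
```

Many empty or near-empty documents, or a histogram dominated by one bin, would point to the data or the preprocessing rather than the clustering step.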