Hello,
I am using the ALS recommender from Spark MLlib. To select the optimal
rank, I hold out a number of users who used multiple items as my test set.
I then get the predictions for these users and compare them to the
observed ratings, using RegressionMetrics to estimate R^2.
I keep getting a negative value:
r2 =   -1.18966999676 explained var =  -1.18955347415 count =  11620309
Here is my PySpark code:

import math

from pyspark.ml.recommendation import ALS
from pyspark.mllib.evaluation import RegressionMetrics

train1.cache()
test1.cache()

numIterations = 10
for i in range(10):
    rank = 40 + i * 10
    als = ALS(rank=rank, maxIter=numIterations, implicitPrefs=False)
    model = als.fit(train1)
    # Pair each prediction with its observed rating and drop NaN predictions.
    predobs = (model.transform(test1)
               .select("prediction", "rating")
               .rdd
               .map(lambda p: (p.prediction, p.rating))
               .filter(lambda p: not math.isnan(p[0])))
    metrics = RegressionMetrics(predobs)
    mycount = predobs.count()
    myr2 = metrics.r2
    myvar = metrics.explainedVariance
    print("hooo %d r2 = %s explained var = %s count = %d"
          % (rank, myr2, myvar, mycount))
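
For reference, this is a minimal plain-Python sketch of how I understand
R^2 to be defined, R^2 = 1 - SS_res / SS_tot. I am assuming that
RegressionMetrics.r2 follows the same definition; the function name
r_squared and the sample pairs are made up for illustration. Under this
definition the value goes negative whenever the predictions have a larger
squared error than always predicting the mean observed rating.

```python
def r_squared(pairs):
    """Compute 1 - SS_res / SS_tot for a list of (prediction, observation)."""
    observations = [obs for _, obs in pairs]
    mean_obs = sum(observations) / float(len(observations))
    # Squared error of the predictions against the observations.
    ss_res = sum((obs - pred) ** 2 for pred, obs in pairs)
    # Squared error of a constant "always predict the mean" baseline.
    ss_tot = sum((obs - mean_obs) ** 2 for obs in observations)
    return 1.0 - ss_res / ss_tot


# Hypothetical (prediction, observation) pairs, not taken from my data.
close_pairs = [(1.1, 1.0), (2.9, 3.0), (5.2, 5.0)]
far_pairs = [(5.0, 1.0), (1.0, 3.0), (1.0, 5.0)]

print(r_squared(close_pairs))  # close to 1
print(r_squared(far_pairs))    # -3.5, worse than the mean baseline
```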



