I tested two different implementations to generate the predicted ranked
list. The first version takes the cartesian product of the user and product
features and then generates a predicted value for each (user, product) key...
The second version does a collect on the skinny matrix (most likely
products) and then br
There is a JIRA for it: https://issues.apache.org/jira/browse/SPARK-3066
The easiest case is when one side is small. If both sides are large,
this is a super-expensive operation. We can do a block-wise cross
product and then find the top-k for each user.
Best,
Xiangrui
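
A minimal sketch of the block-wise idea from the reply above, written against plain Scala collections rather than RDDs so it runs without a cluster. The names `BlockwiseTopK`, `topKPerUser`, and `blockSize` are hypothetical and do not come from Spark or from SPARK-3066; the factor vectors are assumed to be dense `Array[Double]` of equal rank. Products are scanned one block at a time, and each user keeps only a bounded heap of its best k candidates, so the full cross product is never materialized.

```scala
import scala.collection.mutable

object BlockwiseTopK {
  type Factors = Array[Double]

  // Dot product of two factor vectors of the same rank.
  def dot(a: Factors, b: Factors): Double = {
    require(a.length == b.length, "factor vectors must have the same rank")
    var s = 0.0
    var i = 0
    while (i < a.length) { s += a(i) * b(i); i += 1 }
    s
  }

  /** Top-k (product, score) pairs per user, scanning products block by block. */
  def topKPerUser(
      users: Seq[(Int, Factors)],
      products: Seq[(Int, Factors)],
      k: Int,
      blockSize: Int): Map[Int, Seq[(Int, Double)]] = {
    require(k > 0 && blockSize > 0, "k and blockSize must be positive")

    // Reversed ordering turns the queue into a min-heap on score:
    // the head is the weakest candidate currently kept for a user.
    val byLowestScore = Ordering.by[(Int, Double), Double](_._2).reverse
    val heaps: Map[Int, mutable.PriorityQueue[(Int, Double)]] =
      users.map { case (u, _) =>
        u -> mutable.PriorityQueue.empty[(Int, Double)](byLowestScore)
      }.toMap

    products.grouped(blockSize).foreach { block =>
      users.foreach { case (u, uf) =>
        val heap = heaps(u)
        block.foreach { case (p, pf) =>
          val score = dot(uf, pf)
          if (heap.size < k) {
            heap.enqueue((p, score))
          } else if (score > heap.head._2) {
            heap.dequeue()            // drop the weakest kept candidate
            heap.enqueue((p, score))
          }
        }
      }
    }

    // Return each user's candidates sorted by descending score.
    heaps.map { case (u, h) => u -> h.toSeq.sortBy(-_._2) }
  }
}
```

On a cluster, each (user block, product block) pair would presumably become one task and the per-user heaps would be merged afterwards; that distribution step is left out of this sketch.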
On Thu, Nov 6, 2014 at 4:51 PM,
So model.recommendProducts can only be called from the master, then? I have a
set of 20% of the users on whom I am performing the test...those users are in
an RDD...if I have to collect them all to the master node and then call
model.recommendProducts, that's an issue...
Any idea how to optimize this so that we
The ALS model contains RDDs, so you cannot call `model.recommendProducts`
inside an RDD closure such as `userProductsRDD.map`. -Xiangrui
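
One way to picture a workaround, as a sketch only: if the factor matrices are small enough to hold as plain local maps, the scoring function closes over ordinary data instead of the model. The code below uses local Scala collections to stand in for that idea; `LocalFactors` and `scoreAll` are hypothetical names, not part of MLlib, and whether the factors actually fit in memory is an assumption that has to be checked for the data set at hand.

```scala
object LocalFactorScoring {
  type Factors = Array[Double]

  // Hypothetical stand-in for factor matrices held as plain local maps.
  final case class LocalFactors(
      userFeatures: Map[Int, Factors],
      productFeatures: Map[Int, Factors])

  // Score every (user, product) pair whose factors are both known.
  // The closure passed to flatMap captures only plain maps, no model object.
  def scoreAll(
      factors: LocalFactors,
      pairs: Seq[(Int, Int)]): Seq[((Int, Int), Double)] =
    pairs.flatMap { case (user, product) =>
      for {
        uf <- factors.userFeatures.get(user)
        pf <- factors.productFeatures.get(product)
      } yield (user, product) -> uf.zip(pf).map { case (a, b) => a * b }.sum
    }
}
```

Pairs with an unknown user or product are silently dropped here; a caller that needs to see them would have to return an `Option` or a sentinel instead.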
On Thu, Nov 6, 2014 at 4:39 PM, Debasish Das wrote:
> I reproduced the problem in mllib tests ALSSuite.scala using the following
> functions:
>
> val arrayPredict = u
I reproduced the problem in mllib tests ALSSuite.scala using the following
functions:
val arrayPredict = userProductsRDD.map{case(user,product) =>
val recommendedProducts = model.recommendProducts(user, products)
val productScore = recommendedProducts.find{x=>x.product
Was "user" present in the training data? We can put a check there and return
NaN if the user is not included in the model. -Xiangrui
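
The check suggested above might look roughly like the following. This is a sketch over plain maps, not the MLlib implementation; `SafePredict` and `predictOrNaN` are hypothetical names, and the feature maps stand in for the model's user and product factors.

```scala
object SafePredict {
  // Returns the dot product of the two factor vectors, or NaN when either
  // the user or the product was not seen in training.
  def predictOrNaN(
      userFeatures: Map[Int, Array[Double]],
      productFeatures: Map[Int, Array[Double]])(user: Int, product: Int): Double =
    (userFeatures.get(user), productFeatures.get(product)) match {
      case (Some(uf), Some(pf)) => uf.zip(pf).map { case (a, b) => a * b }.sum
      case _                    => Double.NaN
    }
}
```

Callers should test the result with `isNaN` rather than `==`, since NaN never compares equal to itself, and should filter such values out before computing an aggregate like RMSE.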
On Mon, Nov 3, 2014 at 5:25 PM, Debasish Das wrote:
> Hi,
>
> I am testing MatrixFactorizationModel.predict(user: Int, product: Int) but
> the code fails on userFeatures.l
Hi,
I am testing MatrixFactorizationModel.predict(user: Int, product: Int) but
the code fails on userFeatures.lookup(user).head
In computeRmse, MatrixFactorizationModel.predict(RDD[(Int, Int)]) is
called, and all the test cases use that API...
I can perhaps refactor my code to d