[GitHub] spark pull request: [MLLIB] SPARK-4231: Add RankingMetrics to exam...

debasish83 Wed, 05 Nov 2014 14:55:03 -0800

Github user debasish83 commented on the pull request:

    https://github.com/apache/spark/pull/3098#issuecomment-61896998
  
    @srowen The RMSE is similar for the user and product views, so I did not 
call the multiclass metric. Also, per the earlier discussion with @mengxr, we are 
still not sure whether recommendation should be treated as multiclass classification.
    
    I refactored the code to compute the user MAP and the product MAP 
separately. I still compute RMSE the old way.
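
For readers unfamiliar with the two views, here is a minimal sketch of mean average precision (MAP) computed once per user (ranked products) and once per product (ranked users). It is plain Python on toy data, not the code in this pull request; the function names, the toy IDs, and the choice to normalize each average precision by the number of relevant items are assumptions made for illustration. Under this particular normalization MAP falls in [0, 1], so the figures reported below presumably use a different scaling.

```python
def average_precision(predicted, relevant):
    """Average precision of one ranked list against a set of relevant items.

    Assumption: normalize by the number of relevant items, one common convention.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(ranked, truth):
    """Mean of average precision over every key in `truth`.

    `ranked` maps a key to its ranked predictions; a missing key scores 0.
    """
    if not truth:
        return 0.0
    total = sum(average_precision(ranked.get(key, []), rel)
                for key, rel in truth.items())
    return total / len(truth)


# Hypothetical toy data: held-out (user, product) pairs from a test set.
user_truth = {"u1": {"p1", "p3"}, "u2": {"p2"}}                  # user view
product_truth = {"p1": {"u1"}, "p2": {"u2"}, "p3": {"u1"}}       # product view

# Hypothetical ranked recommendations from some model.
user_ranked = {"u1": ["p1", "p2", "p3"], "u2": ["p1", "p2"]}
product_ranked = {"p1": ["u1", "u2"], "p2": ["u1", "u2"], "p3": ["u2", "u1"]}

user_map = mean_average_precision(user_ranked, user_truth)
product_map = mean_average_precision(product_ranked, product_truth)
```

The same two functions serve both views; only the grouping key of the held-out pairs changes.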
    
    Here are the results on MovieLens:
    
    ./bin/spark-submit --master spark://tusca09lmlvt00c.uswin.ad.vzwcorp.com:7077 \
      --jars /Users/v606014/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
      --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
      --class org.apache.spark.examples.mllib.MovieLensALS \
      ./examples/target/spark-examples_2.10-1.2.0-SNAPSHOT.jar \
      --kryo --lambda 0.065 hdfs://localhost:8020/sandbox/movielens/
    2014-11-05 14:42:31.287 java[11464:1903] Unable to load realm mapping info from SCDynamicStore
    14/11/05 14:42:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Got 1000209 ratings from 6040 users on 3706 movies.
    Training: 799710, test: 200499.
    Test RMSE = 0.8974937349724504 user MAP = 7.432139222366133 product MAP = 12.203619639224904.
    
    Should I generate the test set with more specific random sampling? Right 
now I sample 20% of the whole RDD as the test set, but I could also sample per 
user or per product. Did that help you before in Oryx?
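
To make the per-user alternative concrete, here is a minimal sketch of a per-user holdout split. It is plain Python on in-memory tuples, not Spark code and not what the example in this pull request does; the function name `split_per_user`, the `(user, product, rating)` tuple layout, and the rule that every user keeps at least one training rating are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_per_user(ratings, test_fraction=0.2, seed=42):
    """Hold out roughly `test_fraction` of each user's ratings for testing.

    `ratings` is an iterable of (user, product, rating) tuples.
    Assumption: each user keeps at least one rating in the training set.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    train, test = [], []
    for user in sorted(by_user):  # sorted for a reproducible order
        rows = list(by_user[user])
        rng.shuffle(rows)
        n_test = int(len(rows) * test_fraction)
        n_test = min(n_test, len(rows) - 1)  # never empty a user's training data
        test.extend(rows[:n_test])
        train.extend(rows[n_test:])
    return train, test


# Hypothetical toy ratings: user "a" has 10, "b" has 5, "c" has 1.
toy = ([("a", "p%d" % i, 4.0) for i in range(10)]
       + [("b", "p%d" % i, 3.0) for i in range(5)]
       + [("c", "p0", 5.0)])
train_set, test_set = split_per_user(toy)
```

Unlike sampling 20% of all ratings at once, this keeps every user represented in training, so no user in the test set is unseen by the model. Grouping on `row[1]` instead would give the per-product variant.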



