We did the same: we switched back to 0.7 and the problem went away. Anyway, we are in trouble :)
2014-11-20 15:54 GMT+03:00 Wei Li <[email protected]>:

> Hi Serega:
>
> We have also tried the mahout 0.9 RecommenderJob, and also found that
> the result is not good either. We are now debugging the source code to
> find the possible issues. So how about the output of mahout 0.7? We will
> switch to this version if the result is acceptable, thanks.
>
> Best
> Wei
>
> On Tue, Nov 4, 2014 at 8:00 PM, Serega Sheypak <[email protected]>
> wrote:
>
> > Hi, I used org.apache.mahout.cf.taste.hadoop.item.RecommenderJob in
> > mahout 0.7 (CDH4)
> > Here are the parameters:
> > numRecommendations=1000
> > threshold=0.91
> > maxSimilaritiesPerItem=1000
> > maxPrefsPerUserInItemSimilarity=10
> > similarityClassname=SIMILARITY_LOGLIKELIHOOD
> >
> > Then I migrated to 0.9 (CDH5)
> > I've found one difference:
> > maxPrefsPerUserInItemSimilarity was renamed to maxPrefsInItemSimilarity
> >
> > The other thing is how it works.
> > I see this output in 0.7:
> >
> > USER_RATINGS_NEGLECTED=14954083
> > USER_RATINGS_USED=32355513
> > =====
> > COOCCURRENCES=72 503 210
> > PRUNED_COOCCURRENCES=0
> >
> > Output in 0.9:
> >
> > NEGLECTED_OBSERVATIONS=39 175 989
> > ROWS=4 937 362
> > USED_OBSERVATIONS=10 840 138
> > =====
> > org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
> > COOCCURRENCES=17 645 029
> > PRUNED_COOCCURRENCES=0
> >
> > And 0.9 gives me an awful result, just trash.
> >
> > I ran both over the same dataset:
> > mahout 0.7 is on the old production CDH4 cluster,
> > mahout 0.9 is on the new CDH5 cluster.
> >
> > Why is there such a huge difference? Is there any possibility to fix it?
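
For what it's worth, the counters quoted above can be put side by side with a few lines of Python. This is only a rough sketch: it assumes that USER_RATINGS_USED/NEGLECTED in 0.7 and USED/NEGLECTED_OBSERVATIONS in 0.9 count comparable things, which the thread does not confirm, and the dictionary layout and the `used_fraction` helper are made up for illustration. It does not explain the cause; it only quantifies how much less input 0.9 appears to keep.

```python
# Counter values copied from the two job outputs quoted in this thread.
# Assumption: the 0.7 "USER_RATINGS_*" and 0.9 "*_OBSERVATIONS" counters
# measure comparable quantities.
counters = {
    "0.7": {"used": 32_355_513, "neglected": 14_954_083, "cooccurrences": 72_503_210},
    "0.9": {"used": 10_840_138, "neglected": 39_175_989, "cooccurrences": 17_645_029},
}


def used_fraction(c):
    """Share of observations reported as used rather than neglected."""
    return c["used"] / (c["used"] + c["neglected"])


for version, c in counters.items():
    print(
        f"mahout {version}: used {used_fraction(c):.1%} of observations, "
        f"cooccurrences {c['cooccurrences']:,}"
    )

# How many times more cooccurrences 0.7 reported than 0.9.
ratio = counters["0.7"]["cooccurrences"] / counters["0.9"]["cooccurrences"]
print(f"0.7 counted {ratio:.1f}x as many cooccurrences as 0.9")
```

On these numbers 0.7 uses roughly two thirds of the observations while 0.9 uses roughly one fifth, and 0.7 reports about four times as many cooccurrences, so the two versions are clearly sampling the input very differently for the same dataset.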
