We did the same: we just switched back to 0.7 and the problem is gone.
Anyway, we are in trouble :)
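
One rough way to compare the two runs is to look at the share of
observations each version actually kept, using the counters quoted below.
This is only a sketch: it assumes that USER_RATINGS_USED /
USER_RATINGS_NEGLECTED in 0.7 measure the same thing as USED_OBSERVATIONS /
NEGLECTED_OBSERVATIONS in 0.9, which this thread does not confirm, and the
helper name used_fraction is made up for illustration.

```python
# Counter values copied from the job output quoted in this thread.
# Assumption: the 0.7 and 0.9 counters are comparable despite the renaming.
COUNTERS = {
    "0.7": {"used": 32_355_513, "neglected": 14_954_083, "cooccurrences": 72_503_210},
    "0.9": {"used": 10_840_138, "neglected": 39_175_989, "cooccurrences": 17_645_029},
}


def used_fraction(used: int, neglected: int) -> float:
    """Share of observations counted as used rather than neglected."""
    return used / (used + neglected)


for version, c in COUNTERS.items():
    share = used_fraction(c["used"], c["neglected"])
    print(f"Mahout {version}: {share:.1%} of observations used, "
          f"{c['cooccurrences']:,} cooccurrences")
```

On these numbers, 0.7 used roughly 68% of the observations and 0.9 roughly
22%, so 0.9 appears to discard far more input before computing
similarities. Whether that explains the poor recommendations is not
established here.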

2014-11-20 15:54 GMT+03:00 Wei Li <[email protected]>:

> Hi Serega:
>
>     We have also tried the Mahout 0.9 RecommenderJob and found that the
> results are not good either. We are now debugging the source code to
> find the possible issues. How is the output of Mahout 0.7? We will
> switch to that version if the results are acceptable. Thanks.
>
> Best
> Wei
>
> On Tue, Nov 4, 2014 at 8:00 PM, Serega Sheypak <[email protected]>
> wrote:
>
> > Hi, I used org.apache.mahout.cf.taste.hadoop.item.RecommenderJob in
> > Mahout 0.7 (CDH4).
> > Here are the parameters:
> > numRecommendations=1000
> > threshold=0.91
> > maxSimilaritiesPerItem=1000
> > maxPrefsPerUserInItemSimilarity=10
> > similarityClassname=SIMILARITY_LOGLIKELIHOOD
> >
> > Then I migrated to 0.9 (CDH5).
> > I've found one difference:
> > maxPrefsPerUserInItemSimilarity was renamed to maxPrefsInItemSimilarity.
> >
> > The other difference is in how it works.
> > I see this output in 0.7:
> >
> > USER_RATINGS_NEGLECTED=14954083
> >
> > USER_RATINGS_USED=32355513
> >
> > =====
> >
> > COOCCURRENCES=72 503 210
> >
> > PRUNED_COOCCURRENCES=0
> >
> >
> > output in 0.9:
> >
> > NEGLECTED_OBSERVATIONS=39 175 989
> >
> > ROWS=4 937 362
> >
> > USED_OBSERVATIONS=10 840 138
> >
> > =====
> >
> > org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
> > COOCCURRENCES=17 645 029
> >
> > PRUNED_COOCCURRENCES=0
> >
> >
> > And 0.9 gives me awful results, just trash.
> >
> > I ran both jobs over the same dataset:
> >
> > Mahout 0.7 is on the old production CDH4 cluster,
> >
> > Mahout 0.9 is on the new CDH5 cluster.
> >
> >
> >
> > Why is there such a huge difference? Is there any way to fix it?
> >
>
