Thanks Pat, very interesting indeed.

On Tue, Nov 3, 2015 at 6:20 PM, Pat Ferrel <[email protected]> wrote:
> A colleague of mine just built a MAP@k precision evaluator for the
> Mahout-based cooccurrence recommender we’ve been working on, and we ran
> some data scraped from rottentomatoes.com <http://rottentomatoes.com/>.
> They have “fresh” and “rotten” reviews tied to reviewer ids.
>
> A fair bit of discussion has gone on about how to use negative
> preferences. We have been saying that negative preferences might be
> predictive of positive preferences and the cross-cooccurrence code in the
> new SimilarityAnalysis.cooccurrence method can make the data usable.
>
> We took the RT data for two “actions”: “fresh” as the primary, the best
> indicator of preference, and “rotten” as the secondary indicator. We found
> that MAP using only “fresh” was bettered by almost 20% when we included
> “rotten” as the secondary cross-cooccurrence action. For the strict out
> there: we did not directly isolate the two actions, which is work
> remaining, so some of the lift might be due to just having more data. But
> it’s a really good first step, because more data doesn't always translate
> to better performance, and anyway it’s data you wouldn’t have otherwise.
>
> This opens up a new way to compare all sorts of other user signals, some
> long considered to be unusable by recommenders. Gender, location, category
> preferences are now fair game for testing.
>
> BTW we used this recommender, which is based on Mahout Samsara’s matrix
> math, cooccurrence and LLR.
> https://github.com/pferrel/scala-parallel-universal-recommendation
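
For anyone following along who hasn't computed MAP@k before, here is a minimal sketch of one common definition. It is not the evaluator Pat describes (that code isn't shown in this thread). The function names, the held-out "fresh" sets, and the choice to normalize by min(|relevant|, k) are all assumptions for illustration; other normalizations are in use, and the sketch assumes each recommendation list has no duplicate items.

```python
from typing import Dict, Sequence, Set


def average_precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average precision over the top-k recommendations for one user.

    Assumption: normalized by min(len(relevant), k); other conventions exist.
    """
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a hit occurs.
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def map_at_k(recs: Dict[str, Sequence[str]], held_out: Dict[str, Set[str]], k: int) -> float:
    """Mean of average_precision_at_k over users with at least one held-out item."""
    users = [u for u, items in held_out.items() if items]
    if not users:
        return 0.0
    total = sum(average_precision_at_k(recs.get(u, []), held_out[u], k) for u in users)
    return total / len(users)


# Hypothetical toy data: ranked recommendations and held-out "fresh" items per user.
recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
held_out = {"u1": {"a", "c"}, "u2": {"f"}}
print(round(map_at_k(recs, held_out, 3), 4))  # prints 0.5833
```

Comparing a "fresh"-only model against a "fresh" plus "rotten" model would then mean calling map_at_k once per model on the same held-out set and k.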
