To pick up an old thread…

A = views, users x items
B = purchases, users x items
A cross recommender:

    [B'A] h_v + [B'B] h_p = r_p

The [B'B] h_p term is the basic boolean Mahout recommender trained on purchases, and I assume we'll use that implementation. B'A gives the cooccurrences of views and purchases; multiplying it by the user's view history h_v gives a prediction of purchase preferences, cross-recommended from views. The same can be done for other non-purchase actions. The partial vectors are then summed and sorted, and the top item-value pairs are returned as recs. Hopefully I'm OK so far.

Now on to implementation. We'd like both user-history-based recs and, perhaps more importantly, item-history-based recs: items similar in purchase actions or, in this case, in views that cooccur with purchases.

[B'A] h_v is a model (a sparse matrix built from the two action matrices) times a user's view history (a sparse column vector). The model seems like something to pre-calculate, because the calc would be time consuming to repeat for each vector. But how do we calculate the item-to-item similarity? Pre-calculate all pairwise similarities so they are just a runtime lookup? That is also quite time consuming to build, but fast at runtime.

Here is where I'm fuzzy. To use Lucene, it seems we would take B'A and index it by row (or is it by column? with a field per value?), then take the original row of B'A corresponding to the item in question and use it as the query. Lucene would find the most similar rows and should be pretty fast, so we would not need to pre-calculate the similarities.

Any corrections are appreciated.
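To make the arithmetic concrete, here is a minimal sketch in plain Python. It is not Mahout or Lucene code, and every name in it (cooccurrence, apply_model, recommend, similar_items, the toy users and items) is hypothetical. It assumes the action data is stored one row per user and one column per item, so that B'A and B'B come out item-by-item. It uses raw cooccurrence counts rather than whatever similarity weighting the Mahout recommender applies, and it uses cosine similarity between rows of B'A only as a stand-in for the ranking a Lucene query would do; Lucene's own scoring is different.

```python
from collections import defaultdict
from math import sqrt


def cooccurrence(left, right):
    """Compute left' * right for boolean action data.

    left and right map user -> set of item ids (one row per user).
    Returns sparse rows {item_i: {item_j: count}}, where count is the
    number of users with item_i in left and item_j in right.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, left_items in left.items():
        for i in left_items:
            for j in right.get(user, ()):
                counts[i][j] += 1
    return {i: dict(row) for i, row in counts.items()}


def apply_model(model, history):
    """Multiply an item x item model by a boolean history vector (a set of items)."""
    scores = {}
    for i, row in model.items():
        total = sum(count for j, count in row.items() if j in history)
        if total:
            scores[i] = total
    return scores


def recommend(purchases, views, h_p, h_v, top_n):
    """r_p = [B'A] h_v + [B'B] h_p, sorted, top item-value pairs returned."""
    r_p = defaultdict(int)
    partials = (
        (cooccurrence(purchases, purchases), h_p),  # [B'B] h_p
        (cooccurrence(purchases, views), h_v),      # [B'A] h_v
    )
    for model, history in partials:
        for item, score in apply_model(model, history).items():
            r_p[item] += score
    return sorted(r_p.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def similar_items(model, item, top_n):
    """Rank other items by cosine similarity of their model rows.

    Assumption: cosine is only a placeholder for index-based ranking.
    """
    def cosine(u, v):
        dot = sum(w * v.get(j, 0) for j, w in u.items())
        norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
        return dot / norm if norm else 0.0

    query = model[item]
    scored = [(other, cosine(query, row)) for other, row in model.items() if other != item]
    return sorted(scored, key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy data (made up for illustration): three users, four items.
purchases = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"c", "d"}}
views = {"u1": {"a", "b", "c"}, "u2": {"b", "c", "d"}, "u3": {"c", "d"}}

# A user who purchased "a" and viewed "a" and "c".
print(recommend(purchases, views, h_p={"a"}, h_v={"a", "c"}, top_n=2))
# -> [('b', 4), ('a', 3)]

# Item-to-item: use the row of B'A for "a" as the query.
print([item for item, _ in similar_items(cooccurrence(purchases, views), "a", 3)])
# -> ['b', 'c', 'd']
```

Note that this sketch does not filter out items the user has already purchased, so "a" shows up in its own recs via the diagonal of B'B; a real recommender would presumably drop those. Recomputing the cooccurrence models on every call is only for brevity; in practice they are exactly the part you would pre-calculate.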