Hmm... each user's recommendations seem to cost about 2.8M items x 20
features ≈ 60M flops.  You should get about a Gflop per core in Java, so this
should take about 60 ms per user.  You can make this faster with more cores
or by using ATLAS.
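For concreteness, the inner loop behind that estimate looks something like
this (a minimal sketch, not Mahout's actual code; the array names are made
up):

```java
// Sketch of scoring one user against every item.
// itemFeatures is m x k (rows of M), user is length k (one row of U).
// Cost is roughly m * k multiply-adds, i.e. ~2.8M * 20 ≈ 60M flops here.
public class ScoreOneUser {
    static double[] scoreUser(double[][] itemFeatures, double[] user) {
        double[] scores = new double[itemFeatures.length];
        for (int i = 0; i < itemFeatures.length; i++) {
            double s = 0.0;
            for (int f = 0; f < user.length; f++) {
                s += itemFeatures[i][f] * user[f];
            }
            scores[i] = s;
        }
        return scores;
    }
}
```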

Are you expecting 3 million unique people every 80 hours?  If not, then it
is probably more efficient to compute the recommendations on the fly.
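On-the-fly is just the same dot products plus a bounded min-heap so you keep
only the top n instead of materializing all 2.8M scores per user (a sketch,
with illustrative names):

```java
import java.util.PriorityQueue;

// Sketch of on-the-fly top-n for one user: score every item, but retain
// only the n best in a size-bounded min-heap of {score, itemIndex} pairs.
public class TopN {
    // Returns indices of the n highest-scoring items, ascending by score.
    static int[] topN(double[][] itemFeatures, double[] user, int n) {
        // Min-heap ordered by score: the worst of the kept n is on top.
        PriorityQueue<double[]> heap =
            new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        for (int i = 0; i < itemFeatures.length; i++) {
            double s = 0.0;
            for (int f = 0; f < user.length; f++) {
                s += itemFeatures[i][f] * user[f];
            }
            if (heap.size() < n) {
                heap.add(new double[] {s, i});
            } else if (s > heap.peek()[0]) {
                heap.poll();
                heap.add(new double[] {s, i});
            }
        }
        int[] result = new int[heap.size()];
        for (int j = 0; j < result.length; j++) {
            result[j] = (int) heap.poll()[1];
        }
        return result;
    }
}
```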

How many recommendations per second are you expecting?  If you have 1
million uniques per day (just for grins) and we assume 20,000 s/day to
allow for peak loading, you have to handle 50 queries per second at peak.
At 60 ms each, that is about 3 cores' worth of work.  Use 16 to be safe.
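Spelling out that capacity arithmetic (the 1M uniques/day and 20,000 s/day
figures are just the assumptions above):

```java
// Back-of-envelope capacity check using the assumptions in the text:
// cores needed = peak queries/sec * seconds of CPU work per query.
public class Capacity {
    static double coresNeeded(double uniquesPerDay,
                              double effectiveSecondsPerDay,
                              double secondsPerQuery) {
        double peakQps = uniquesPerDay / effectiveSecondsPerDay;
        // Each second of wall time needs this many core-seconds of work.
        return peakQps * secondsPerQuery;
    }
}
```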

Regarding the 80 hours: 3 million x 60 ms = 180,000 seconds = 50 hours on a
single core, so I think your map-reduce is underperforming by about a factor
of 10.  This is quite plausible with a bad arrangement of the inner loops.  I
think you would get the highest performance computing the recommendations for
a few thousand items by a few thousand users at a time.  It might be just
about as fast to do all items against a few users at a time.  The reason for
this is that a dense multiply of an n x k matrix by a k x m matrix requires
c (n k + k m) memory ops, but n k m arithmetic ops.  If you can re-use data
many times, you can balance memory channel bandwidth against CPU speed.
Typically you need 20 or more re-uses to really make this fly.
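A sketch of that batching idea: score a block of users against all items as
one small dense multiply, so each item feature vector fetched from memory is
re-used across the whole user block (names and the block size are
illustrative, not a tuned implementation):

```java
// Blocked scoring: users is (nUsers x k), items is (nItems x k), and the
// result is (nUsers x nItems).  Processing users in blocks of blockSize
// means each item row is re-used blockSize times per load, trading memory
// bandwidth for arithmetic as described above.
public class BlockedScore {
    static double[][] score(double[][] users, double[][] items,
                            int blockSize) {
        int k = users[0].length;
        double[][] scores = new double[users.length][items.length];
        for (int u0 = 0; u0 < users.length; u0 += blockSize) {
            int u1 = Math.min(u0 + blockSize, users.length);
            for (int i = 0; i < items.length; i++) {
                double[] item = items[i];        // fetched once per block
                for (int u = u0; u < u1; u++) {  // re-used blockSize times
                    double s = 0.0;
                    for (int f = 0; f < k; f++) {
                        s += users[u][f] * item[f];
                    }
                    scores[u][i] = s;
                }
            }
        }
        return scores;
    }
}
```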


On Tue, Mar 5, 2013 at 4:15 PM, Josh Devins <[email protected]> wrote:

> Hi all,
>
> I have a conceptually simple problem. A user-item matrix, A, whose
> dimensions are ~2.6M rows x ~2.8M cols (~65M non-zeros). Running ALS with
> 20 features reduces this in the usual way to A = UM'. Trying to generate
> top-n (where n=100) recommendations for all users in U is quite a long
> process though. Essentially, for every user, it's generating a prediction
> for all unrated items in M then taking the top-n (all in-memory). I'm using
> the standard ALS `RecommenderJob` for this.
>
> Considering that there are ~2.6M users and ~2.8M items, this is a really,
> really, time consuming way to find the top-n recommendations for all users
> in U. I feel like there could be a tricky way to avoid having to compute
> all item predictions of a user though. I can't find any reference in papers
> about improving this but at the moment, the estimate (with 10 mappers
> running the `RecommenderJob`) is ~80 hours. When I think about this problem
> I wonder if applying kNN or locality-sensitive min-hashing would somehow help
> me. Basically find the nearest neighbours directly and calculate
> predictions on those items only and not every item in M. On the flip side,
> I could start to reduce the item space, since it's quite large, basically
> start removing items that have low in-degrees since these probably don't
> contribute too much to the final recommendations. I don't like this so much
> though as it could remove some of the long-tail recommendations. At least,
> that is my intuition :)
>
> Thoughts anyone?
>
> Thanks in advance,
>
> Josh
>