Many thanks, Owen, for the prompt replies. I will post an update here on the quality of the recommendations.

-----Original Message-----
From: Sean Owen [mailto:[email protected]]
Sent: 18 October 2012 18:01
To: [email protected]
Subject: Re: Pseudo-Inverse map reduce implementation

So you have a factorization like A = X * Y' and you are looking for the
right inverse of Y' (where Y is the item-feature matrix)? This is just
Y * pinv(Y' * Y).

Y' * Y takes a little work to compute, but can be done in one pass over
the matrix. Y' * Y is just a 1000x1000 matrix, which you can invert in
memory quickly. Then it's another multiply.

It shouldn't take 40 seconds, but it is also something you need not
compute at request time every time. Recomputing it only periodically,
rather than keeping a completely up-to-date right inverse, is not going
to affect things much, because Y won't change rapidly.

Sean

On Thu, Oct 18, 2012 at 1:21 PM, Ranjith Uthaman <[email protected]> wrote:
> The final pursuit is building a content-based recommender of the item for
> each user. The user-based and item-based recommenders of Mahout, as
> discussed in the MahoutInAction book, don't fare very well considering the
> data available. A content-based recommender approach is also hinted at in
> the book. Hence, we intend to use a linear-regression kind of model to
> achieve better recommendations. The confidential nature of the data does
> not allow it to be discussed here :-( , but the scale at which this needs
> to be performed is as follows:
>
> Number of users: 5-10 million
> Number of items: ~10000 [which might increase to a million in future]
> Feature vector of the item: 1000 [which might increase to 10000 features
> in future]
>
> We need to find the weight vector using the pseudo-inverse of the item
> matrix; for each user the matrix dimensions are essentially 10000 x 1000.
> But the number of users is large and this needs to be done frequently.
> On a single desktop machine with 2 cores and an average configuration, pinv
> of a matrix of such dimension takes around 40 seconds.
> This time is too long for customers using mobile web portals whose index page
> is completely customised using the recommendations results obtained above.
> Not to mention that, rendering of the results to create the page will take
> further computational time.
>
> Kindly guide.
>
> Thanks & Regards,
> Ranjith
>
>
> -----Original Message-----
> From: Sean Owen [mailto:[email protected]]
> Sent: 18 October 2012 12:48
> To: [email protected]
> Subject: Re: Pseudo-Inverse map reduce implementation
>
> I asked in reply on Quora -- what exactly are you computing? what is the size
> of input and are you talking about a generalized inverse.
> Depending on this there are easier ways than an SVD.
>
> On Thu, Oct 18, 2012 at 6:42 AM, Ranjith Uthaman <[email protected]>
> wrote:
>> Hi,
>>
>> Does map reduce implementation of Pseudo-Inverse of a matrix exist in the
>> current Mahout framework? What are the various ways to achieve it?
>>
>> Thanks & Regards,
>> RANJITH P UTHAMAN
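
As a rough illustration of the approach Sean describes in this thread (form Y' * Y in one pass over Y, invert the small matrix in memory, then do one more multiply), here is a minimal pure-Python sketch. The function names are hypothetical and are not part of Mahout. It assumes Y has full column rank, so a plain inverse stands in for pinv(Y' * Y), and it uses a toy 3 x 2 matrix rather than the 10000 x 1000 one discussed above.

```python
def gram_one_pass(rows, k):
    """Accumulate G = Y' * Y as a sum of outer products, one row of Y at a time."""
    g = [[0.0] * k for _ in range(k)]
    for row in rows:
        for i in range(k):
            ri = row[i]
            if ri == 0.0:
                continue  # a zero entry contributes nothing to row i of G
            gi = g[i]
            for j in range(k):
                gi[j] += ri * row[j]
    return g


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment m with the identity: [m | I] is reduced to [I | inv(m)].
    aug = [[float(v) for v in m[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Y' * Y is singular; Y lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                if f != 0.0:
                    aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def right_inverse(y):
    """Return R = Y * inv(Y' * Y), so that Y' * R = I (assumes full column rank)."""
    k = len(y[0])
    g_inv = invert(gram_one_pass(y, k))
    return [[sum(row[t] * g_inv[t][j] for t in range(k)) for j in range(k)]
            for row in y]


def matmul_transpose_left(y, r):
    """Compute Y' * R, used here only to check the result."""
    k = len(y[0])
    return [[sum(y[n][i] * r[n][j] for n in range(len(y)))
             for j in range(len(r[0]))]
            for i in range(k)]


if __name__ == "__main__":
    # Toy item-feature matrix: 3 items, 2 features (made-up values).
    y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    r = right_inverse(y)
    # Y' * R should be close to the 2 x 2 identity.
    print(matmul_transpose_left(y, r))
```

Because Y' * Y is a sum of per-row outer products, each mapper could in principle accumulate a partial k x k sum over its share of the rows and a single reducer could add the partial sums. That mapping onto MapReduce is an assumption of this sketch, not something confirmed in the thread. The result only needs recomputing when Y changes, not on every request.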
