The end goal is to build a content-based recommender that suggests items to 
each user. The user-based and item-based recommenders in Mahout, as discussed 
in the Mahout in Action book, do not fare very well on the data available. 
The book also hints at a content-based approach. 
Hence, we intend to use a linear-regression style model to achieve better 
recommendations. The confidential nature of the data does not allow it to be 
discussed here :-( , but the scale at which this needs to be performed is as 
follows:
Number of users: 5-10 million
Number of items: ~10,000 [which might increase to a million in future]
Item feature vector length: 1,000 [which might increase to 10,000 features in 
future]

We need to find the weight vector using the pseudo-inverse of the item matrix, 
and for each user the matrix dimensions are essentially 10000 x 1000. But the 
number of users is large, and this needs to be done frequently. 
On a single desktop machine with 2 cores and an average configuration, pinv of 
a matrix of these dimensions takes around 40 seconds. 
This is too long for customers using mobile web portals whose index page is 
completely customised using the recommendation results obtained above. 
Not to mention that rendering the results to create the page will take 
further computational time.
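
To make the computation concrete, here is a minimal sketch of the per-user 
fit in plain Python. It assumes the item matrix X (items x features) has full 
column rank, in which case pinv(X) * y is the same as the solution of the 
normal equations (X^T X) w = X^T y. The names fit_user_weights, item_features 
and user_targets are hypothetical and used only for illustration; this is not 
Mahout code, and the real data and target values are not shown here.

```python
# Illustrative only: least-squares weights for one user, pure Python.
# Assumption: item_features has full column rank, so solving
# (X^T X) w = X^T y gives the same w as pinv(X) applied to y.

def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w


def fit_user_weights(item_features, user_targets):
    """Hypothetical helper: weight vector for one user's targets."""
    xt = transpose(item_features)
    xtx = matmul(xt, item_features)          # features x features
    xty = [sum(x * y for x, y in zip(row, user_targets)) for row in xt]
    return solve(xtx, xty)
```

In this formulation the system being solved is only features x features 
(1000 x 1000 at the current scale), regardless of the number of items. 
Whether that helps in our case depends on the rank of the actual item matrix.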

Kindly guide.

Thanks & Regards,
Ranjith


-----Original Message-----
From: Sean Owen [mailto:[email protected]] 
Sent: 18 October 2012 12:48
To: [email protected]
Subject: Re: Pseudo-Inverse map reduce implementation

I asked in reply on Quora -- what exactly are you computing? What is the size 
of the input, and are you talking about a generalized inverse?
Depending on this, there are easier ways than an SVD.

On Thu, Oct 18, 2012 at 6:42 AM, Ranjith Uthaman <[email protected]> 
wrote:
> Hi,
>
> Does map reduce implementation of Pseudo-Inverse of a matrix exist in the 
> current Mahout framework? What are the various ways to achieve it?
>
> Thanks & Regards,
> RANJITH P UTHAMAN
