Careful, saying dot products are sometimes called “cosine” is misleading.
Cosine similarity = (x.dot(y)) / (norm(x)*norm(y)). That is not equal to
x.dot(y) unless the product of the norms is 1, e.g. when both vectors are
unit-normalized.
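
To make the distinction concrete, a minimal sketch in plain Python (the
example vectors are made up):

```python
import math

def dot(a, b):
    # Sum of elementwise products
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    # Euclidean (L2) norm
    return math.sqrt(dot(a, a))

def cosine(a, b):
    # Cosine of the angle between a and b
    return dot(a, b) / (norm(a) * norm(b))

x, y = [3.0, 4.0], [4.0, 3.0]
print(dot(x, y))     # 24.0
print(cosine(x, y))  # 24 / (5 * 5) = 0.96

# The two agree only once the vectors are normalized:
xu = [v / norm(x) for v in x]
yu = [v / norm(y) for v in y]
print(abs(dot(xu, yu) - cosine(x, y)) < 1e-12)  # True
```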

On Sun, Feb 5, 2017 at 10:36 AM, Pat Ferrel <[email protected]> wrote:

> Nice, someone does read the math :-)
>
> Content: The type of personalized “content” indicators described in the
> slides is not supported by the Universal Recommender and has little value
> unless you have no collaborative filtering data. They can theoretically be
> mixed with other indicators, but you must have a history of the content a
> user has preferred in some way, and that can also be seen as CF data. So
> that part of the theory has value only in very specific edge cases like
> personalized news, where stories mostly do not get enough events to use
> for CF. If this is your case, we can talk more. Most people have CF data,
> so content cannot be used in this way, but it can be used as “intrinsic”.
>
> Intrinsic: These are things like categories, tags, subjects, even derived
> indicators like LDA topics or popularity. They are attached to items as
> metadata. These are supported by the UR in several ways, including boosts
> and filters. Imagine an ecommerce use case where a user is looking at a
> piece of “clothing”; at the bottom of the page you show “people who bought
> this also bought these”, but you want only clothing, not the occasional
> video or electronics item. The things at the bottom of the page are
> “item-based” recommendations, not personalized, though they could also be
> personalized; no matter. The point is that of all recommendations you want
> to show only items that have “category”: [“clothing”]. So if you have
> attached this “intrinsic” indicator to items, you can query for item-based
> or user-based recs with category: clothing. You can filter out all
> recommendations that do not have the category, or you can boost items that
> have it; both are done by changing the “bias” value in the query. See this
> page: http://actionml.com/docs/ur_queries
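
As a rough sketch (field names follow the UR queries page linked above; the
user id, category values, and bias numbers here are purely illustrative),
the boost and filter variants of such a query look like:

```python
import json

# Boost variant: matching items get their scores multiplied up,
# but non-matching items can still be returned.
boost_query = {
    "user": "user-1",              # personalized recs for this user
    "fields": [{
        "name": "category",        # intrinsic indicator attached to items
        "values": ["clothing"],
        "bias": 10,                # positive bias = boost
    }],
}

# Filter variant: only items matching the field are returned.
filter_query = {
    "user": "user-1",
    "fields": [{
        "name": "category",
        "values": ["clothing"],
        "bias": -1,                # negative bias = filter
    }],
}

print(json.dumps(boost_query, indent=2))
```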
>
> Collaborative filtering based indicators: These are based on any action,
> bit of context, or profile info that you think may relate to the user’s
> taste or preferences. These are more correctly called indicators when they
> are gathered, but they go through a correlation test that checks whether
> the individual events appear to correlate with the conversion/primary
> event. After the test we call them correlators, and they are attached to
> items. So CF correlators of several types may be attached to each item,
> along with the intrinsic correlators.
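
For reference, the correlation test Mahout uses for CCO is (to my
understanding) the log-likelihood ratio over a 2x2 contingency table of
event co-occurrence counts. A minimal sketch, with made-up counts:

```python
import math

def xlogx(x):
    # x * log(x), with the 0 * log(0) = 0 convention
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table:
    k11 = users who did both the primary and the secondary event,
    k12, k21 = users who did only one, k22 = users who did neither."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))

# Events that co-occur no more than chance score ~0 and are dropped;
# strongly correlated events score high and become correlators.
print(llr(10, 10, 10, 10))   # 0.0 (independent)
print(llr(100, 1, 1, 100))   # large positive value (correlated)
```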
>
> The Universal Recommender creates a model of all items, with all CF and
> intrinsic correlators attached, in a Lucene index. The index allows very
> fast, scalable KNN queries (using cosine similarity). So when you ask the
> UR for user-based recommendations for user-1, we look up the recent events
> of user-1 and use them to make a KNN query to Lucene (inside
> Elasticsearch) for items that have similar correlators. If you ask for
> user-based recommendations but bias or boost clothing by 10, the UR will
> internally multiply the hit score for “clothing” by 10 and re-rank all
> results. This means that “clothing” will be favored in the results, but if
> there are no clothing recs, other types of recs may still be returned.
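
The boost-and-re-rank step can be sketched like this (the hit scores and
item metadata are made up; the real scoring happens inside Lucene):

```python
# Hypothetical hit scores from the KNN query, with item metadata attached.
hits = [
    {"item": "shirt-1", "score": 0.8, "category": "clothing"},
    {"item": "tv-1",    "score": 0.9, "category": "electronics"},
    {"item": "jeans-2", "score": 0.5, "category": "clothing"},
]

def rerank(hits, field, value, boost):
    # Multiply the hit score by the boost for items matching the field
    # value, then sort by the adjusted score, descending.
    def adjusted(hit):
        if hit.get(field) == value:
            return hit["score"] * boost
        return hit["score"]
    return sorted(hits, key=adjusted, reverse=True)

ranked = rerank(hits, "category", "clothing", 10)
print([h["item"] for h in ranked])  # ['shirt-1', 'jeans-2', 'tv-1']
```

Note that the non-matching item (tv-1) is still present at the bottom of
the list; a boost favors but does not exclude, unlike a filter.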
>
> Scores: These are literally the sum of the “dot products” of all
> indicators, with boosts accounted for. Dot products are sometimes called
> “cosine” since the cosine of the angle between two vectors is the dot
> product of the normalized vectors. Each indicator is a vector (refer back
> to the slides), and the total score is the sum of one vector times the
> entire matrix. If you then sum the dot products, you get the score for all
> items. Lucene actually does this, but it makes use of special indexing and
> the sparseness of the data and query. So the result from Lucene is the
> items that are the K nearest neighbors to the indicator vectors in the
> query. Conceptually Lucene does this for all items in the index, but it
> skips 99% of them and distributes queries to produce the answer very
> quickly. The math in the slides shows what you would get if you did the
> matrix math for all data; if you paginated and returned all
> recommendations, you would get exactly the results in the slides. But all
> you care about are the top k, hence KNN.
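
In dense form (ignoring Lucene’s sparse indexing and distribution), the
sum-of-dot-products scoring described above amounts to something like this
sketch; the items, indicator names, and vectors are all illustrative:

```python
# Toy example: two indicator types ("purchase" and "view") over four items.
# Each row of a correlator matrix is one item's correlator vector for that
# indicator; the user's history is a vector per indicator.
items = ["shirt-1", "tv-1", "jeans-2", "phone-3"]

purchase_correlators = [  # one row per item
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
view_correlators = [
    [1, 1],
    [0, 1],
    [1, 0],
    [0, 0],
]
user_purchases = [1, 0, 1]  # the user's recent primary events
user_views = [1, 1]         # the user's recent secondary events

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Score per item = sum over indicators of (user history . item correlators)
scores = [
    dot(row_p, user_purchases) + dot(row_v, user_views)
    for row_p, row_v in zip(purchase_correlators, view_correlators)
]
top_k = sorted(zip(items, scores), key=lambda p: p[1], reverse=True)[:2]
print(top_k)  # [('shirt-1', 4), ('jeans-2', 2)]
```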
>
> TL;DR: After the model is created with Mahout, the last phase of the
> matrix math, finding the most similar items, is done inside Elasticsearch,
> so one query returns the top-ranked results. The scores can be explained
> (by the math you read) but are of no real use; only the rank matters.
>
> BTW the CCO algorithm is partly implemented in Mahout, with the last
> phase in Elasticsearch, and you can get community support for the
> Universal Recommender here:
> https://groups.google.com/forum/#!forum/actionml-user
>
>
> On Feb 5, 2017, at 12:42 AM, Peng Zhang <[email protected]> wrote:
>
> Hi,
>
> Suppose we have created three types of indicators (cooccurrence, content,
> and intrinsic) and indexed them into Elasticsearch (ES). Then we query on
> these three types of indicators for a user to get recommended items. How
> does the Universal Recommender rank the items recommended based on these
> three types of indicators?
>
> I have gone thru the slides on Universal Recommender created by Pat. It's
> very informative. Here is the link:
> https://www.slideshare.net/mobile/pferrel/unified-recommender-39986309
>
> Thanks
> -Peng
>
>
