BTW you may be able to just run the same csv through several times, picking a different item-ID column for each “action”. Here “csv” means a text file with some delimiter, not full-spec CSV with headers, quoted values, and escaped characters.
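A minimal sketch of that multi-pass idea in Python, in case it helps. This is not part of Mahout: the function name `split_actions`, the column positions, and the sample rows are all made up for illustration. It uses a plain `split()` on the delimiter, matching the loose meaning of “csv” above.

```python
def split_actions(lines, user_col, item_cols, delim=","):
    """Build one (user ID, item ID) list per action from delimited text lines.

    item_cols maps an action name to the index of the column that plays
    the "item ID" role for that action. Plain split() only: no header,
    quoting, or escaping support.
    """
    pairs = {name: [] for name in item_cols}
    for line in lines:
        fields = [f.strip() for f in line.split(delim)]
        if len(fields) <= user_col or not fields[user_col]:
            continue  # skip blank or malformed lines
        for name, col in item_cols.items():
            if col < len(fields) and fields[col]:
                pairs[name].append((fields[user_col], fields[col]))
    return pairs


# Hypothetical transaction rows: customer id, item id, postal code, amount
sample = [
    "u1,iphone,94107,499",
    "u2,ipad,94107,329",
    "u1,nexus,10001,349",
]

actions = split_actions(sample, user_col=0,
                        item_cols={"purchase": 1, "location": 2})

# Each group would be written to its own file, one "user ID,item ID" per line.
for name, rows in actions.items():
    print(f"--- {name} ---")
    for user, item in rows:
        print(f"{user},{item}")
```

Each action name ends up with its own list of user ID,item ID pairs, which is the shape of input described in the message quoted below.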

On Dec 8, 2014, at 4:11 PM, Pat Ferrel wrote:

No classifier, just turn the one csv into several, each being a collection for one action.

user ID,item ID

Where the item ID is whatever the action corresponds to. For instance <user ID>,<location ID> for being at a location, or <user ID>,<item ID> for a purchase, etc. These can go directly into the command line of spark-itemsimilarity. --input will always be the file with purchases, --input2 will be the file with the secondary action.

On Dec 8, 2014, at 1:22 AM, Yash Patel wrote:

Most columns have different values. When you say preprocess, do you mean using classifiers? My dataset is highly structured in nature, so I don't understand how a classifier will work.

On Dec 8, 2014 2:20 AM, "Pat Ferrel" wrote:

> If there is some “filter” column that flags one type of item or another
> then yes. Otherwise you’ll have to preprocess your data for input.
>
> On Dec 7, 2014, at 2:27 PM, Yash Patel wrote:
>
> Will cross recommendation still work, considering item similarity checks
> multiple columns for items and my dataset has only one column for items?
> It contains different item ids.
>
> On Sun, Dec 7, 2014 at 5:26 PM, Pat Ferrel wrote:
>
>> To use cross-recommendations with multiple actions you may be able to get
>> away with using the pre-packaged command line job “spark-itemsimilarity”.
>> At one point you said you were more interested in the Mahout Hadoop
>> Mapreduce recommender, which cannot create these cross-recommendations.
>>
>> I don’t see any need to use the interactive Mahout or Spark shell. Calling
>> Scala from Java is pretty complex, so I’d recommend starting from the
>> running driver so you have a base of Scala code to start from. Calling
>> Java from Scala is dead simple; it’s done throughout Mahout code. This
>> should help make Scala a little less daunting. I use IntelliJ and there
>> should be no problem using Eclipse in the same manner.
>>
>> On Dec 6, 2014, at 3:55 PM, Yash Patel wrote:
>>
>> I have something that shows the user locations. However, is it possible to
>> implement this without using the Apache Spark shell? I found it quite
>> confusing to use without any examples.
>>
>> I have a Windows environment and I am using Java in Eclipse Luna to code
>> the recommender.
>>
>> On Dec 6, 2014 9:09 PM, "Pat Ferrel" wrote:
>>
>>> You can often think of or rephrase a piece of data (a column in your
>>> interaction data) as an action, like “being at a location”. Then use
>>> cross-cooccurrence to calculate a cross-indicator. So the location can
>>> be used to recommend purchases.
>>>
>>> If you do this, the location should be something that can have
>>> cooccurrence, so instead of lat-lon some part of an address. Maybe
>>> country+postal-code would be good. Something unique that identifies a
>>> location where other users can be.
>>>
>>> On Dec 5, 2014, at 11:10 AM, Ted Dunning wrote:
>>>
>>> Cross recommendation can apply if you use the multiple kinds of columns
>>> to impute actions relative to characteristics. That is, people at this
>>> location buy this item. Then when you do the actual query, the query
>>> contains detailed history of the person, but also recent location
>>> history.
>>>
>>> On Thu, Dec 4, 2014 at 7:17 AM, Yash Patel wrote:
>>>
>>>> Cross recommenders don't seem applicable because this dataset doesn't
>>>> represent different actions by a user; it just contains transaction
>>>> history (i.e. customer id, item id, shipping location, sales amount of
>>>> that item, item category, etc.).
>>>>
>>>> Maybe location, sales per item (similarity might lead to knowledge of
>>>> people who share the same purchasing patterns), etc.
>>>>
>>>> On Wed, Dec 3, 2014 at 5:28 PM, Ted Dunning wrote:
>>>>
>>>>> On Wed, Dec 3, 2014 at 6:22 AM, Yash Patel wrote:
>>>>>
>>>>>> I have multiple different columns such as category, shipping
>>>>>> location, item price, online user, etc.
>>>>>>
>>>>>> How can I use all these different columns to improve recommendation
>>>>>> quality (i.e. calculate more precise similarity between users by use
>>>>>> of location, item price)?
>>>>>
>>>>> For some kinds of information, you can build cross recommenders off of
>>>>> that other information. That incorporates this other information in an
>>>>> item-based system.
>>>>>
>>>>> Simply hand coding a similarity usually doesn't work well. The problem
>>>>> is that you don't really know which factors really represent
>>>>> actionable and non-redundant user similarity.
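
To make the “people at this location buy this item” idea from the thread concrete, here is a toy Python sketch that counts raw cross-cooccurrence between a primary action (purchase) and a secondary action (being at a location). It is only an illustration under stated assumptions: the function name `cross_cooccurrence`, the user IDs, and the country+postal-code keys are invented, and it stops at plain counts. As far as I know, spark-itemsimilarity goes further and scores such pairs with a log-likelihood ratio test, which this sketch does not attempt.

```python
from collections import defaultdict


def cross_cooccurrence(primary, secondary):
    """Count users shared by each (primary item, secondary item) pair.

    primary and secondary are lists of (user_id, item_id) pairs, one list
    per action. Only pairs with at least one shared user are returned.
    """
    users_by_primary = defaultdict(set)
    for user, item in primary:
        users_by_primary[item].add(user)

    users_by_secondary = defaultdict(set)
    for user, item in secondary:
        users_by_secondary[item].add(user)

    counts = {}
    for p_item, p_users in users_by_primary.items():
        for s_item, s_users in users_by_secondary.items():
            shared = len(p_users & s_users)
            if shared:
                counts[(p_item, s_item)] = shared
    return counts


# Hypothetical data: purchases, and locations keyed by country+postal-code
purchases = [("u1", "iphone"), ("u2", "iphone"), ("u3", "nexus")]
locations = [("u1", "US-94107"), ("u2", "US-94107"), ("u3", "US-10001")]

counts = cross_cooccurrence(purchases, locations)
for (item, place), n in sorted(counts.items()):
    print(f"{item} <-> {place}: {n} shared user(s)")
```

A coarse location key such as country+postal-code is what lets several users share the same value; with raw lat-lon almost every location would be unique and no pair would cooccur.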
