Most columns have different values. When you say preprocess, do you mean using classifiers?
My dataset is highly structured, so I don't understand how a classifier would work.

On Dec 8, 2014 2:20 AM, "Pat Ferrel" <[email protected]> wrote:

> If there is some “filter” column that flags one type of item or another
> then yes. Otherwise you’ll have to preprocess your data for input.
>
> On Dec 7, 2014, at 2:27 PM, Yash Patel <[email protected]> wrote:
>
> Will cross-recommendation still work, considering that item similarity
> checks multiple columns for items and my dataset has only one column for
> items? It contains different item IDs.
>
> On Sun, Dec 7, 2014 at 5:26 PM, Pat Ferrel <[email protected]> wrote:
>
> > To use cross-recommendations with multiple actions you may be able to get
> > away with using the pre-packaged command line job “spark-itemsimilarity”.
> > At one point you said you were more interested in the Mahout Hadoop
> > MapReduce recommender, which cannot create these cross-recommendations.
> >
> > I don’t see any need to use the interactive Mahout or Spark shell.
> > Calling Scala from Java is pretty complex so I’d recommend starting from
> > the running driver so you have a base of Scala code to start from.
> > Calling Java from Scala is dead simple, it’s done throughout Mahout code.
> > This should help make Scala a little less daunting. I use IntelliJ and
> > there should be no problem using Eclipse in the same manner.
> >
> > On Dec 6, 2014, at 3:55 PM, Yash Patel <[email protected]> wrote:
> >
> > I have something that shows the user locations; however, is it possible
> > to implement this without using the Apache Spark shell, as I found it
> > quite confusing to use without any examples?
> >
> > I have a Windows environment and I am using Java in Eclipse Luna to code
> > the recommender.
> >
> > On Dec 6, 2014 9:09 PM, "Pat Ferrel" <[email protected]> wrote:
> >
> >> You can often think of or rephrase a piece of data (a column in your
> >> interaction data) as an action, like “being at a location”. Then use
> >> cross-cooccurrence to calculate a cross-indicator. So the location can
> >> be used to recommend purchases.
> >>
> >> If you do this, the location should be something that can have
> >> cooccurrence, so instead of lat-lon some part of an address. Maybe
> >> country+postal-code would be good. Something unique that identifies a
> >> location where other users can be.
> >>
> >> On Dec 5, 2014, at 11:10 AM, Ted Dunning <[email protected]> wrote:
> >>
> >> Cross recommendation can apply if you use the multiple kinds of columns
> >> to impute actions relative to characteristics. That is, people at this
> >> location buy this item. Then when you do the actual query, the query
> >> contains detailed history of the person, but also recent location
> >> history.
> >>
> >> On Thu, Dec 4, 2014 at 7:17 AM, Yash Patel <[email protected]>
> >> wrote:
> >>
> >>> Cross recommenders don't seem applicable because this dataset doesn't
> >>> represent different actions by a user; it just contains transaction
> >>> history (i.e. customer id, item id, shipping location, sales amount of
> >>> that item, item category, etc.).
> >>>
> >>> Maybe location, sales per item (similarity might lead to knowledge of
> >>> people who share the same purchasing patterns), etc.
> >>>
> >>> On Wed, Dec 3, 2014 at 5:28 PM, Ted Dunning <[email protected]>
> >>> wrote:
> >>>
> >>>> On Wed, Dec 3, 2014 at 6:22 AM, Yash Patel <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> I have multiple different columns such as category, shipping
> >>>>> location, item price, online user, etc.
> >>>>>
> >>>>> How can I use all these different columns and improve recommendation
> >>>>> quality (i.e. calculate more precise similarity between users by use
> >>>>> of location, item price)?
> >>>>>
> >>>>
> >>>> For some kinds of information, you can build cross recommenders off of
> >>>> that other information. That incorporates this other information in an
> >>>> item-based system.
> >>>>
> >>>> Simply hand coding a similarity usually doesn't work well. The problem
> >>>> is that you don't really know which factors really represent
> >>>> actionable and non-redundant user similarity.
> >>>>
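
For what it's worth, the kind of preprocessing discussed in the thread above does not need a classifier. It can be a plain reshaping step: each transaction row becomes one "purchase" row and one "location" row, with the location built from country+postal-code as suggested. The sketch below is only an illustration under stated assumptions. The column names (customer_id, item_id, country, postal_code), the action labels ("purchase", "location"), the tab delimiter, the sample data, and the helper names are all hypothetical, and the exact input layout that spark-itemsimilarity expects should be checked against the Mahout documentation.

```python
import csv
import io

# Hypothetical sample of the transaction history described in the thread.
SAMPLE = """customer_id,item_id,country,postal_code,sales_amount,category
u1,iphone,US,98101,499.00,electronics
u2,ipad,US,98101,399.00,electronics
u1,case,US,98101,19.00,accessories
"""


def to_action_rows(transactions):
    """Reshape transactions into (user, action, item) rows.

    Each transaction yields a "purchase" row for the item and a "location"
    row whose item is country+postal-code, so that several users can share
    the same location value.
    """
    rows = []
    for t in transactions:
        user = t["customer_id"]
        rows.append((user, "purchase", t["item_id"]))
        location = f'{t["country"]}-{t["postal_code"]}'
        rows.append((user, "location", location))
    # Drop exact duplicates while keeping first-seen order.
    return list(dict.fromkeys(rows))


def write_rows(rows, out):
    """Write rows as tab-delimited text (the delimiter is an assumption)."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


transactions = list(csv.DictReader(io.StringIO(SAMPLE)))
action_rows = to_action_rows(transactions)

buffer = io.StringIO()
write_rows(action_rows, buffer)
print(buffer.getvalue(), end="")
```

The middle column then plays the role of the "filter" column Pat mentions: one value marks the primary action (purchases) and the other marks the secondary action (being at a location).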
