No, the ids do not start from 0 and do not cover all numbers between 0 and the number of items/users. We prefilter the dataset before we run Mahout on it (a user must have bought at least 5 products, and a product must have been bought by at least 3 users). That is why the ids start with user 3, then jump to user 5, etc.
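
To make the prefiltering concrete, here is a minimal Python sketch of that kind of filter, followed by a remap to the 0-based ordinal ids Pat mentions. It is not our actual pipeline: the function names `prefilter` and `to_ordinal` are made up for illustration, and I assume the counts are taken once on the raw data (not recomputed after filtering) and that duplicate purchases count once.

```python
from collections import Counter


def prefilter(pairs, min_items_per_user=5, min_users_per_item=3):
    """Keep (user, item) pairs whose user and item both pass the thresholds.

    Assumption: counts are computed once on the raw data, in a single pass.
    """
    unique_pairs = set(pairs)  # a repeated purchase counts once
    items_per_user = Counter(user for user, _ in unique_pairs)
    users_per_item = Counter(item for _, item in unique_pairs)
    return sorted(
        (user, item)
        for user, item in unique_pairs
        if items_per_user[user] >= min_items_per_user
        and users_per_item[item] >= min_users_per_item
    )


def to_ordinal(pairs):
    """Remap user and item ids to dense ids 0..n-1, in sorted order.

    Returns the remapped pairs plus both lookup tables, so the ids in the
    recommender output can be translated back afterwards.
    """
    user_index = {u: k for k, u in enumerate(sorted({u for u, _ in pairs}))}
    item_index = {i: k for k, i in enumerate(sorted({i for _, i in pairs}))}
    remapped = [(user_index[u], item_index[i]) for u, i in pairs]
    return remapped, user_index, item_index


# A few rows taken from the input sample quoted further down.
sample = [(3, 7414), (3, 12682), (5, 3), (5, 2123), (6, 3), (6, 456)]
remapped, user_index, item_index = to_ordinal(sample)
print(remapped)  # [(0, 3), (0, 4), (1, 0), (1, 2), (2, 0), (2, 1)]
```

Inverting `user_index` and `item_index` gives the original ids back once the recommendations have been produced.
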
Is this wrong? Should we use all data as input to Mahout and do the filtering inside Mahout? We use the second latest version of Mahout!

Best regards, Niklas

On Tuesday, November 24, 2015, Pat Ferrel <[email protected]> wrote:

> Do your ids start with 0 and cover all numbers between 0 and the number of
> items -1 (same for user ids)?
> The old hadoop-mahout code required ordinal ids starting at 0
>
> On Nov 24, 2015, at 8:19 AM, Niklas Ekvall <[email protected]>
> wrote:
>
> Hi Pat,
>
> Here is some input:
>
> 3 7414
> 3 12682
> 3 18947
> 3 19980
> 3 26975
> 3 54635
> 3 67789
> 3 73212
> 3 118932
> 3 138846
> 3 141268
> 5 3
> 5 2123
> 5 37955
> 5 39975
> 5 113289
> 6 3
> 6 456
> 6 2188
> 6 2496
> 6 6194
> 6 6361
> 6 6768
> 6 6919
> 6 6920
> 6 7257
> 6 7705
> 6 7706
> 6 11788
>
> And some output:
>
> 3
> [122086:1.0,1846:1.0,74638:1.0,63240:1.0,87540:1.0,2742:1.0,2981:1.0,8325:1.0,145598:1.0,49675:1.0,131388:1.0,72113:1.0,3493:1.0,56131:1.0,30422:1.0,87829:1.0,111190:1.0,13597:1.0,83436:1.0,61772:1.0]
> 5
> [32349:1.0,29413:1.0,111896:1.0,61845:1.0,50016:1.0,1607:1.0,15237:1.0,133229:1.0,65805:1.0,34034:1.0,133071:1.0,28894:1.0,18658:1.0,32095:1.0,4402:1.0,47522:1.0,31022:1.0,23936:1.0,6243:1.0,53214:1.0]
> 6
> [40756:1.0,34420:1.0,31153:1.0,114717:1.0,53945:1.0,71148:1.0,26095:1.0,112941:1.0,55284:1.0,111346:1.0,112201:1.0,65759:1.0,133127:1.0,61378:1.0,16413:1.0,113289:1.0,49675:1.0,14995:1.0,141028:1.0,27506:1.0]
>
> Best regards, Niklas
>
> 2015-11-24 16:48 GMT+01:00 Pat Ferrel <[email protected]>:
>
> > Sounds like you may not have the input right. Recommendations should be
> > sorted by the strength and so shouldn’t all be 1 unless the data is very
> > odd.
> >
> > Can you give us a small sample of the input?
> >
> > BTW a newer recommender using Mahout’s Spark based code and a search
> > engine is here:
> > https://github.com/PredictionIO/template-scala-parallel-universal-recommendation
> > a single machine install script is here: https://docs.prediction.io/start/
> >
> > On Nov 24, 2015, at 2:16 AM, Niklas Ekvall <[email protected]>
> > wrote:
> >
> > Hello Mahout Users!
> >
> > Today I use Mahout - Recommenditembased with Log-similarity to produce
> > personal recommendations for Trigger Emails in an offline mode. But when I
> > produce e.g. 50 recommendations the rank values of the recommendations are
> > always of magnitude 1. Why is this so? And is the first recommendation in
> > this list the best one, or is there some randomness in this list?
> >
> > Best regards,
> >
> > Niklas Ekvall
