Re: ALS array index out of bound with 50 factors

2014-04-07 Thread Xiangrui Meng
Hi Deb, It would be helpful if you can attach the logs. It is strange to see that you can make 4 iterations but not 10. Xiangrui On Mon, Apr 7, 2014 at 10:36 AM, Debasish Das wrote: > I am using master... > > No negative indexes... > > If I run with 4 iterations it runs fine and I can generat

Re: ALS array index out of bound with 50 factors

2014-04-07 Thread Debasish Das
I am using master... No negative indexes... If I run with 4 iterations it runs fine and I can generate factors... With 10 iterations run fails with array index out of bound... 25m users and 3m products are within int limits Does it help if I can point the logs for both the runs to you ? I

Re: ALS array index out of bound with 50 factors

2014-04-07 Thread Xiangrui Meng
Hi Deb, This thread is for the out-of-bound error you described. I don't think the number of iterations has any effect here. My questions were: 1) Are you using the master branch or a particular commit? 2) Do you have negative or out-of-integer-range user or product ids? Try to print out the max
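
A minimal sketch of the id sanity check asked about in question 2, written in plain Python over an in-memory list rather than against a Spark RDD. The message above is cut off, so this is only one reading of what to print; the function name `id_range_report` and the `(user, product, rating)` tuple layout are assumptions made for illustration, not part of MLlib.

```python
# Signed 32-bit bounds, i.e. the range of a JVM Int.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def id_range_report(ratings):
    """Summarise user and product id ranges.

    `ratings` is an iterable of (user_id, product_id, rating) tuples
    (a hypothetical layout chosen for this sketch).
    """
    ratings = list(ratings)  # allow generators; we iterate more than once
    columns = {
        "user": [u for u, _, _ in ratings],
        "product": [p for _, p, _ in ratings],
    }
    report = {}
    for name, ids in columns.items():
        lo, hi = min(ids), max(ids)
        report[name] = {
            "min": lo,
            "max": hi,
            "negative": lo < 0,
            "fits_int32": INT_MIN <= lo and hi <= INT_MAX,
        }
    return report


if __name__ == "__main__":
    sample = [(0, 10, 4.0), (24_999_999, 2_999_999, 1.0)]
    for name, stats in id_range_report(sample).items():
        print(name, stats)
```

On a real dataset the same min/max reduction would be run over the distributed ratings; an id that is negative or outside the 32-bit range would be the first thing to rule out.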

Re: ALS array index out of bound with 50 factors

2014-04-07 Thread Debasish Das
Nick, I already have this code which calls dictionary generation and then maps string etc to ints...I think the core algorithm should stay in ints...if you like I can add this code in MFUtils.scala...that's the convention I followed similar to MLUtils.scala...actually these functions should be ev
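
The dictionary approach described above can be sketched as follows. This is not the code from the proposed MFUtils.scala, which is not shown in the thread; it is a plain-Python illustration, and the names `build_id_dictionary` and `encode_ratings` are hypothetical. It assigns each distinct string id a dense integer in order of first appearance, so the core algorithm only ever sees ints.

```python
def build_id_dictionary(raw_ids):
    """Map each distinct raw id to a dense int, in first-seen order."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping


def encode_ratings(ratings):
    """Replace string user/product ids with ints.

    `ratings` is an iterable of (user, product, rating) tuples.
    Returns the encoded ratings plus both dictionaries, so results
    can be translated back to the original ids afterwards.
    """
    ratings = list(ratings)  # iterated three times below
    user_index = build_id_dictionary(u for u, _, _ in ratings)
    product_index = build_id_dictionary(p for _, p, _ in ratings)
    encoded = [(user_index[u], product_index[p], r) for u, p, r in ratings]
    return encoded, user_index, product_index


if __name__ == "__main__":
    raw = [("u-a", "p-1", 5.0), ("u-b", "p-1", 3.0), ("u-a", "p-2", 1.0)]
    encoded, users, products = encode_ratings(raw)
    print(encoded)
    print(users, products)
```

Keeping the dictionaries alongside the encoded data is the design point: the String <-> Int translation happens once at the boundary instead of throughout the pipeline.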

Re: ALS array index out of bound with 50 factors

2014-04-07 Thread Nick Pentreath
On the partitioning / id keys. If we would look at hash partitioning, how feasible will it be to just allow the user and item ids to be strings? A lot of the time these ids are strings anyway (UUIDs and so on), and it's really painful to translate between String <-> Int the whole time. Are there a

Re: ALS array index out of bound with 50 factors

2014-04-06 Thread Debasish Das
Hi Xiangrui, With 4 ALS iterations it runs fine...If I run 10 I am failing...I believe I have to cut the lineage chain and call checkpoint...Trying to follow the other email chain on checkpointing... Thanks. Deb On Sun, Apr 6, 2014 at 9:08 PM, Xiangrui Meng wrote: > Hi Deb, > > Are you using

Re: ALS array index out of bound with 50 factors

2014-04-06 Thread Xiangrui Meng
Hi Deb, Are you using the master branch or a particular commit? Do you have negative or out-of-integer-range user or product ids? There is an issue with ALS' partitioning (https://spark-project.atlassian.net/browse/SPARK-1281), but I'm not sure whether that is the reason. Could you try to see whet

ALS array index out of bound with 50 factors

2014-04-05 Thread Debasish Das
Hi, I deployed apache/spark master today and recently there were many ALS related checkins and enhancements.. I am running ALS with explicit feedback and I remember most enhancements were related to implicit feedback... With 25 factors my runs were successful but with 50 factors I am getting arr