Re: MLlib - Collaborative Filtering - trainImplicit task size

2015-04-27 Thread Xiangrui Meng
Could you try different ranks and see whether the task size changes? We do use YtY in the closure, which should work the same as broadcast. If that is the case, it should be safe to ignore this warning. -Xiangrui
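
A quick way to run that check, as a minimal sketch: train at a few ranks and watch whether the reported task size grows roughly as rank^2, the size of the rank x rank YtY matrix captured in the closure. Here `ratings` is a placeholder for the implicit-feedback input RDD, and the other ALS parameters are arbitrary:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    import org.apache.spark.rdd.RDD

    def checkTaskSize(ratings: RDD[Rating]): Unit = {
      for (rank <- Seq(10, 50, 100)) {
        // Watch the TaskSetManager warnings in the driver log for each run:
        // if the reported task size scales roughly with rank * rank, the
        // YtY matrix in the task closure is the likely cause.
        ALS.trainImplicit(ratings, rank, 10, 0.01, 1.0)
      }
    }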

Re: MLlib - Collaborative Filtering - trainImplicit task size

2015-04-23 Thread Christian S. Perone
All these warnings come from the ALS iterations, from flatMap and also from aggregate. For instance, the origin of the stage where the flatMap is showing these warnings (with Spark 1.3.0; they are also shown in Spark 1.3.1):
org.apache.spark.rdd.RDD.flatMap(RDD.scala:296)
org.apache.spark.ml.recommendati…

Re: MLlib - Collaborative Filtering - trainImplicit task size

2015-04-22 Thread Xiangrui Meng
This is the size of the serialized task closure. Is stage 246 part of ALS iterations, or something before or after it? -Xiangrui

Re: MLlib - Collaborative Filtering - trainImplicit task size

2015-04-21 Thread Christian S. Perone
Hi Sean, thanks for the answer. I tried to call repartition() on the input with many different sizes, and it still shows that warning message.

Re: MLlib - Collaborative Filtering - trainImplicit task size

2015-04-21 Thread Sean Owen
I think maybe you need more partitions in your input, which might make for smaller tasks?
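
A minimal sketch of that suggestion, assuming `ratings` is the RDD[Rating] from the original post below; the partition count is a placeholder to tune:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    import org.apache.spark.rdd.RDD

    def trainWithMorePartitions(ratings: RDD[Rating]) = {
      // More, smaller partitions should mean smaller per-task payloads.
      val repartitioned = ratings.repartition(500) // placeholder count
      ALS.trainImplicit(repartitioned, 10, 10, 0.01, 1.0)
    }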

MLlib - Collaborative Filtering - trainImplicit task size

2015-04-20 Thread Christian S. Perone
I keep seeing these warnings when using trainImplicit:
WARN TaskSetManager: Stage 246 contains a task of very large size (208 KB). The maximum recommended task size is 100 KB.
And then the task size starts to increase. Is this a known issue? Thanks!
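
For reference, a minimal sketch of the kind of call that can produce this warning on a large input; the path, parsing, and parameters here are placeholders, not the original code:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    // `sc` is the SparkContext; the input is assumed to be user,item,count lines.
    val ratings = sc.textFile("hdfs:///path/to/implicit_feedback.csv").map { line =>
      val Array(user, item, count) = line.split(',')
      Rating(user.toInt, item.toInt, count.toDouble)
    }

    // rank = 10, iterations = 10, lambda = 0.01, alpha = 1.0
    val model = ALS.trainImplicit(ratings, 10, 10, 0.01, 1.0)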

Re: MLlib -Collaborative Filtering

2015-04-19 Thread Nick Pentreath
> …algorithm? by that i mean the similarity between 2 users

Re: MLlib -Collaborative Filtering

2015-04-19 Thread Christian S. Perone

MLlib -Collaborative Filtering

2015-04-18 Thread riginos
Is there any way that I can see the similarity table of 2 users in that algorithm? By that I mean the similarity between 2 users.
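
The replies above are truncated, but one common approach (a sketch, not necessarily what was suggested in the thread) is to compare the two users' latent factor vectors from the trained model, e.g. by cosine similarity:

    import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

    // Cosine similarity between two users' latent factors; `lookup` assumes
    // both user ids exist in the trained model.
    def userSimilarity(model: MatrixFactorizationModel, u1: Int, u2: Int): Double = {
      val f1 = model.userFeatures.lookup(u1).head
      val f2 = model.userFeatures.lookup(u2).head
      val dot = f1.zip(f2).map { case (a, b) => a * b }.sum
      val n1 = math.sqrt(f1.map(x => x * x).sum)
      val n2 = math.sqrt(f2.map(x => x * x).sum)
      dot / (n1 * n2)
    }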

Re: MLlib -Collaborative Filtering

2015-04-18 Thread Nick Pentreath

MLlib -Collaborative Filtering

2015-04-18 Thread riginos
Is there any way that I can see the similarity table of 2 users in that algorithm?

Re: MLlib Collaborative Filtering failed to run with rank 1000

2014-10-03 Thread Xiangrui Meng
> …n a fixed amount of hardware.

Re: MLlib Collaborative Filtering failed to run with rank 1000

2014-10-03 Thread jw.cmu

Re: MLlib Collaborative Filtering failed to run with rank 1000

2014-10-03 Thread Xiangrui Meng

MLlib Collaborative Filtering failed to run with rank 1000

2014-10-03 Thread jw.cmu
… org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
Spark version: 1.0.2
Number of workers: 9
Cores per worker: 16
Memory per worker: 120GB
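
Xiangrui's reply above is truncated, but a back-of-envelope check shows why rank 1000 is demanding: ALS solves a rank x rank normal equation per user, so memory for those matrices grows quadratically with rank. A sketch of the arithmetic; the block size is a hypothetical figure, not from the thread:

    val rank = 1000
    val matrixBytes = rank.toLong * rank * 8        // one normal-equation matrix: ~8 MB
    val usersPerBlock = 100000L                     // hypothetical users in one block
    val ifHeldAtOnce = usersPerBlock * matrixBytes  // ~800 GB if materialized together
    println(s"per-user matrix: ${matrixBytes / 1000000} MB")
    println(s"per-block upper bound: ${ifHeldAtOnce / 1000000000L} GB")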