Re: Information on Spark UI

2014-06-11 Thread Shuo Xiang
Using MEMORY_AND_DISK_SER to persist the input RDD[Rating] seems to work right for me now. I'm testing on a larger dataset and will see how it goes.

On Wed, Jun 11, 2014 at 9:56 AM, Neville Li wrote:
> Does cache eviction affect disk storage level too? I tried cranking up
> replication but stil

Re: Information on Spark UI

2014-06-11 Thread Neville Li
Does cache eviction affect the disk storage level too? I tried cranking up replication but am still seeing this.

On Wednesday, June 11, 2014, Shuo Xiang wrote:
> Daniel,
> Thanks for the explanation.
>
> On Wed, Jun 11, 2014 at 8:57 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:

Re: Information on Spark UI

2014-06-11 Thread Shuo Xiang
Daniel,
Thanks for the explanation.

On Wed, Jun 11, 2014 at 8:57 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:
> About more succeeded tasks than total tasks:
> - This can happen if you have enabled speculative execution. Some
> partitions can get processed multiple times.
> -

Re: Information on Spark UI

2014-06-11 Thread Daniel Darabos
About more succeeded tasks than total tasks:
- This can happen if you have enabled speculative execution. Some partitions can get processed multiple times.
- More commonly, the result of the stage may be used in a later calculation, and has to be recalculated. This happens if some of the results
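
The counting effect described above can be sketched with a toy model. This is not Spark code: `StageCounter` and its methods are hypothetical names, and the sketch assumes the progress counter is incremented once per successful task attempt while the total stays fixed at the number of partitions.

```python
from dataclasses import dataclass, field


@dataclass
class StageCounter:
    """Toy model of a per-stage progress counter (hypothetical, not Spark's implementation)."""
    total_tasks: int                           # one task per partition
    succeeded: int = 0                         # assumption: every successful attempt is counted
    done: set = field(default_factory=set)     # distinct partitions that have a result

    def task_succeeded(self, partition: int) -> None:
        self.succeeded += 1
        self.done.add(partition)

    def progress(self) -> str:
        return f"{self.succeeded}/{self.total_tasks}"


stage = StageCounter(total_tasks=4)

# First pass: each of the 4 partitions is computed once.
for p in range(4):
    stage.task_succeeded(p)

# Partitions 1 and 3 are processed a second time, for example by a
# speculative duplicate attempt or because their results had to be recalculated.
for p in (1, 3):
    stage.task_succeeded(p)

print(stage.progress())  # 6/4
```

Under that assumption, the number of distinct partitions with a result never exceeds the total; only the count of successful attempts does.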

Re: Information on Spark UI

2014-06-10 Thread Neville Li
We are seeing this issue as well. We run on YARN and see logs about lost executors. It looks like some stages had to be re-run to compute RDD partitions lost in the executor. We were able to complete 20 iterations with 20% full matrix but not beyond that (total > 100GB).

On Tue, Jun 10, 2014 at 8:32

Re: Information on Spark UI

2014-06-10 Thread coderxiang
The executors shown "CANNOT FIND ADDRESS" are not listed in the Executors Tab on the top of the Spark UI.