Re: confused by reduceByKey usage

2014-04-17 Thread 诺铁
Got it, thank you.

Re: confused by reduceByKey usage

2014-04-17 Thread Cheng Lian
Ah, I’m not saying println is bad, it’s just that you need to go to the right place to locate the output; e.g., you can check the stdout of any executor from the Web UI.

Re: confused by reduceByKey usage

2014-04-17 Thread 诺铁
Hi Cheng, thank you for letting me know. So what do you think is a better way to debug?

Re: confused by reduceByKey usage

2014-04-17 Thread Cheng Lian
A tip: using println is only convenient when you are working in local mode. When running Spark in cluster mode (standalone/YARN/Mesos), the output of println goes to the executors' stdout.
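A minimal sketch of a driver-side alternative, assuming the d5 RDD from this thread: take() ships a few elements back to the driver, so the println output appears right in the shell regardless of deploy mode.

scala> d5.keyBy(_.split(" ")(0)).take(3).foreach(println)  // foreach runs on the driver
(1,1 2 3 4 5)
(1,1 2 3 4 5)
(1,1 2 3 4 5)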

Re: confused by reduceByKey usage

2014-04-17 Thread 诺铁
Yeah, I got it! Using println to debug is great for me to explore Spark. Thank you very much for your kind help.

Re: confused by reduceByKey usage

2014-04-17 Thread Daniel Darabos
Here's a way to debug something like this:

scala> d5.keyBy(_.split(" ")(0)).reduceByKey((v1, v2) => {
         println("v1: " + v1)
         println("v2: " + v2)
         (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString
       }).collect

You get:

v1: 1 2 3 4 5
v2: 1 2 3 4 5
v1: 4
v…
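The printed values show why the original expression breaks: after the first reduction, v1 is the string "4", which has no second field, so the next v1.split(" ")(1) throws java.lang.ArrayIndexOutOfBoundsException. A minimal sketch of the usual fix, assuming the goal is to sum the second field per key: parse the values before reducing, so the reduce function's inputs and result have the same shape.

scala> d5.map { line => val f = line.split(" "); (f(0), f(1).toInt) }.reduceByKey(_ + _).collect
res0: Array[(String, Int)] = Array((1,6))

With the three-line 5.txt from this thread, every line has key "1" and second field 2, hence the sum 6.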

confused by reduceByKey usage

2014-04-17 Thread 诺铁
Hi, I am new to Spark. When trying to write some simple tests in the Spark shell, I ran into the following problem. I created a very small text file named 5.txt:

1 2 3 4 5
1 2 3 4 5
1 2 3 4 5

and experimented in the Spark shell:

scala> val d5 = sc.textFile("5.txt").cache()
d5: org.apache.spark.rdd.RDD[String] = Ma…
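The preview is cut off here; judging from Daniel Darabos's reply above, the session presumably continued with the expression quoted there (a reconstruction, not the verbatim message):

scala> d5.keyBy(_.split(" ")(0)).reduceByKey((v1, v2) =>
         (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString
       ).collect

This fails on the second reduction step: the first step returns "4", which cannot be split into two fields, so the job dies with java.lang.ArrayIndexOutOfBoundsException, which is the confusion the rest of the thread resolves.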