Yeah, I got it!
Using println to debug is a great way for me to explore Spark.
Thank you very much for your kind help.
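
For anyone who finds this thread later, here is a minimal sketch of why my original call failed. It uses a plain Scala List instead of an RDD so it runs without Spark. The names `lines`, `trace`, `buggy`, `failed` and `fixed` are only for illustration and were not part of my session. Spark may combine values in a different order across partitions, so this only mirrors the single-partition case quoted below.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

// Same three lines as 5.txt, held in a local List (no Spark needed).
val lines = List("1 2 3 4 5", "1 2 3 4 5", "1 2 3 4 5")

// Records every (v1, v2) pair the reduce function receives.
val trace = ListBuffer.empty[(String, String)]

// The original function: it returns a bare number such as "4",
// which has no second field when it is split on spaces.
val buggy = (v1: String, v2: String) => {
  trace += ((v1, v2))
  (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString
}

// List.reduce feeds each result back in as v1, so the second call
// sees v1 = "4" and fails on index 1. Try captures the exception.
val failed = Try(lines.reduce(buggy))

// Parsing to Int first keeps the input and output types the same.
val fixed = lines.map(_.split(" ")(1).toInt).reduce(_ + _)
```

After this runs, `trace` holds the same two calls that the println output shows, `failed` wraps the ArrayIndexOutOfBoundsException, and `fixed` is 6 (2 + 2 + 2).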



On Fri, Apr 18, 2014 at 12:54 AM, Daniel Darabos <
daniel.dara...@lynxanalytics.com> wrote:

> Here's a way to debug something like this:
>
> scala> d5.keyBy(_.split(" ")(0)).reduceByKey((v1,v2) => {
>            println("v1: " + v1)
>            println("v2: " + v2)
>            (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString
>        }).collect
>
> You get:
> v1: 1 2 3 4 5
> v2: 1 2 3 4 5
> v1: 4
> v2: 1 2 3 4 5
> java.lang.ArrayIndexOutOfBoundsException: 1
>
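> reduceByKey() works kind of like regular Scala reduce(). So it will call
> the function on the first two values, then on the result of that and the
> next value, then the result of that and the next value, and so on. First
> you add 2+2 and get "4". Then your function is called with v1="4" and v2
> set to the third line. "4".split(" ") has only one element, so index 1 is
> out of bounds. With only two lines the function is called just once,
> which is why that case worked.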
>
> What you could do instead:
>
> scala> d5.keyBy(_.split(" ")(0)).mapValues(_.split(" ")(1).toInt).reduceByKey((v1, v2) => v1 + v2).collect
>
>
> On Thu, Apr 17, 2014 at 6:29 PM, 诺铁 <noty...@gmail.com> wrote:
>
>> Hi,
>>
>> I am new to Spark. When I tried to write some simple tests in the Spark
>> shell, I ran into the following problem.
>>
>> I created a very small text file named 5.txt:
>> 1 2 3 4 5
>> 1 2 3 4 5
>> 1 2 3 4 5
>>
>> and experimented in the Spark shell:
>>
>> scala> val d5 = sc.textFile("5.txt").cache()
>> d5: org.apache.spark.rdd.RDD[String] = MappedRDD[91] at textFile at <console>:12
>>
>> scala> d5.keyBy(_.split(" ")(0)).reduceByKey((v1,v2) => (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString).first
>>
>> Then this error occurs:
>> 14/04/18 00:20:11 ERROR Executor: Exception in task ID 36
>> java.lang.ArrayIndexOutOfBoundsException: 1
>>     at $line60.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:15)
>>     at $line60.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:15)
>>     at org.apache.spark.util.collection.ExternalAppendOnlyMap$$anonfun$2.apply(ExternalAppendOnlyMap.scala:120)
>>
>> When I delete one line from the file so that it has only two lines, the
>> result is correct. I don't understand what the problem is. Please help
>> me, thanks.
>>
>>
>
