Re: MappedRDD signature

2015-01-28 Thread Sanjay Subramanian
Sent: Wednesday, January 28, 2015 11:44 AM
Subject: Re: MappedRDD signature

I think it's clear if you format your function reasonably:

    mjpJobOrderRDD.map(line => {
      val tokens = line.split("\t");
      if (tokens.length == 164 && tokens(23)

Re: MappedRDD signature

2015-01-28 Thread Sean Owen
I think it's clear if you format your function reasonably:

    mjpJobOrderRDD.map(line => {
      val tokens = line.split("\t");
      if (tokens.length == 164 && tokens(23) != null) {
        (tokens(23),tokens(7))
      }
    })

In some cases the function returns nothing, in some cases a tuple. The return type is ther
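For readers hitting the same problem: in Scala, an `if` with no `else` evaluates to `Unit` when the condition is false, so a function that sometimes yields a tuple and sometimes yields nothing is typed as the common supertype of both, which is `Any`. One common way to keep a pair type is to return an `Option` and flatten it. The sketch below is only an illustration, not code from this thread: it uses a plain `Seq[String]` in place of the RDD so it runs without Spark, and the names `PairExtraction`, `toPair`, `pairs`, `expectedCols`, `keyIdx`, and `valueIdx` are hypothetical.

```scala
object PairExtraction {
  // Hypothetical helper: returns Some(pair) for a well-formed line and None
  // otherwise, so both branches share the type Option[(String, String)].
  def toPair(line: String, expectedCols: Int, keyIdx: Int, valueIdx: Int): Option[(String, String)] = {
    val tokens = line.split("\t")
    if (tokens.length == expectedCols) Some((tokens(keyIdx), tokens(valueIdx)))
    else None
  }

  // flatMap drops the None results and unwraps the Some results, so the
  // element type is (String, String) rather than Any.
  // Assumption: a Seq stands in for the RDD from the thread.
  def pairs(lines: Seq[String], expectedCols: Int, keyIdx: Int, valueIdx: Int): Seq[(String, String)] =
    lines.flatMap(line => toPair(line, expectedCols, keyIdx, valueIdx))
}
```

With the values from the thread, the call would look like `PairExtraction.pairs(lines, 164, 23, 7)`. The same `Option` plus `flatMap` shape should carry over to an RDD, though that is untested here.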

MappedRDD signature

2015-01-28 Thread Sanjay Subramanian
hey guys

I am not following why this happens

DATASET
=======
Tab separated values (164 columns)

Spark command 1

    val mjpJobOrderRDD = sc.textFile("/data/cdr/cdr_mjp_joborder_raw")
    val mjpJobOrderColsPairedRDD = mjpJobOrderRDD.map(line => { val tokens = line.split("\t");(tokens(23),to