;user@spark.apache.org"
Sent: Wednesday, January 28, 2015 11:44 AM
Subject: Re: MappedRDD signature
I think it's clear if you format your function reasonably:
mjpJobOrderRDD.map(line => {
  val tokens = line.split("\t");
  if (tokens.length == 164 && tokens(23) != null) {
    (tokens(23),tokens(7))
  }
})
In some cases the function returns nothing, in some cases a tuple. The
return type is ther
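
To illustrate the point about the missing else branch, here is a minimal sketch using plain Scala collections in place of an RDD, so it runs without a cluster. The object name PairedColumns, the helper toPair, and the sample rows are hypothetical and not from the thread. An if without an else has type Unit on the missing branch, so the compiler widens the element type to Any. One possible fix, not necessarily the one intended in this thread, is to return an Option and use flatMap. I would expect RDD.map and RDD.flatMap to be typed the same way, but that is an assumption rather than something the sketch tests.

```scala
object PairedColumns {
  // Hypothetical helper: Some(pair) for a well-formed row, None otherwise.
  // The null check from the thread is dropped here because split never
  // returns null elements.
  def toPair(line: String): Option[(String, String)] = {
    val tokens = line.split("\t")
    if (tokens.length == 164) Some((tokens(23), tokens(7))) else None
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data: one row with 164 columns, one malformed row.
    val good = (0 until 164).map(i => s"c$i").mkString("\t")
    val bad = "too\tshort"
    val lines = Seq(good, bad)

    // No else branch: the result is either a tuple or Unit, so the
    // element type is inferred as Any.
    val widened: Seq[Any] = lines.map { line =>
      val tokens = line.split("\t")
      if (tokens.length == 164) (tokens(23), tokens(7))
    }

    // Every branch returns an Option, and flatMap drops the None rows,
    // so the element type stays (String, String).
    val pairs: Seq[(String, String)] = lines.flatMap(line => toPair(line))
    println(pairs) // prints List((c23,c7))
  }
}
```

A collect with a partial function would be another way to keep only the matching rows; the Option version is shown because it keeps the row check in one reusable function.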
hey guys
I am not following why this happens
DATASET
===
Tab separated values (164 columns)

Spark command 1
val mjpJobOrderRDD = sc.textFile("/data/cdr/cdr_mjp_joborder_raw")
val mjpJobOrderColsPairedRDD = mjpJobOrderRDD.map(line => { val tokens = line.split("\t");(tokens(23),to