Hi,
I wonder if someone can suggest a solution to my problem. I had a simple
process working using Strings and now want to convert it to use RDD[Char].
The problem is that I end up with a nested RDD call, as follows:
1) Load a text file into an RDD[Char]
val inputRDD = sc.textFile("myFile.txt").flatMap(_.toIterator)
2) I have a method that takes two parameters:
object Foo
{
def myFunction(inputRDD: RDD[Char], shift: Int) : RDD[Char] ...
3) I have a method ‘bar’ that the driver process calls once it has loaded the
inputRDD, as follows:
def bar(inputRDD: RDD[Char]) : Int = {
val solutionSet = sc.parallelize((1 to alphabetLength).toList)
  .map(shift => (shift, Foo.myFunction(inputRDD, shift)))
What I’m trying to do is take the list 1..26 and generate a set of tuples {
(1,RDD(1)), ... (26,RDD(26)) }, where each RDD(n) is the inputRDD passed
through the function above with shift parameter n.
In my original version I could parallelise the algorithm fine, but my input
had to be held in a ‘String’ variable. I’d rather it be an RDD, since the
string could be large. I think the way I’m trying to do it above won’t work
because it’s a nested RDD call: myFunction operates on inputRDD from inside a
map over another RDD.
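A minimal sketch of one non-nested arrangement, for comparison. It assumes the per-character work is a Caesar-style shift. The names `ShiftSketch` and `shiftChar` and the local-mode master are hypothetical and stand in for whatever Foo.myFunction really does. The idea is to keep the loop over shifts as an ordinary driver-side collection, so each shift gets its own transformation of inputRDD and no RDD is referenced inside another RDD's closure.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ShiftSketch {
  val alphabetLength = 26

  // Hypothetical stand-in for the per-character work in Foo.myFunction:
  // rotate lowercase letters by `shift`, leave every other character as-is.
  def shiftChar(c: Char, shift: Int): Char =
    if (c >= 'a' && c <= 'z') ((c - 'a' + shift) % alphabetLength + 'a').toChar
    else c

  // The range is a plain Scala collection on the driver, not an RDD, so each
  // inputRDD.map(...) is an ordinary top-level transformation.
  def shiftedRDDs(inputRDD: RDD[Char]): Seq[(Int, RDD[Char])] =
    (1 to alphabetLength).map { shift =>
      (shift, inputRDD.map(c => shiftChar(c, shift)))
    }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption made only so the sketch is self-contained.
    val conf = new SparkConf().setAppName("shift-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Cache the input because it is reused once per shift.
    val inputRDD = sc.textFile("myFile.txt").flatMap(_.toIterator).cache()

    val solutionSet = shiftedRDDs(inputRDD)

    // The RDDs are lazy; work only happens when an action such as take runs.
    solutionSet.foreach { case (shift, rdd) =>
      println(s"$shift -> ${rdd.take(10).mkString}")
    }

    sc.stop()
  }
}
```

The 26 jobs are still distributed across the cluster individually; only the loop that sets them up runs on the driver. If a single pass over the data matters more, an alternative would be to flatMap each character to 26 (shift, char) pairs and aggregate by key, at the cost of a shuffle.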
Can anybody suggest a solution?
Regards,
Mike Lewis