Hi everyone,

I am new to Spark and I'm having problems getting my code to compile. I have
the feeling I might be misunderstanding the functions, so I would be very
glad to get some insight into what could be wrong.

The problematic code is the following:

JavaRDD<Body> bodies = lines.map(l -> { Body b = new Body(); b.parse(l); });

JavaPairRDD<Partition, Iterable<Body>> partitions =
    bodies.mapToPair(b -> b.computePartitions(maxDistance)).groupByKey();

Partition and Body are defined inside the driver class. Body contains the
following definition:

protected Iterable<Tuple2<Partition, Body>> computePartitions(int maxDistance)

The idea is to reproduce the following schema:

The first map results in: body1, body2, ...
The mapToPair should output several pairs per body: (partition_i, body1),
(partition_i, body2), ...
These are then gathered by key: (partition_i, (body1, body_n, ...)),
(partition_i', (body2, body_n', ...)), ...
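To make sure I have the shape of the data right, here is a plain-Java analogue of the three steps using streams (no Spark involved). Body, Partition, and computePartitions below are simplified stand-ins for my real classes, and flatMap plays the role of the transformation that emits several pairs per body:

```java
import java.util.*;
import java.util.stream.*;

public class PartitionSketch {
    // Simplified stand-ins for the real Partition / Body classes.
    record Partition(int id) {}

    record Body(String name, int x) {
        // Each body can belong to several partitions: its own cell and,
        // in this toy version, the neighbouring cell when it sits on a border.
        List<Map.Entry<Partition, Body>> computePartitions(int maxDistance) {
            List<Map.Entry<Partition, Body>> pairs = new ArrayList<>();
            pairs.add(Map.entry(new Partition(x / maxDistance), this));
            if (x % maxDistance == 0 && x > 0) {
                pairs.add(Map.entry(new Partition(x / maxDistance - 1), this));
            }
            return pairs;
        }
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a 1", "b 10", "c 11");

        // Step 1 (map): line -> Body. Note the lambda body must *return* b.
        List<Body> bodies = lines.stream()
            .map(l -> {
                String[] f = l.split(" ");
                return new Body(f[0], Integer.parseInt(f[1]));
            })
            .collect(Collectors.toList());

        // Steps 2+3 (flatMap + groupBy): one body yields several
        // (partition, body) pairs, which are gathered by partition key.
        Map<Partition, List<Body>> partitions = bodies.stream()
            .flatMap(b -> b.computePartitions(10).stream())
            .collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        System.out.println("partitions=" + partitions.size());
    }
}
```

Writing it out this way, I suspect the second step in my real code should use flatMapToPair rather than mapToPair, if I understand the Java API correctly, since each body produces an Iterable of pairs rather than a single Tuple2.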

Thanks in advance.
Regards,
Silvina
