Hi,
Thanks for the prompt reply.
I checked the code. The main issue is the large number of mappers: if the
number of mappers is brought down to around 1000, the problem goes away. I
hope the bug gets fixed in an upcoming release.
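(For anyone hitting the same error, a minimal sketch of this workaround,
assuming the input RDD's partition count is what determines the number of
mappers here; the figure 1000 is the one from the message above:)

// Cap the number of map-side partitions (and hence the number of map
// output statuses the driver has to serve) at roughly 1000, rather than
// letting the input format decide.
rows = rows.coalesce(1000);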
On Mon, Jan 5, 2015 at 1:26 AM, Josh Rosen wrote:
> Ah, so I guess this *is* still an issue since we needed to use a bitmap for
> tracking zero-sized blocks (see
> https://issues.apache.org/jira/browse/SPARK-3740; this isn't just a
> performance issue; it's necessary for correctness). This will require a
> bit more effort to fix, since we'll either have to ...
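(For context, a toy illustration of the bitmap idea Josh describes above.
This is a conceptual sketch, not Spark's actual implementation; the class
and method names are made up:)

import java.util.BitSet;

class MapStatusSketch {
    // One bit per reduce partition; a set bit marks a zero-sized block.
    // A reducer can check the bit before fetching, and skipping empty
    // blocks correctly is what the bitmap in SPARK-3740 is needed for.
    static BitSet emptyBlocks(long[] blockSizes) {
        BitSet empty = new BitSet(blockSizes.length);
        for (int r = 0; r < blockSizes.length; r++) {
            if (blockSizes[r] == 0L) {
                empty.set(r);
            }
        }
        return empty;
    }
}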
I am using version 1.2.
On Sun, Jan 4, 2015 at 3:01 AM, Josh Rosen wrote:
> Which version of Spark are you using? It seems like the issue here is that
> the map output statuses are too large to fit in the Akka frame size. This
> issue has been fixed in Spark 1.2 by using a different encoding for map
> outputs for jobs with many reducers (
> https://issues.apache.org/jira/browse/ ...
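(Before the 1.2 fix, a common stopgap was to raise the Akka frame size so
the statuses fit. A sketch; the app name and the 128 MB value below are
illustrative, not recommendations:)

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// spark.akka.frameSize is in MB (default 10 in Spark 1.x) and bounds the
// size of a single Akka message, including the serialized map output
// statuses sent from the driver to the executors.
SparkConf conf = new SparkConf()
    .setAppName("CharFrequency")
    .set("spark.akka.frameSize", "128");  // example value only
JavaSparkContext sc = new JavaSparkContext(conf);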
Hi,
I am trying to get the frequency of each Unicode char in a document
collection using Spark. Here is the code snippet that does the job:
JavaPairRDD<LongWritable, Text> rows = sc.sequenceFile(args[0],
    LongWritable.class, Text.class);
rows = rows.coalesce(1);
JavaPairRDD pairs = rows.f
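The snippet above is cut off at "rows.f". For reference, a hedged sketch of
one way to finish this job on the Spark 1.2 Java API: flatMapToPair emits
one pair per Unicode code point and reduceByKey sums the counts. The output
path in args[1], the class name, and everything past "rows" are
illustrative, not the poster's original code (the coalesce(1) call is also
left out here):

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFlatMapFunction;
import scala.Tuple2;

public class CharFrequency {
    public static void main(String[] args) {
        JavaSparkContext sc =
            new JavaSparkContext(new SparkConf().setAppName("CharFrequency"));

        JavaPairRDD<LongWritable, Text> rows =
            sc.sequenceFile(args[0], LongWritable.class, Text.class);

        // Emit (codePoint, 1L) for every Unicode code point in each document.
        JavaPairRDD<Integer, Long> pairs = rows.flatMapToPair(
            new PairFlatMapFunction<Tuple2<LongWritable, Text>, Integer, Long>() {
                @Override
                public Iterable<Tuple2<Integer, Long>> call(
                        Tuple2<LongWritable, Text> row) {
                    String doc = row._2().toString();
                    List<Tuple2<Integer, Long>> out =
                        new ArrayList<Tuple2<Integer, Long>>();
                    int i = 0;
                    while (i < doc.length()) {
                        int cp = doc.codePointAt(i);
                        out.add(new Tuple2<Integer, Long>(cp, 1L));
                        i += Character.charCount(cp);
                    }
                    return out;
                }
            });

        // Sum the per-document counts into a global frequency per code point.
        JavaPairRDD<Integer, Long> freqs = pairs.reduceByKey(
            new Function2<Long, Long, Long>() {
                @Override
                public Long call(Long a, Long b) {
                    return a + b;
                }
            });

        freqs.saveAsTextFile(args[1]);
        sc.stop();
    }
}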