Here's what the console shows:
15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0, whose tasks have all completed, from pool
15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at ParquetTableOperations.scala:326) finished in 5493.549 s
15/01/01 01:12:29 INFO sche
If you've been following AMPLab Jenkins today, you'll notice that there's
been a huge number of Spark test failures in the maintenance branches and
Maven builds.
My best guess as to what's causing this is that I pushed a backport to all
maintenance branches at a moment when Jenkins was otherwise
This was not intended; can you open a JIRA?
On Tue, Dec 30, 2014 at 8:40 PM, Ted Yu wrote:
> I extracted org/apache/hadoop/hive/common/CompressionUtils.class from the
> jar and used hexdump to view the class file.
> Bytes 6 and 7 are 00 and 33, respectively.
>
> According to http://en.wikipedia.
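For context, the bytes Ted is inspecting are defined by the JVM class-file format: bytes 0-3 are the magic number 0xCAFEBABE, bytes 4-5 the minor version, and bytes 6-7 the major version, big-endian. A major version of 0x33 (51 decimal) means the class was compiled for Java 7 (50 is Java 6, 52 is Java 8). A minimal Scala sketch of the same check, without hexdump (the method name is illustrative):

```scala
import java.io.{DataInputStream, FileInputStream}

// Read the major version from a JVM .class file.
// Layout: bytes 0-3 magic 0xCAFEBABE, bytes 4-5 minor version,
// bytes 6-7 major version, all big-endian.
def classFileMajorVersion(path: String): Int = {
  val in = new DataInputStream(new FileInputStream(path))
  try {
    require(in.readInt() == 0xCAFEBABE, "not a class file")
    in.readUnsignedShort() // skip minor version
    in.readUnsignedShort() // major version: 51 (0x33) = Java 7
  } finally in.close()
}
```

Running this against the extracted CompressionUtils.class would return 51 for bytes 00 33, confirming a Java 7 target.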
-dev, +user
A decent guess: does your 'save' function entail collecting data back
to the driver? And are you running this from a machine that's not in
your Spark cluster? Then in client mode you're shipping data back to a
machine farther from the cluster than in cluster mode. That could explain
the b
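To illustrate the distinction being guessed at here, a minimal sketch (assuming a Spark application with a hypothetical RDD `results`; it needs a running SparkContext, so it is not standalone):

```scala
// Driver-side save: collect() ships every partition over the network
// to the driver JVM. In client deploy mode the driver runs on the
// submitting machine, which may sit far from the cluster, so this
// transfer can dominate the job's runtime.
val local = results.collect()   // all data moves to the driver

// Executor-side save: each partition is written from the node that
// holds it, so where the driver runs barely affects data movement.
results.saveAsTextFile("hdfs:///path/out")
```

If the job's 'save' step looks like the first pattern, a client-vs-cluster gap of the size described below would be unsurprising.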
Hi,
I have a very, very simple streaming job. When I deploy it on the exact
same cluster, with the exact same parameters, I see a big (40%) performance
difference between "client" and "cluster" deployment mode. This seems a bit
surprising. Is this expected?
The streaming job is:
val msgStre