Hi all,
I'm wondering if anybody has any experience with centralised logging for
Spark - or has even felt that there was a need for this, given the WebUI.
At my organization we use Log4j2 and Flume as the front end of our
centralised logging system. I was looking into modifying Spark to use that
syst
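(For anyone curious about the wiring: Log4j2 provides a Flume appender in its log4j-flume-ng module, so in principle a driver's or executor's logging can be pointed at a Flume agent with a configuration along these lines. This is an untested sketch; the host, port, and layout are placeholder assumptions, and since Spark ships with log4j 1.x by default, the Log4j2 jars would have to be swapped in first.)

```xml
<!-- Hypothetical log4j2.xml sketch: route log events to a Flume agent.
     Requires log4j-flume-ng on the classpath; host and port are placeholders. -->
<Configuration status="warn">
  <Appenders>
    <Flume name="FlumeAppender" compress="true" type="Avro">
      <Agent host="flume-agent.example.com" port="8800"/>
      <PatternLayout pattern="%d [%t] %-5p %c - %m%n"/>
    </Flume>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="FlumeAppender"/>
    </Root>
  </Loggers>
</Configuration>
```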
Hi,
Environment: Spark standalone cluster running with a master and a worker on a
small Vagrant VM. The Jetty Webapp on the same node calls the spark-submit
script to start the job.
From the contents of the stdout I can see that it's running successfully.
However, the spark-submit process never see
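A minimal sketch of the capture side, assuming the webapp shells out to spark-submit; the paths, master URL, and class name below are placeholders, not details from the original setup. It's shown in Python for brevity; in a Java webapp the same idea applies via ProcessBuilder, or via Spark's own SparkLauncher API in newer releases.

```python
# Hypothetical sketch: launch a command (e.g. spark-submit) and capture its
# exit status and stdout, so the calling webapp can tell whether the job ran.
import subprocess

def run_spark_job(args):
    """Run a command to completion, returning (exit_code, stdout_text)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout

# Illustrative invocation (placeholder paths and class name; not executed here):
# code, out = run_spark_job([
#     "/opt/spark/bin/spark-submit",
#     "--master", "spark://master:7077",
#     "--class", "com.example.MyJob",   # hypothetical main class
#     "/opt/jobs/my-job.jar",
# ])
```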
encies are also causing issues with which Jetty
libraries are available in the classloader from Spark and whether they clash
with existing libraries we have.
More anon,
Cheers,
Edward
Original Message
Subject: Re: spark 1.3.1 jars in repo1.maven.org
Date: 2015-05-20 00:38
From: Sea
Hi,
I'd like to confirm an observation I've just made: specifically, that Spark
is only published to repo1.maven.org against one Hadoop variant.
The Spark source can be compiled against a number of different Hadoop
versions using profiles. Yay.
However, the spark jars in repo1.maven.org appear to be compiled a
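For reference, the profile-based build mentioned above looks roughly like this; the profile name and Hadoop version are examples only and vary between Spark releases, so check the build documentation for the release in question:

```shell
# Sketch: compiling Spark against a chosen Hadoop version via Maven profiles.
# The profile name and hadoop.version shown here are illustrative.
build/mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
```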
Hi all,
This might be a question to be answered, or feedback for a possible new
feature, depending:
We have source data which is events about the state changes of an entity
(identified by an ID) represented as nested JSON.
We wanted to sessionize this data so that we had a collection of all the
even
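To make the intent concrete, here is a minimal sketch of sessionizing by entity ID using plain Python structures; in Spark itself this would be something like an `rdd.groupBy` or a DataFrame `groupBy` on the ID field. The event shape here is a made-up stand-in for the nested JSON.

```python
# Hypothetical sketch: collect state-change events into per-entity sessions,
# grouped by ID and ordered by timestamp. Field names are illustrative.
from collections import defaultdict

def sessionize(events):
    """Group events by entity ID; sort each entity's events by timestamp."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["id"]].append(event)
    for evs in sessions.values():
        evs.sort(key=lambda e: e["ts"])
    return dict(sessions)

events = [
    {"id": "42", "ts": 2, "state": "updated"},
    {"id": "42", "ts": 1, "state": "created"},
    {"id": "99", "ts": 1, "state": "created"},
]
sessions = sessionize(events)
print([e["state"] for e in sessions["42"]])  # ['created', 'updated']
```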
Hi all,
This feels like a dumb question but bespeaks my lack of understanding: what
is the Spark thrift-server for? Especially if there's an existing Hive
installation.
Background:
We want to use Spark to do some processing starting from files (probably in
MapRFS). We want to be able to read the r