Hi all,
I'm wondering if anybody has any experience with centralised logging for
Spark - or has even felt that there was a need for this, given the WebUI.
At my organization we use Log4j2 and Flume as the front end of our
centralised logging system. I was looking into modifying Spark to use that
syst
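(For anyone curious about the wiring: Log4j2 provides a Flume appender in its log4j-flume-ng module, so in principle a driver's or executor's logging can be pointed at a Flume agent with a configuration along these lines. This is an untested sketch; the host, port, and layout are placeholder assumptions, and since Spark ships with log4j 1.x by default, the Log4j2 jars would have to be swapped in first.)

```xml
<!-- Hypothetical log4j2.xml sketch: route log events to a Flume agent.
     Requires log4j-flume-ng on the classpath; host and port are placeholders. -->
<Configuration status="warn">
  <Appenders>
    <Flume name="FlumeAppender" compress="true" type="Avro">
      <Agent host="flume-agent.example.com" port="8800"/>
      <PatternLayout pattern="%d [%t] %-5p %c - %m%n"/>
    </Flume>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="FlumeAppender"/>
    </Root>
  </Loggers>
</Configuration>
```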
Hi,
Environment: Spark standalone cluster running with a master and a worker on a
small Vagrant VM. The Jetty Webapp on the same node calls the spark-submit
script to start the job.
From the contents of the stdout I can see that it's running successfully.
However, the spark-submit process never see
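A minimal sketch of the capture side, assuming the webapp shells out to spark-submit; the paths, master URL, and class name below are placeholders, not details from the original setup. It's shown in Python for brevity; in a Java webapp the same idea applies via ProcessBuilder, or via Spark's own SparkLauncher API in newer releases.

```python
# Hypothetical sketch: launch a command (e.g. spark-submit) and capture its
# exit status and stdout, so the calling webapp can tell whether the job ran.
import subprocess

def run_spark_job(args):
    """Run a command to completion, returning (exit_code, stdout_text)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout

# Illustrative invocation (placeholder paths and class name; not executed here):
# code, out = run_spark_job([
#     "/opt/spark/bin/spark-submit",
#     "--master", "spark://master:7077",
#     "--class", "com.example.MyJob",   # hypothetical main class
#     "/opt/jobs/my-job.jar",
# ])
```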
encies are also causing issues with which Jetty
libraries are available in the classloader from Spark and whether they clash
with existing libraries we have.
More anon,
Cheers,
Edward
Original Message
Subject: Re: spark 1.3.1 jars in repo1.maven.org
Date: 2015-05-20 00:38
From: Sea
Hi,
I'd like to confirm an observation I've just made: specifically, that Spark
is only published to repo1.maven.org against one Hadoop variant.
The Spark source can be compiled against a number of different Hadoop
versions using profiles. Yay.
However, the spark jars in repo1.maven.org appear to be compiled a
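For reference, the profile-based build mentioned above looks roughly like this; the profile name and Hadoop version are examples only and vary between Spark releases, so check the build documentation for the release in question:

```shell
# Sketch: compiling Spark against a chosen Hadoop version via Maven profiles.
# The profile name and hadoop.version shown here are illustrative.
build/mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
```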
Hi all,
This might be a question to be answered, or feedback for a possible new
feature, depending:
We have source data which is events about the state changes of an entity
(identified by an ID) represented as nested JSON.
We wanted to sessionize this data so that we had a collection of all the
even
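To make the intent concrete, here is a minimal sketch of sessionizing by entity ID using plain Python structures; in Spark itself this would be something like an `rdd.groupBy` or a DataFrame `groupBy` on the ID field. The event shape here is a made-up stand-in for the nested JSON.

```python
# Hypothetical sketch: collect state-change events into per-entity sessions,
# grouped by ID and ordered by timestamp. Field names are illustrative.
from collections import defaultdict

def sessionize(events):
    """Group events by entity ID; sort each entity's events by timestamp."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["id"]].append(event)
    for evs in sessions.values():
        evs.sort(key=lambda e: e["ts"])
    return dict(sessions)

events = [
    {"id": "42", "ts": 2, "state": "updated"},
    {"id": "42", "ts": 1, "state": "created"},
    {"id": "99", "ts": 1, "state": "created"},
]
sessions = sessionize(events)
print([e["state"] for e in sessions["42"]])  # ['created', 'updated']
```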
Hi all,
This feels like a dumb question but bespeaks my lack of understanding: what
is the Spark thrift-server for? Especially if there's an existing Hive
installation.
Background:
We want to use Spark to do some processing starting from files (probably in
MapRFS). We want to be able to read the r