I have two hosts 192.168.1.15 (Master) and 192.168.1.16 (Worker)
These two hosts have exchanged public keys, so they have passwordless SSH
access to each other.
But when I do /sbin/start-all.sh from 192.168.1.15 I still get
192.168.1.16: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Any thoughts?
Thanks Akhil, yes that works fine, it just lets me straight in.
On Mon, Jun 8, 2015 at 11:58 AM, Akhil Das
wrote:
> Can you do *ssh -v 192.168.1.16* from the Master machine and make sure
> it's able to log in without a password?
>
> Thanks
> Best Regards
>
> On Mon, Jun 8, 2
Hi All,
Which build of Spark is best when using Kafka?
Regards
jk
se "Reply to all". If you're not including the mailing
> list in the response, I'm the only one who will get your message.
>
> Regards,
> Jeff
>
> 2015-03-18 10:49 GMT+01:00 James King :
>
>> Any sub-category recommendations: Hadoop, MapR, CDH?
>>
>
ealed no issues.
>
> - khanderao
>
>
>
> > On Mar 18, 2015, at 2:38 AM, James King wrote:
> >
> > Hi All,
> >
> > Which build of Spark is best when using Kafka?
> >
> > Regards
> > jk
>
Hello All,
I'm using Spark for streaming but I'm unclear on which implementation
language to use: Java, Scala or Python.
I don't know anything about Python, am familiar with Scala, and have been
doing Java for a long time.
I think the above shouldn't influence my decision on which language to use
be
ee a
> good part of it, but recognize that it can keep the most complex Scala
> constructions out of your code)
>
>
>
> On Thu, Mar 19, 2015 at 3:50 PM, James King wrote:
>
>> Hello All,
>>
>> I'm using Spark for streaming but I'm unclear on
Many thanks all for the good responses, appreciated.
On Thu, Mar 19, 2015 at 8:36 AM, James King wrote:
> Thanks Khanderao.
>
> On Wed, Mar 18, 2015 at 7:18 PM, Khanderao Kand Gmail <
> khanderao.k...@gmail.com> wrote:
>
>> I have used various version of spark (1.0
I'm trying to run the Java NetworkWordCount example against a simple Spark
standalone runtime of one master and one worker.
But it doesn't seem to work: the text entered on the Netcat data server is
not being picked up and printed to the Eclipse console output.
However, if I use conf.setMaster("local
have a minimum of 2 cores, 1 for receiving
> your data and the other for processing. So when you say local[2] it
> basically initializes 2 threads on your local machine, 1 for receiving data
> from the network and the other for your word count processing.
>
> Thanks
> Best Regards
>
>
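For reference, a minimal sketch of the Java NetworkWordCount setup being
discussed, assuming Spark 1.3's Java streaming API; the Netcat host and port
are placeholders matching a server started with nc -lk 9999:

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class NetworkWordCountSketch {
  public static void main(String[] args) throws Exception {
    // local[2]: one thread for the socket receiver, one for processing,
    // which is why local[1] appears to receive nothing.
    SparkConf conf = new SparkConf()
        .setAppName("NetworkWordCountSketch")
        .setMaster("local[2]");
    JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(1));

    JavaReceiverInputDStream<String> lines = ssc.socketTextStream("localhost", 9999);
    JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
      @Override
      public Iterable<String> call(String line) {
        return Arrays.asList(line.split(" "));
      }
    });
    words.countByValue().print();

    ssc.start();
    ssc.awaitTermination();
  }
}

Against a standalone master the same code would use .setMaster("spark://...")
and the cluster would need at least two free cores for the same reason.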
I have a simple setup/runtime of Kafka and Spark.
I have a command line consumer displaying arrivals to the Kafka topic, so I
know messages are being received.
But when I try to read from the Kafka topic I get no messages; here are some
logs below.
I'm thinking there aren't enough threads. How do I chec
e make sure that you have given more cores than the number of receivers.
>
>
>
>
> *From:* James King
> *Date:* 2015-04-01 15:21
> *To:* user
> *Subject:* Spark + Kafka
> I have a simple setup/runtime of Kafka and Spark.
>
> I have a command line consumer displaying arriv
t Spark
> Streaming keeps receiving data from sources like Kafka.
>
>
> 2015-04-01 16:18 GMT+08:00 James King :
>
>> Thank you bit1129,
>>
>> From looking at the web UI I can see 2 cores
>>
>> Also looking at http://spark.apache.org/docs/1.2.1/configu
        .getSimpleName())
        .setMaster(master);
    JavaStreamingContext ssc = new JavaStreamingContext(sparkConf,
        Durations.seconds(duration));
    return ssc;
}
On Wed, Apr 1, 2015 at 11:37 AM, James King wrote:
> Thanks Saisai,
>
> Sure will do.
>
> But just a quick note that when I set master as "
I'm reading a stream of string lines that are in JSON format.
I'm using Java with Spark.
Is there a way to get this from a transformation, so that I end up with a
stream of JSON objects?
I would also welcome any feedback about this approach or alternative
approaches.
thanks
jk
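One way to do this (a sketch; Jackson is just one JSON library and the Event
POJO below is hypothetical, its fields would need to match the JSON keys in
the stream):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.api.java.JavaDStream;

public class JsonParseSketch {

  // Hypothetical POJO; public fields are picked up by Jackson's defaults.
  public static class Event implements java.io.Serializable {
    public String id;
    public long timestamp;
  }

  // Turn a stream of JSON text lines into a stream of Event objects.
  public static JavaDStream<Event> toEvents(JavaDStream<String> jsonLines) {
    return jsonLines.map(new Function<String, Event>() {
      @Override
      public Event call(String json) throws Exception {
        // Creating the ObjectMapper per record keeps the function trivially
        // serializable; a static mapper or mapPartitions would be cheaper.
        return new ObjectMapper().readValue(json, Event.class);
      }
    });
  }
}

The map transformation runs on the executors, so anything used inside call()
has to be serializable or created inside the function.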
Any idea what this means? Many thanks.
==>
logs/spark-.-org.apache.spark.deploy.worker.Worker-1-09.out.1
<==
15/04/13 07:07:22 INFO Worker: Starting Spark worker 09:39910 with 4
cores, 6.6 GB RAM
15/04/13 07:07:22 INFO Worker: Running Spark version 1.3.0
15/04/13 07:07:22 INFO Worke
Is there a good resource that explains how Spark jobs get broken down into
tasks and executed?
I just need to get a better understanding of this.
Regards
j
, also the
> paper on Dryad is a good one.
>
>
>
> Thanks
>
> Jerry
>
>
>
> *From:* James King [mailto:jakwebin...@gmail.com]
> *Sent:* Friday, April 17, 2015 3:26 PM
> *To:* user
> *Subject:* Spark Directed Acyclic Graph / Jobs
>
>
>
>
In the web UI I can see some jobs marked as 'skipped'. What does that mean?
Why are these jobs skipped? Do they ever get executed?
Regards
jk
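A small illustration of when this happens (a sketch, not from the thread):
when a job needs a shuffle map stage whose output already exists from an
earlier job, that stage is listed as skipped because its result is reused
rather than recomputed.

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SkippedStagesSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("SkippedStagesSketch").setMaster("local[2]"));

    JavaRDD<String> words = sc.parallelize(Arrays.asList("a", "b", "a", "c"));
    JavaPairRDD<String, Integer> pairs = words.mapToPair(w -> new Tuple2<>(w, 1));
    JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);

    counts.count();   // job 1: runs the shuffle map stage and the result stage
    counts.collect(); // job 2: the map stage's shuffle output already exists,
                      // so the UI shows that stage's work as skipped

    sc.stop();
  }
}

So a skipped stage's work was already done (or its output was otherwise
available); it simply isn't executed again for the current job.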
I'm trying to write some unit tests for my Spark code.
I need to pass a JavaPairDStream to my spark class.
Is there a way to create a JavaPairDStream using Java API?
Also, is there a good resource that covers an approach (or approaches) for
unit testing using Java?
Regards
jk
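One way to build such a stream in a test (a sketch, assuming Spark 1.3's Java
API; the data and batch interval are arbitrary) is queueStream, which turns a
queue of in-memory RDDs into a DStream, followed by mapToPair:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.Queue;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class PairDStreamFixture {

  // Build a JavaPairDStream from in-memory data; each queued RDD becomes one batch.
  public static JavaPairDStream<String, Integer> fixture(JavaStreamingContext ssc) {
    JavaRDD<Tuple2<String, Integer>> rdd = ssc.sparkContext().parallelize(
        Arrays.asList(new Tuple2<String, Integer>("a", 1),
                      new Tuple2<String, Integer>("b", 2)));
    Queue<JavaRDD<Tuple2<String, Integer>>> queue =
        new LinkedList<JavaRDD<Tuple2<String, Integer>>>();
    queue.add(rdd);

    JavaDStream<Tuple2<String, Integer>> stream = ssc.queueStream(queue);
    JavaPairDStream<String, Integer> pairs = stream.mapToPair(t -> t); // identity pair function
    return pairs;
  }

  public static void main(String[] args) throws Exception {
    JavaStreamingContext ssc = new JavaStreamingContext(
        new SparkConf().setAppName("PairDStreamFixture").setMaster("local[2]"),
        Durations.seconds(1));
    fixture(ssc).print();
    ssc.start();
    Thread.sleep(3000); // let a couple of batches run, then shut down
    ssc.stop();
  }
}

In a JUnit test the same fixture method can feed whatever class is under test,
with results collected via foreachRDD and asserted on.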
ming
>
> -
> http://www.slideshare.net/databricks/strata-sj-everyday-im-shuffling-tips-for-writing-better-spark-programs
>
> --
> Emre Sevinç
> http://www.bigindustries.be/
>
>
> On Tue, Apr 21, 2015 at 1:26 PM, James King wrote:
>
>> I'm trying to write some unit t
What's the best way to start up a Spark job as part of starting up the
Spark cluster?
I have a single uber jar for my job and want to make the start-up as easy
as possible.
Thanks
jk
Is there a good resource that covers what kind of chatter (communication)
goes on between the driver, master and worker processes?
Thanks
I'm trying to find out how to set up a resilient Spark cluster.
Things I'm thinking about include:
- How to start multiple masters on different hosts?
- There isn't a conf/masters file from what I can see.
Thank you.
http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Fri, Apr 24, 2015 at 5:01 AM, James King wrote:
>
>> I'm trying to find out how to setup a resilient Spark cluster.
>>
>> Things I'm thinking about i
If I have 5 nodes and I wish to maintain 1 Master and 2 Workers on each
node, then in total I will have 5 Masters and 10 Workers.
Now, to maintain that setup I would like to query Spark regarding the number
of Masters and Workers that are currently available using API calls and then
take some appropriate
documentation thoroughly.
>
> Best
> Ayan
>
> On Sun, Apr 26, 2015 at 6:31 PM, James King wrote:
>
>> If I have 5 nodes and I wish to maintain 1 Master and 2 Workers on each
>> node, so in total I will have 5 master and 10 Workers.
>>
>> Now to maintain
change unexpectedly
> between versions, but you might find it helpful.
>
> Nick
>
> On Sun, Apr 26, 2015 at 9:46 AM michal.klo...@gmail.com
> <michal.klo...@gmail.com> wrote:
>
>> Not sure if there's a spark native way but we've been using consul for
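As a sketch of one non-consul route (treat the URL, port and response shape
as assumptions to verify against your deployment): the standalone master's web
UI typically serves a JSON summary at /json listing the master state and its
workers, which can be polled and parsed:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class MasterStatusCheck {
  public static void main(String[] args) throws Exception {
    // Assumed endpoint: the standalone master web UI's JSON view.
    URL url = new URL("http://master01:8080/json");
    StringBuilder body = new StringBuilder();
    try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line).append('\n');
      }
    }
    // The response carries the master's status and a list of workers with
    // their state, cores and memory; parse it with any JSON library and
    // count the entries that are ALIVE before taking corrective action.
    System.out.println(body);
  }
}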
I renamed spark-defaults.conf.template to spark-defaults.conf
and invoked:
spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
But I still get:
failed to launch org.apache.spark.deploy.worker.Worker:
--properties-file FILE Path to a custom Spark properties file.
Defaul
IR.
>
> On Mon, Apr 27, 2015 at 12:56 PM James King wrote:
>
>> I renamed spark-defaults.conf.template to spark-defaults.conf
>> and invoked
>>
>> spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
>>
>> But I still get
>>
>> failed to launch org.apa
explicitly
Shouldn't Spark just consult ZK and use the active master?
Or is ZK only used during failure?
On Mon, Apr 27, 2015 at 1:53 PM, James King wrote:
> Thanks.
>
> I've set SPARK_HOME and SPARK_CONF_DIR appropriately in .bash_profile
>
> But when I start worker
I have multiple masters running and I'm trying to submit an application
using
spark-1.3.0-bin-hadoop2.4/bin/spark-submit
with this config (i.e. a comma-separated list of master URLs):
--master spark://master01:7077,spark://master02:7077
But I'm getting this exception:
Exceptio
one.html#standby-masters-with-zookeeper
>
> Thanks
> M
>
>
> On Apr 28, 2015, at 8:13 AM, James King wrote:
>
> I have multiple masters running and I'm trying to submit an application
> using
>
> spark-1.3.0-bin-hadoop2.4/bin/spark-submit
>
I'm unclear why I'm getting this exception.
It seems to have recognized that I want to enable event logging but is
ignoring where I want it to log to, i.e. file:/opt/cb/tmp/spark-events, which
does exist.
spark-default.conf
# Example:
spark.master spark://master1:7077,master2:7077
ntLog.dir", it will be "/tmp/spark-events". And this folder does
> not exist.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-04-29 23:22 GMT-07:00 James King :
>
> I'm unclear why I'm getting this exception.
>>
>> It seems to have realized that
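A minimal sketch of the relevant settings (the same keys can go in
spark-defaults.conf; the paths below are the ones from the message above, and
the directory must already exist on whatever filesystem the URI points to):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class EventLogSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("EventLogSketch")
        .setMaster("local[2]")
        // If spark.eventLog.dir is not picked up, Spark falls back to the
        // default /tmp/spark-events, which is the failure described above.
        .set("spark.eventLog.enabled", "true")
        .set("spark.eventLog.dir", "file:/opt/cb/tmp/spark-events");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.stop();
  }
}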
In the O'Reilly book Learning Spark, Chapter 10, section 24/7 Operation,
it talks about 'Receiver Fault Tolerance'.
I'm unsure of what a Receiver is here; from reading, it sounds like when you
submit an application to the cluster in cluster mode, i.e. *--deploy-mode
cluster*, the driver program will run
Many thanks all, your responses have been very helpful. Cheers
On Wed, May 6, 2015 at 2:14 PM, ayan guha wrote:
>
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics
>
>
> On Wed, May 6, 2015 at 10:09 PM, James King wrote:
>
>>
I submitted a Spark Application in cluster mode and now every time I stop
the cluster and restart it the job resumes execution.
I even killed a daemon called DriverWrapper; it stops the app but it resumes
again.
How can I stop this application from running?
an use the kill command in spark-submit to
> shut it down. You’ll need the driver id from the Spark UI or from when you
> submitted the app.
>
> spark-submit --master spark://master:7077 --kill
>
> Thanks,
> Silvio
>
> From: James King
> Date: Wednesday, May 6, 2015 a
I have two hosts, host01 and host02 (let's call them that).
I run one Master and two Workers on host01
I also run one Master and two Workers on host02
Now I have 1 LIVE Master on host01 and a STANDBY Master on host02
The LIVE Master is aware of all Workers in the cluster
Now I submit a Spark application
BTW I'm using Spark 1.3.0.
Thanks
On Fri, May 8, 2015 at 5:22 PM, James King wrote:
> I have two hosts host01 and host02 (lets call them)
>
> I run one Master and two Workers on host01
> I also run one Master and two Workers on host02
>
> Now I have 1 LIVE Master on host
Why does this not work?
./spark-1.3.0-bin-hadoop2.4/bin/spark-submit --class SomeApp --deploy-mode
cluster --supervise --master spark://host01:7077,host02:7077 Some.jar
With exception:
Caused by: java.lang.NumberFormatException: For input string:
"7077,host02:7077"
It seems to accept only one ma
eper then you should set
> your master URL to be
>
> spark://host01:7077,host02:7077
>
> And the property spark.deploy.recoveryMode=ZOOKEEPER
>
> See here for more info:
> http://spark.apache.org/docs/latest/spark-standalone.html#standby-masters-with-zookeeper
>
>
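A sketch of the working form from the Java side (the host names are
placeholders); the same single spark:// scheme followed by comma-separated
host:port pairs is what gets passed to spark-submit --master:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MultiMasterSketch {
  public static void main(String[] args) {
    // One spark:// scheme, then comma-separated host:port pairs; repeating
    // the scheme (spark://host01:7077,spark://host02:7077) is what produces
    // the NumberFormatException above.
    SparkConf conf = new SparkConf()
        .setAppName("MultiMasterSketch")
        .setMaster("spark://host01:7077,host02:7077");
    JavaSparkContext sc = new JavaSparkContext(conf);
    System.out.println("Using master: " + conf.get("spark.master"));
    sc.stop();
  }
}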
I know that it is possible to use ZooKeeper and the file system (not for
production use) to achieve HA.
Are there any other options now or in the near future?
Thanks Akhil,
I'm using Spark in standalone mode so I guess Mesos is not an option here.
On Tue, May 12, 2015 at 1:27 PM, Akhil Das
wrote:
> Mesos has a HA option (of course it includes zookeeper)
>
> Thanks
> Best Regards
>
> On Tue, May 12, 2015 at 4:53 PM, James K
What I want is: if the driver dies for some reason and is restarted, I
want to read only messages that arrived in Kafka following the restart of
the driver program and re-connection to Kafka.
Has anyone done this? any links or resources that can help explain this?
Regards
jk
est Regards
>
> On Tue, May 12, 2015 at 5:15 PM, James King wrote:
>
>> What I want is if the driver dies for some reason and it is restarted I
>> want to read only messages that arrived into Kafka following the restart of
>> the driver program and re-connection to Ka
y-once/blob/master/blogpost.md
>
> If for some reason you're stuck using an earlier version of spark, you can
> accomplish what you want simply by starting the job using a new consumer
> group (there will be no prior state in zookeeper, so it will start
> consuming according to aut
May 12, 2015 at 9:01 AM, James King wrote:
>
>> Thanks Cody.
>>
>> Here are the events:
>>
>> - Spark app connects to Kafka first time and starts consuming
>> - Messages 1 - 10 arrive at Kafka then Spark app gets them
>> - Now driver dies
>> - Messa
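For reference, a sketch of the direct approach with the parameters being
discussed (assuming Spark 1.3's spark-streaming-kafka module; the broker hosts
and topic name are placeholders, and "largest" is only the right
auto.offset.reset if there are no checkpointed offsets and only post-restart
messages are wanted):

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class DirectKafkaSketch {
  public static void main(String[] args) throws Exception {
    JavaStreamingContext jssc = new JavaStreamingContext(
        new SparkConf().setAppName("DirectKafkaSketch").setMaster("local[2]"),
        Durations.seconds(5));

    HashMap<String, String> kafkaParams = new HashMap<String, String>();
    // The direct approach talks to the brokers rather than ZooKeeper,
    // hence metadata.broker.list.
    kafkaParams.put("metadata.broker.list", "broker01:9092,broker02:9092");
    // With no stored offsets, "largest" starts from the latest offsets,
    // i.e. only messages arriving after this (re)start are read.
    kafkaParams.put("auto.offset.reset", "largest");

    HashSet<String> topics = new HashSet<String>(Arrays.asList("cb_topic"));

    JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);
    messages.print();

    jssc.start();
    jssc.awaitTermination();
  }
}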
From: http://spark.apache.org/docs/latest/streaming-kafka-integration.html
I'm trying to use the direct approach to read messages from Kafka.
Kafka is running as a cluster and configured with Zookeeper.
On the above page it mentions:
"In the Kafka parameters, you must specify either *metadata.
I'm trying the Kafka direct approach (for consuming) but when I use only this
config:
kafkaParams.put("group.id", groupdid);
kafkaParams.put("zookeeper.connect", zookeeperHostAndPort + "/cb_kafka");
I get this
Exception in thread "main" org.apache.spark.SparkException: Must specify
metadata.broker.lis
okers in pre-existing Kafka
> project APIs. I don't know why the Kafka project chose to use 2 different
> configuration keys.
>
> On Wed, May 13, 2015 at 5:00 AM, James King wrote:
>
>> From:
>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>
Looking at Consumer Configs in
http://kafka.apache.org/documentation.html#consumerconfigs
The properties *metadata.broker.list* and *bootstrap.servers* are not
mentioned.
Do I need these on the consume side?
On Wed, May 13, 2015 at 3:52 PM, James King wrote:
> Many thanks Cody
ood to go.
>
>
> On Wed, May 13, 2015 at 9:03 AM, James King wrote:
>
>> Looking at Consumer Configs in
>> http://kafka.apache.org/documentation.html#consumerconfigs
>>
>> The properties *metadata.broker.list* or *bootstrap.servers *are not
>> mention
I understand that this port value is randomly selected.
Is there a way to enforce which Spark port a Worker should use?
Indeed, many thanks.
On Wednesday, 13 May 2015, Cody Koeninger wrote:
> I believe most ports are configurable at this point, look at
>
> http://spark.apache.org/docs/latest/configuration.html
>
> search for ".port"
>
> On Wed, May 13, 2015 at 9:38 AM, James Ki
ue (to use) to the worker when it hands it a task to do?
Also, the property spark.executor.port is different from the Worker's
Spark port; how can I make the Worker run on a specific port?
Regards
jk
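A sketch of the application-side port properties (the port numbers are
placeholder values); the Worker daemon's own port is not an application
property and is fixed when the worker process is launched, e.g. via
SPARK_WORKER_PORT in spark-env.sh:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class PortConfigSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("PortConfigSketch")
        .setMaster("spark://host01:7077,host02:7077")
        // Executor- and driver-side ports are application properties in Spark 1.x:
        .set("spark.executor.port", "51000")
        .set("spark.driver.port", "51100");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.stop();
  }
}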
On Wed, May 13, 2015 at 7:51 PM, James King wrote:
> Indeed, many thanks.
>
>
> On
you modify executor properties through a context.
>
> So, master != driver and executor != worker.
>
> Best
> Ayan
>
> On Fri, May 15, 2015 at 7:52 PM, James King wrote:
>
>> So I'm using code like this to use specific ports:
>>
>> val con