I have two hosts 192.168.1.15 (Master) and 192.168.1.16 (Worker)
These two hosts have exchanged public keys, so they have passwordless SSH
access to each other.
But when I do /sbin/start-all.sh from 192.168.1.15 I still get
192.168.1.16: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Any thoughts?
Thanks Akhil, yes that works fine, it just lets me straight in.
On Mon, Jun 8, 2015 at 11:58 AM, Akhil Das
wrote:
> Can you do *ssh -v 192.168.1.16* from the Master machine and make sure
> it's able to log in without a password?
>
> Thanks
> Best Regards
>
> On Mon, Jun 8, 2
Hi All,
Which build of Spark is best when using Kafka?
Regards
jk
se "Reply to all". If you're not including the mailing
> list in the response, I'm the only one who will get your message.
>
> Regards,
> Jeff
>
> 2015-03-18 10:49 GMT+01:00 James King :
>
>> Any sub-category recommendations: Hadoop, MapR, CDH?
>>
>
ealed no issues.
>
> - khanderao
>
>
>
> > On Mar 18, 2015, at 2:38 AM, James King wrote:
> >
> > Hi All,
> >
> > Which build of Spark is best when using Kafka?
> >
> > Regards
> > jk
>
Hello All,
I'm using Spark for streaming but I'm unclear on which implementation
language to use: Java, Scala or Python.
I don't know anything about Python, am familiar with Scala, and have been
doing Java for a long time.
I think the above shouldn't influence my decision on which language to use
be
ee a
> good part of it, but recognize that it can keep the most complex Scala
> constructions out of your code)
>
>
>
> On Thu, Mar 19, 2015 at 3:50 PM, James King wrote:
>
>> Hello All,
>>
>> I'm using Spark for streaming but I'm unclear on
Many thanks all for the good responses, appreciated.
On Thu, Mar 19, 2015 at 8:36 AM, James King wrote:
> Thanks Khanderao.
>
> On Wed, Mar 18, 2015 at 7:18 PM, Khanderao Kand Gmail <
> khanderao.k...@gmail.com> wrote:
>
>> I have used various version of spark (1.0
I'm trying to run the Java NetworkWordCount example against a simple Spark
standalone runtime of one master and one worker.
But it doesn't seem to work: the text entered on the Netcat data server is
not being picked up and printed to the Eclipse console output.
However, if I use conf.setMaster("local
have a minimum of 2 cores, 1 for receiving
> your data and the other for processing. So when you say local[2] it
> basically initializes 2 threads on your local machine, 1 for receiving data
> from the network and the other for your word count processing.
>
> Thanks
> Best Regards
>
>
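For reference, a minimal sketch of the Java NetworkWordCount setup being
discussed, assuming Spark 1.3's Java streaming API; the Netcat host and port
are placeholders matching a server started with nc -lk 9999:

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class NetworkWordCountSketch {
  public static void main(String[] args) throws Exception {
    // local[2]: one thread for the socket receiver, one for processing,
    // which is why local[1] appears to receive nothing.
    SparkConf conf = new SparkConf()
        .setAppName("NetworkWordCountSketch")
        .setMaster("local[2]");
    JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(1));

    JavaReceiverInputDStream<String> lines = ssc.socketTextStream("localhost", 9999);
    JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
      @Override
      public Iterable<String> call(String line) {
        return Arrays.asList(line.split(" "));
      }
    });
    words.countByValue().print();

    ssc.start();
    ssc.awaitTermination();
  }
}

Against a standalone master the same code would use .setMaster("spark://...")
and the cluster would need at least two free cores for the same reason.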
I have a simple setup/runtime of Kafka and Spark.
I have a command line consumer displaying arrivals to the Kafka topic, so I
know messages are being received.
But when I try to read from the Kafka topic I get no messages; here are some
logs below.
I'm thinking there aren't enough threads. How do I chec
e make sure that you have given more cores than the number of receivers.
>
>
>
>
> *From:* James King
> *Date:* 2015-04-01 15:21
> *To:* user
> *Subject:* Spark + Kafka
> I have a simple setup/runtime of Kafka and Spark.
>
> I have a command line consumer displaying arriv
t Spark
> Streaming keeps receiving data from sources like Kafka.
>
>
> 2015-04-01 16:18 GMT+08:00 James King :
>
>> Thank you bit1129,
>>
>> From looking at the web UI I can see 2 cores
>>
>> Also looking at http://spark.apache.org/docs/1.2.1/configu
        .getSimpleName())
        .setMaster(master);
    JavaStreamingContext ssc = new JavaStreamingContext(sparkConf,
        Durations.seconds(duration));
    return ssc;
}
On Wed, Apr 1, 2015 at 11:37 AM, James King wrote:
> Thanks Saisai,
>
> Sure will do.
>
> But just a quick note that when I set master as "
I'm reading a stream of string lines that are in JSON format.
I'm using Java with Spark.
Is there a way to get this from a transformation, so that I end up with a
stream of JSON objects?
I would also welcome any feedback about this approach or alternative
approaches.
thanks
jk
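One way to do this (a sketch; Jackson is just one JSON library and the Event
POJO below is hypothetical, its fields would need to match the JSON keys in
the stream):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.api.java.JavaDStream;

public class JsonParseSketch {

  // Hypothetical POJO; public fields are picked up by Jackson's defaults.
  public static class Event implements java.io.Serializable {
    public String id;
    public long timestamp;
  }

  // Turn a stream of JSON text lines into a stream of Event objects.
  public static JavaDStream<Event> toEvents(JavaDStream<String> jsonLines) {
    return jsonLines.map(new Function<String, Event>() {
      @Override
      public Event call(String json) throws Exception {
        // Creating the ObjectMapper per record keeps the function trivially
        // serializable; a static mapper or mapPartitions would be cheaper.
        return new ObjectMapper().readValue(json, Event.class);
      }
    });
  }
}

The map transformation runs on the executors, so anything used inside call()
has to be serializable or created inside the function.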
Any idea what this means? Many thanks.
==>
logs/spark-.-org.apache.spark.deploy.worker.Worker-1-09.out.1
<==
15/04/13 07:07:22 INFO Worker: Starting Spark worker 09:39910 with 4
cores, 6.6 GB RAM
15/04/13 07:07:22 INFO Worker: Running Spark version 1.3.0
15/04/13 07:07:22 INFO Worke
Is there a good resource that explains how Spark jobs get broken down into
tasks and executed?
I just need to get a better understanding of this.
Regards
j
, also the
> paper on Dryad is a good one.
>
>
>
> Thanks
>
> Jerry
>
>
>
> *From:* James King [mailto:jakwebin...@gmail.com]
> *Sent:* Friday, April 17, 2015 3:26 PM
> *To:* user
> *Subject:* Spark Directed Acyclic Graph / Jobs
>
>
>
>
In the web UI I can see some jobs marked as 'skipped'. What does that mean?
Why are these jobs skipped? Do they ever get executed?
Regards
jk
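A small illustration of when this happens (a sketch, not from the thread):
when a job needs a shuffle map stage whose output already exists from an
earlier job, that stage is listed as skipped because its result is reused
rather than recomputed.

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SkippedStagesSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("SkippedStagesSketch").setMaster("local[2]"));

    JavaRDD<String> words = sc.parallelize(Arrays.asList("a", "b", "a", "c"));
    JavaPairRDD<String, Integer> pairs = words.mapToPair(w -> new Tuple2<>(w, 1));
    JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);

    counts.count();   // job 1: runs the shuffle map stage and the result stage
    counts.collect(); // job 2: the map stage's shuffle output already exists,
                      // so the UI shows that stage's work as skipped

    sc.stop();
  }
}

So a skipped stage's work was already done (or its output was otherwise
available); it simply isn't executed again for the current job.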
I'm trying to write some unit tests for my Spark code.
I need to pass a JavaPairDStream to my spark class.
Is there a way to create a JavaPairDStream using Java API?
Also, is there a good resource that covers an approach (or approaches) for
unit testing using Java?
Regards
jk
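One way to build such a stream in a test (a sketch, assuming Spark 1.3's Java
API; the data and batch interval are arbitrary) is queueStream, which turns a
queue of in-memory RDDs into a DStream, followed by mapToPair:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.Queue;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class PairDStreamFixture {

  // Build a JavaPairDStream from in-memory data; each queued RDD becomes one batch.
  public static JavaPairDStream<String, Integer> fixture(JavaStreamingContext ssc) {
    JavaRDD<Tuple2<String, Integer>> rdd = ssc.sparkContext().parallelize(
        Arrays.asList(new Tuple2<String, Integer>("a", 1),
                      new Tuple2<String, Integer>("b", 2)));
    Queue<JavaRDD<Tuple2<String, Integer>>> queue =
        new LinkedList<JavaRDD<Tuple2<String, Integer>>>();
    queue.add(rdd);

    JavaDStream<Tuple2<String, Integer>> stream = ssc.queueStream(queue);
    JavaPairDStream<String, Integer> pairs = stream.mapToPair(t -> t); // identity pair function
    return pairs;
  }

  public static void main(String[] args) throws Exception {
    JavaStreamingContext ssc = new JavaStreamingContext(
        new SparkConf().setAppName("PairDStreamFixture").setMaster("local[2]"),
        Durations.seconds(1));
    fixture(ssc).print();
    ssc.start();
    Thread.sleep(3000); // let a couple of batches run, then shut down
    ssc.stop();
  }
}

In a JUnit test the same fixture method can feed whatever class is under test,
with results collected via foreachRDD and asserted on.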
ming
>
> -
> http://www.slideshare.net/databricks/strata-sj-everyday-im-shuffling-tips-for-writing-better-spark-programs
>
> --
> Emre Sevinç
> http://www.bigindustries.be/
>
>
> On Tue, Apr 21, 2015 at 1:26 PM, James King wrote:
>
>> I'm trying to write some unit t
What's the best way to start up a Spark job as part of starting up the
Spark cluster?
I have a single uber jar for my job and want to make the start-up as easy
as possible.
Thanks
jk
Is there a good resource that covers what kind of chatter (communication)
goes on between the driver, master and worker processes?
Thanks
I'm trying to find out how to set up a resilient Spark cluster.
Things I'm thinking about include:
- How to start multiple masters on different hosts?
- There isn't a conf/masters file from what I can see.
Thank you.
http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Fri, Apr 24, 2015 at 5:01 AM, James King wrote:
>
>> I'm trying to find out how to setup a resilient Spark cluster.
>>
>> Things I'm thinking about i
If I have 5 nodes and I wish to maintain 1 Master and 2 Workers on each
node, then in total I will have 5 Masters and 10 Workers.
Now, to maintain that setup I would like to query Spark regarding the number
of Masters and Workers that are currently available using API calls and then
take some appropriate
documentation thoroughly.
>
> Best
> Ayan
>
> On Sun, Apr 26, 2015 at 6:31 PM, James King wrote:
>
>> If I have 5 nodes and I wish to maintain 1 Master and 2 Workers on each
>> node, so in total I will have 5 master and 10 Workers.
>>
>> Now to maintain
change unexpectedly
> between versions, but you might find it helpful.
>
> Nick
>
> On Sun, Apr 26, 2015 at 9:46 AM michal.klo...@gmail.com
> <michal.klo...@gmail.com> wrote:
>
>> Not sure if there's a spark native way but we've been using consul for
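As a sketch of one non-consul route (treat the URL, port and response shape
as assumptions to verify against your deployment): the standalone master's web
UI typically serves a JSON summary at /json listing the master state and its
workers, which can be polled and parsed:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class MasterStatusCheck {
  public static void main(String[] args) throws Exception {
    // Assumed endpoint: the standalone master web UI's JSON view.
    URL url = new URL("http://master01:8080/json");
    StringBuilder body = new StringBuilder();
    try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line).append('\n');
      }
    }
    // The response carries the master's status and a list of workers with
    // their state, cores and memory; parse it with any JSON library and
    // count the entries that are ALIVE before taking corrective action.
    System.out.println(body);
  }
}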
I renamed spark-defaults.conf.template to spark-defaults.conf
and invoked:
spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
But I still get:
failed to launch org.apache.spark.deploy.worker.Worker:
--properties-file FILE Path to a custom Spark properties file.
Defaul
IR.
>
> On Mon, Apr 27, 2015 at 12:56 PM James King wrote:
>
>> I renamed spark-defaults.conf.template to spark-defaults.conf
>> and invoked
>>
>> spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
>>
>> But I still get
>>
>> failed to launch org.apa
explicitly
Shouldn't Spark just consult ZK and use the active master?
Or is ZK only used during failure?
On Mon, Apr 27, 2015 at 1:53 PM, James King wrote:
> Thanks.
>
> I've set SPARK_HOME and SPARK_CONF_DIR appropriately in .bash_profile
>
> But when I start worker
I have multiple masters running and I'm trying to submit an application
using
spark-1.3.0-bin-hadoop2.4/bin/spark-submit
with this config (i.e. a comma-separated list of master URLs):
--master spark://master01:7077,spark://master02:7077
But I'm getting this exception:
Exceptio
one.html#standby-masters-with-zookeeper
>
> Thanks
> M
>
>
> On Apr 28, 2015, at 8:13 AM, James King wrote:
>
> I have multiple masters running and I'm trying to submit an application
> using
>
> spark-1.3.0-bin-hadoop2.4/bin/spark-submit
>
I'm unclear why I'm getting this exception.
It seems to have recognized that I want to enable event logging but is
ignoring where I want it to log to, i.e. file:/opt/cb/tmp/spark-events, which
does exist.
spark-default.conf
# Example:
spark.master spark://master1:7077,master2:7077
ntLog.dir", it will be "/tmp/spark-events". And this folder does
> not exist.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-04-29 23:22 GMT-07:00 James King :
>
> I'm unclear why I'm getting this exception.
>>
>> It seems to have realized that
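A minimal sketch of the relevant settings (the same keys can go in
spark-defaults.conf; the paths below are the ones from the message above, and
the directory must already exist on whatever filesystem the URI points to):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class EventLogSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("EventLogSketch")
        .setMaster("local[2]")
        // If spark.eventLog.dir is not picked up, Spark falls back to the
        // default /tmp/spark-events, which is the failure described above.
        .set("spark.eventLog.enabled", "true")
        .set("spark.eventLog.dir", "file:/opt/cb/tmp/spark-events");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.stop();
  }
}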
In the O'Reilly book Learning Spark, Chapter 10, section 24/7 Operation,
it talks about 'Receiver Fault Tolerance'.
I'm unsure of what a Receiver is here; from reading, it sounds like when you
submit an application to the cluster in cluster mode, i.e. *--deploy-mode
cluster*, the driver program will run
Many thanks all, your responses have been very helpful. Cheers
On Wed, May 6, 2015 at 2:14 PM, ayan guha wrote:
>
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics
>
>
> On Wed, May 6, 2015 at 10:09 PM, James King wrote:
>
>>
I submitted a Spark Application in cluster mode and now every time I stop
the cluster and restart it the job resumes execution.
I even killed a daemon called DriverWrapper; it stops the app but it resumes
again.
How can I stop this application from running?
an use the kill command in spark-submit to
> shut it down. You’ll need the driver id from the Spark UI or from when you
> submitted the app.
>
> spark-submit --master spark://master:7077 --kill
>
> Thanks,
> Silvio
>
> From: James King
> Date: Wednesday, May 6, 2015 a
I have two hosts, host01 and host02 (let's call them that).
I run one Master and two Workers on host01
I also run one Master and two Workers on host02
Now I have 1 LIVE Master on host01 and a STANDBY Master on host02
The LIVE Master is aware of all Workers in the cluster
Now I submit a Spark application
BTW I'm using Spark 1.3.0.
Thanks
On Fri, May 8, 2015 at 5:22 PM, James King wrote:
> I have two hosts host01 and host02 (lets call them)
>
> I run one Master and two Workers on host01
> I also run one Master and two Workers on host02
>
> Now I have 1 LIVE Master on host
Why does this not work?
./spark-1.3.0-bin-hadoop2.4/bin/spark-submit --class SomeApp --deploy-mode
cluster --supervise --master spark://host01:7077,host02:7077 Some.jar
With exception:
Caused by: java.lang.NumberFormatException: For input string:
"7077,host02:7077"
It seems to accept only one ma
eper then you should set
> your master URL to be
>
> spark://host01:7077,host02:7077
>
> And the property spark.deploy.recoveryMode=ZOOKEEPER
>
> See here for more info:
> http://spark.apache.org/docs/latest/spark-standalone.html#standby-masters-with-zookeeper
>
>
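A sketch of the working form from the Java side (the host names are
placeholders); the same single spark:// scheme followed by comma-separated
host:port pairs is what gets passed to spark-submit --master:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MultiMasterSketch {
  public static void main(String[] args) {
    // One spark:// scheme, then comma-separated host:port pairs; repeating
    // the scheme (spark://host01:7077,spark://host02:7077) is what produces
    // the NumberFormatException above.
    SparkConf conf = new SparkConf()
        .setAppName("MultiMasterSketch")
        .setMaster("spark://host01:7077,host02:7077");
    JavaSparkContext sc = new JavaSparkContext(conf);
    System.out.println("Using master: " + conf.get("spark.master"));
    sc.stop();
  }
}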
I know that it is possible to use ZooKeeper and the file system (not for
production use) to achieve HA.
Are there any other options now or in the near future?
Thanks Akhil,
I'm using Spark in standalone mode so I guess Mesos is not an option here.
On Tue, May 12, 2015 at 1:27 PM, Akhil Das
wrote:
> Mesos has a HA option (of course it includes zookeeper)
>
> Thanks
> Best Regards
>
> On Tue, May 12, 2015 at 4:53 PM, James K
What I want is: if the driver dies for some reason and is restarted, I
want to read only messages that arrived in Kafka following the restart of
the driver program and re-connection to Kafka.
Has anyone done this? any links or resources that can help explain this?
Regards
jk
est Regards
>
> On Tue, May 12, 2015 at 5:15 PM, James King wrote:
>
>> What I want is if the driver dies for some reason and it is restarted I
>> want to read only messages that arrived into Kafka following the restart of
>> the driver program and re-connection to Ka
y-once/blob/master/blogpost.md
>
> If for some reason you're stuck using an earlier version of spark, you can
> accomplish what you want simply by starting the job using a new consumer
> group (there will be no prior state in zookeeper, so it will start
> consuming according to aut
May 12, 2015 at 9:01 AM, James King wrote:
>
>> Thanks Cody.
>>
>> Here are the events:
>>
>> - Spark app connects to Kafka first time and starts consuming
>> - Messages 1 - 10 arrive at Kafka then Spark app gets them
>> - Now driver dies
>> - Messa
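For reference, a sketch of the direct approach with the parameters being
discussed (assuming Spark 1.3's spark-streaming-kafka module; the broker hosts
and topic name are placeholders, and "largest" is only the right
auto.offset.reset if there are no checkpointed offsets and only post-restart
messages are wanted):

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class DirectKafkaSketch {
  public static void main(String[] args) throws Exception {
    JavaStreamingContext jssc = new JavaStreamingContext(
        new SparkConf().setAppName("DirectKafkaSketch").setMaster("local[2]"),
        Durations.seconds(5));

    HashMap<String, String> kafkaParams = new HashMap<String, String>();
    // The direct approach talks to the brokers rather than ZooKeeper,
    // hence metadata.broker.list.
    kafkaParams.put("metadata.broker.list", "broker01:9092,broker02:9092");
    // With no stored offsets, "largest" starts from the latest offsets,
    // i.e. only messages arriving after this (re)start are read.
    kafkaParams.put("auto.offset.reset", "largest");

    HashSet<String> topics = new HashSet<String>(Arrays.asList("cb_topic"));

    JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);
    messages.print();

    jssc.start();
    jssc.awaitTermination();
  }
}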
From: http://spark.apache.org/docs/latest/streaming-kafka-integration.html
I'm trying to use the direct approach to read messages from Kafka.
Kafka is running as a cluster and configured with Zookeeper.
On the above page it mentions:
"In the Kafka parameters, you must specify either *metadata.
I'm trying the Kafka direct approach (for consuming) but when I use only this
config:
kafkaParams.put("group.id", groupdid);
kafkaParams.put("zookeeper.connect", zookeeperHostAndPort + "/cb_kafka");
I get this
Exception in thread "main" org.apache.spark.SparkException: Must specify
metadata.broker.lis
okers in pre-existing Kafka
> project APIs. I don't know why the Kafka project chose to use 2 different
> configuration keys.
>
> On Wed, May 13, 2015 at 5:00 AM, James King wrote:
>
>> From:
>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>
Looking at Consumer Configs in
http://kafka.apache.org/documentation.html#consumerconfigs
The properties *metadata.broker.list* and *bootstrap.servers* are not
mentioned.
Do I need these on the consume side?
On Wed, May 13, 2015 at 3:52 PM, James King wrote:
> Many thanks Cody
ood to go.
>
>
> On Wed, May 13, 2015 at 9:03 AM, James King wrote:
>
>> Looking at Consumer Configs in
>> http://kafka.apache.org/documentation.html#consumerconfigs
>>
>> The properties *metadata.broker.list* or *bootstrap.servers *are not
>> mention
I understand that this port value is randomly selected.
Is there a way to enforce which Spark port a Worker should use?
Indeed, many thanks.
On Wednesday, 13 May 2015, Cody Koeninger wrote:
> I believe most ports are configurable at this point, look at
>
> http://spark.apache.org/docs/latest/configuration.html
>
> search for ".port"
>
> On Wed, May 13, 2015 at 9:38 AM, James Ki
ue (to use) to the worker when it hands it a task to do?
Also, the property spark.executor.port is different from the Worker's
Spark port; how can I make the Worker run on a specific port?
Regards
jk
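A sketch of the application-side port properties (the port numbers are
placeholder values); the Worker daemon's own port is not an application
property and is fixed when the worker process is launched, e.g. via
SPARK_WORKER_PORT in spark-env.sh:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class PortConfigSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("PortConfigSketch")
        .setMaster("spark://host01:7077,host02:7077")
        // Executor- and driver-side ports are application properties in Spark 1.x:
        .set("spark.executor.port", "51000")
        .set("spark.driver.port", "51100");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.stop();
  }
}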
On Wed, May 13, 2015 at 7:51 PM, James King wrote:
> Indeed, many thanks.
>
>
> On
you modify executor properties through a context.
>
> So, master != driver and executor != worker.
>
> Best
> Ayan
>
> On Fri, May 15, 2015 at 7:52 PM, James King wrote:
>
>> So I'm using code like this to use specific ports:
>>
>> val con