We have a similar setup (EKS/S3) and use Promtail to ship pod logs to
Loki.
We haven't tried to get the history UI log links working. Instead we link
to both the history server and the logs from the same job/cluster overview
dashboards in Grafana.
On Wed, Oct 23, 2024 at 3:36 PM karan alang wrote
Hello All,
I have kubeflow spark operator installed on GKE (in namespace - so350), as
well as Spark History Server installed on GKE in namespace shs-350.
The spark job is launched in a separate namespace - spark-apps.
When I launch the spark job, it runs fine and I'm able to see the job
details i
In a nutshell, the culprit for the OOM issue in your Spark driver appears
to be memory leakage or inefficient memory usage within your application.
This could be caused by factors such as:
1. Accumulation of data or objects in memory over time without proper
cleanup.
2. Inefficient data
Hey, do you perform stateful operations? Maybe your state is growing
indefinitely - a screenshot with state metrics would help (you can find it
in Spark UI -> Structured Streaming -> your query). Do you have a
driver-only cluster or do you have workers too? What's the memory usage
profile at worker
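If a screenshot is awkward to share, the same state metrics can be pulled
programmatically from the query progress. A minimal sketch, assuming a running
StreamingQuery (names below are illustrative):

import org.apache.spark.sql.streaming.StreamingQuery

// Print state-store metrics for a running Structured Streaming query.
// `lastProgress` is null until the first micro-batch has completed.
def printStateMetrics(query: StreamingQuery): Unit = {
  val progress = query.lastProgress
  if (progress != null) {
    progress.stateOperators.foreach { op =>
      println(s"state rows = ${op.numRowsTotal}, state memory = ${op.memoryUsedBytes} bytes")
    }
  }
}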
Hi All,
I am using PySpark Structured Streaming with Azure Databricks for a data
load process.
In the pipeline I am using a job cluster and I am running only one
pipeline; I am getting an OUT OF MEMORY issue when it runs for a
long time. When I inspect the metrics of the cluster I found that,
id still
> applicable in cluster mode. Thanks in advance for your further
> clarification.
>
> --
> *From:* Pol Santamaria
> *Sent:* Friday, March 6, 2020 12:59 AM
> *To:* James Yu
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark driver th
Hi james,
You can configure the Spark Driver to use more than a single thread. It is
something that depends on the application, but the Spark driver can take
advantage of multiple threads in many situations. For instance, when the
driver program gathers or sends data to the workers.
So yes, if
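A minimal sketch of that idea, assuming a plain Scala application (the RDDs and
names are illustrative): each Future runs in its own driver thread and submits an
independent job, so the driver can genuinely use more than one core.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import org.apache.spark.sql.SparkSession

object MultiThreadedDriver {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("multi-threaded-driver").getOrCreate()
    val sc = spark.sparkContext

    val rddA = sc.parallelize(1 to 1000000)
    val rddB = sc.parallelize(1 to 1000000)

    // Two actions submitted from two driver threads; Spark schedules both jobs concurrently.
    val jobs = Seq(Future(rddA.sum()), Future(rddB.distinct().count()))
    println(Await.result(Future.sequence(jobs), 10.minutes))

    spark.stop()
  }
}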
Hi,
Does a Spark driver always work as a single thread?
If yes, does it mean asking for more than one vCPU for the driver is wasteful?
Thanks,
James
Hi all,
Recently, our Spark application's (2.3.1) driver has been crashing before
exiting with the following error.
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error
-- Forwarded message -
From: Prudhvi Chennuru (CONT)
Date: Thu, Jan 31, 2019, 5:01 PM
Subject: Fwd: Spark driver pod scheduling fails on auto scaled node
To:
Hi,
I am using kubernetes *v 1.11.5* and spark *v 2.3.0*,
*calico (daemonset)* as the overlay network plugin and the kubernetes *cluster
autoscaler* feature to autoscale the cluster if needed. When the cluster is
auto scaling, calico pods are scheduled on those nodes but they are not ready
for 40 to 50 se
I got this error from the spark driver; it seems that I should increase the
memory in the driver although it's 5g (and 4 cores) right now. It seems
weird to me because I'm not using Kryo or broadcast in this process, but in
the log there are references to Kryo and broadcast.
How could I figu
Resurfacing the question to get more attention.
Hello,
I'm running a Spark 2.3 job on a kubernetes cluster.
kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3",
GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean",
BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4", Compiler:"gc",
Platform:"da
that can exist before the terminated pod garbage
collector starts deleting terminated pods. If <= 0, the terminated pod
garbage collector is disabled.
On Wed, May 23, 2018, 8:34 AM purna pradeep wrote:
> Hello,
>
> Currently I observe dead pods are not getting garbage collected (a
Hello,
Currently I observe that dead pods are not getting garbage collected (aka spark
driver pods which have completed execution). So pods could sit in the
namespace for weeks, potentially. This makes listing, parsing, and reading
pods slower, as well as leaving junk sitting on the cluster.
I believe
I think a pod disruption budget might actually work here. It can select the
spark driver pod using a label. Using that with a minAvailable value that's
appropriate here could do it.
In a more general sense, we do plan on some future work to support driver
recovery which should help long ru
Hi,
What would be the recommended approach to wait for spark driver pod to
complete the currently running job before it gets evicted to new nodes
while maintenance on the current node is going on (kernel upgrade, hardware
maintenance, etc.) using the drain command?
I don’t think I can use
So let's say I have a chained path in
spark.driver.extraClassPath/spark.executor.extraClassPath such as
/path1/*:/path2/*, and I have different versions of the same jar under those 2
directories. How does Spark pick the version of the jar to use, from /path1/*?
Thanks.
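For what it's worth, entries earlier on the classpath normally win with standard
JVM class loading, so with /path1/*:/path2/* the copy under /path1 should be picked
up first. One way to verify on the driver is to ask the class where it was loaded
from; a sketch (com.example.SomeSharedClass is a placeholder for a class that exists
in both jars):

// Placeholder class name; substitute a class that lives in both jars.
val loadedFrom = Class.forName("com.example.SomeSharedClass")
  .getProtectionDomain.getCodeSource.getLocation
println(s"Loaded from: $loadedFrom")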
Subject: RE: Spark driver CPU usage
Does that configuration parameter affect the CPU usage of the driver? If it
does, we have that property unchanged from its default value of "1" yet the
same behaviour as before.
-Original Message-
From: Rohit Verma [mailto:rohit.ve...@roki
Use conf spark.task.cpus to control number of cpus to use in a task.
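A sketch of setting it (worth hedging: spark.task.cpus governs how many CPU slots
each task reserves on an executor when scheduling; it is not a cap on how much CPU
the driver itself uses):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("word-count")
  .config("spark.task.cpus", "1")   // CPU slots reserved per task on an executor
  .getOrCreate()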
On Mar 1, 2017, at 5:41 PM, Phadnis, Varun wrote:
>
> Hello,
>
> Is there a way to control CPU usage for driver when running applications in
> client mode?
>
> Currently we are observing that the driver occupies all the co
Hello,
Is there a way to control CPU usage for driver when running applications in
client mode?
Currently we are observing that the driver occupies all the cores. Launching
just 3 instances of the driver of the WordCount sample application concurrently on
the same machine brings the usage of its 4 cor
Hi, I think you are using the local mode of Spark. There are mainly four
modes, which are local, standalone, yarn and Mesos. Also, "blocks" is
relative to HDFS and "partitions" is relative to Spark.
liangyihuai
---Original---
From: "Jacek Laskowski"
Date: 2017/2/25 02:45:20
To: "prithish"
Cc: "user"
Subject: Re: RDD blocks on Spark Driver
Hi,
I guess you're using local mode, which has only one executor, called the driver.
Is my guess correct?
Jacek
On 23 Feb 2017 2:03 a.m., wrote:
> Hello,
>
> Had a question. When I look at the executors tab in Spark UI, I notice
> that some RDD blocks are assigned to the driver as well. Can someone p
Hello,
Had a question. When I look at the executors tab in Spark UI, I notice that
some RDD blocks are assigned to the driver as well. Can someone please tell me
why?
Thanks for the help.
Hi,
I use Spark Standalone cluster and I submitted my application in cluster
mode. When I go the Spark Master UI there is a table layout for "Running
Applications" and in that table there is a column called "Name" which has
the value of my app name and when I click on the link it doesn't work
beca
Corresponding HBase bug: https://issues.apache.org/jira/browse/HBASE-12629
On Wed, Nov 23, 2016 at 1:55 PM, Mukesh Jha wrote:
> The solution is to disable the region size calculation check.
>
> hbase.regionsizecalculator.enable: false
>
> On Sun, Nov 20, 2016 at 9:29 PM, Mukesh Jha
> wrote:
>
>> A
The solution is to disable the region size calculation check.
hbase.regionsizecalculator.enable: false
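A sketch of where that flag can go when building the RDD, assuming the stock
TableInputFormat (the table name is illustrative):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

def hbaseRdd(sc: SparkContext): RDD[(ImmutableBytesWritable, Result)] = {
  val hbaseConf = HBaseConfiguration.create()
  hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")       // illustrative table name
  hbaseConf.set("hbase.regionsizecalculator.enable", "false")   // skip the region size calculation
  sc.newAPIHadoopRDD(hbaseConf, classOf[TableInputFormat],
    classOf[ImmutableBytesWritable], classOf[Result])
}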
On Sun, Nov 20, 2016 at 9:29 PM, Mukesh Jha wrote:
> Any ideas folks?
>
> On Fri, Nov 18, 2016 at 3:37 PM, Mukesh Jha
> wrote:
>
>> Hi
>>
>> I'm accessing multiple regions (~5k) of an HBase tabl
Any ideas folks?
On Fri, Nov 18, 2016 at 3:37 PM, Mukesh Jha wrote:
> Hi
>
> I'm accessing multiple regions (~5k) of an HBase table using spark's
> newAPIHadoopRDD. But the driver is trying to calculate the region size of
> all the regions.
> It is not even reusing the hconnection and creting a
Hi
I'm accessing multiple regions (~5k) of an HBase table using Spark's
newAPIHadoopRDD, but the driver is trying to calculate the region size of
all the regions.
It is not even reusing the HConnection and is creating a new connection for
every request (see below), which is taking lots of time.
Is the
40550 MRAppMaster(this is MR APP MASTER container)
*Spark Related processes:*
40602 SparkSubmit
40875 CoarseGrainedExecutorBackend
40846 CoarseGrainedExecutorBackend
40815 ExecutorLauncher
When Spark app is started via SparkLauncher#startApplication(), Spark
driver (inside SparkSubmit) is st
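For reference, a minimal sketch of that launch path using the
org.apache.spark.launcher API (jar path and main class are placeholders):

import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

val handle: SparkAppHandle = new SparkLauncher()
  .setAppResource("/path/to/app.jar")    // placeholder
  .setMainClass("com.example.Main")      // placeholder
  .setMaster("yarn")
  .setDeployMode("client")
  .startApplication()

// The child SparkSubmit process reports state changes back through the handle.
while (!handle.getState.isFinal) Thread.sleep(1000)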
>>>> I have attached the jstack dump here. I do a simple MapToPair and reduceByKey and I
>>>> have a window interval of 1 minute (60000 ms) and batch interval of 1s
>>>> (1000 ms). This is generating a lot of threads, at least 5 to 8 threads per
>>>> second, and the total number of threads is monotonically increasing. So just
>>>> for tweaking purposes I changed my window interval to 1 min (60000 ms) and
>>>> batch interval of 10s (10000 ms); this looked a lot better but is still not
>>>> ideal. At the very least it is not monotonic anymore (it goes up and down). Now
>>>> my question really is: how do I tune this so that my number of threads is
>>>> optimal while satisfying the window interval of 1 minute (60000 ms) and
>>>> batch interval of 1s (1000 ms)?
>>>>
>>>> This jstack dump is taken after running my spark driver program for 2
>>>> mins and there are about 1000 threads.
>>>>
>>>> Thanks!
that it doesn't spawn any other threads. It only calls MapToPair,
ReduceByKey, forEachRDD and Collect functions.

import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.receiver.Receiver;

public class NSQReceiver extends Receiver<String> {   // element type assumed to be String

    private String topic = "";

    public NSQReceiver(String topic) {
        super(StorageLevel.MEMORY_AND_DISK_2());
        this.topic = topic;
    }

    @Override
    public void onStart() {
        new Thread() {
            @Override public void run() {
                receive();
            }
        }.start();
    }

    // onStop() and receive() (the consumption loop) were not included in this excerpt.
    @Override
    public void onStop() {
    }

    private void receive() {
    }
}
Environment info:
Java 8
Scala 2.11.8
Spark 2.0.0
More than happy to share any other info you may need.
On Mon, Oct 31, 2016 at 11:05 AM, Jakob Odersky wrote:
> > how do I tell my spark driver program to not create so many?
>
> This may de
> how do I tell my spark driver program to not create so many?
This may depend on your driver program. Do you spawn any threads in
it? Could you share some more information on the driver program, spark
version and your environment? It would greatly help others to help you
On Mon, Oct 31, 2
unlimited
max user processes (-u) 120242
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
So at this point I do understand that I am running out of memory due to the
allocation of threads, so my biggest question is how do I tell my spark
driver progra
> ps -elfT | grep "spark-driver-program.jar" | wc -l
>
> The result is around 32K. why does it create so many threads how can I
> limit this?
>
When I do
ps -elfT | grep "spark-driver-program.jar" | wc -l
the result is around 32K. Why does it create so many threads, and how can I
limit this?
'Could not connect to server on %s' % nodes[amHost]
--
Masood Krohy, Ph.D.
Data Scientist, Intact Lab-R&D
Intact Financial Corporation
De :Steve Loughran
A : Masood Krohy
Cc :"user@spark.apache.org"
Date : 2016-10-24 17:09
Obj
Hi everyone,
Is there a way to set the IP address/hostname that the Spark Driver is
going to be running on when launching a program through spark-submit in
yarn-cluster mode (PySpark 1.6.0)?
I do not see an option for this. If not, is there a way to get this IP
address after the Spark app has
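I'm not aware of a way to pin the driver's address up front in yarn-cluster mode,
but once the application is running you can read back where it landed, since Spark
fills in spark.driver.host/port at startup. A sketch in Scala (the thread is about
PySpark, so treat this as illustrative only):

import org.apache.spark.{SparkConf, SparkContext}

val sc = SparkContext.getOrCreate(new SparkConf().setAppName("driver-host-demo"))
val driverHost = sc.getConf.get("spark.driver.host")
val driverPort = sc.getConf.get("spark.driver.port")
println(s"Driver is running at $driverHost:$driverPort")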
Hi,
I always underestimated the significance of setting spark.driver.memory.
According to the docs,
it is the amount of memory to use for the driver process, i.e. where
SparkContext is initialized (e.g. 1g, 2g).
I was running my application using Spark Standalone, so the argument about
Local mode
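One related subtlety (my addition, not from this message): in client mode the driver
JVM is already running by the time SparkConf is read, so spark.driver.memory set
programmatically has no effect there and has to be passed via --driver-memory or
spark-defaults.conf. The programmatic form only applies when the cluster manager
launches the driver for you; a sketch:

import org.apache.spark.sql.SparkSession

// Only effective when the driver JVM is created by the cluster manager (cluster
// deploy mode); in client mode pass --driver-memory to spark-submit instead.
val spark = SparkSession.builder()
  .appName("driver-memory-demo")
  .config("spark.driver.memory", "4g")
  .getOrCreate()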
Hi,
I'm running a job on Spark 1.5.2 and I get an OutOfMemoryError on broadcast
variable access. The thing is, I am not sure I understand why the
broadcast keeps growing and why it does so at this place in the code.
Basically, I have a large input file, each line having a key. I group my
lines by key to h
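Not necessarily the cause here, but if broadcasts are created repeatedly it is worth
knowing they can be released explicitly. A general sketch of the lifecycle APIs (the
data is illustrative):

import org.apache.spark.SparkContext

def broadcastLifecycle(sc: SparkContext): Unit = {
  val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
  // ... use lookup.value inside transformations ...
  lookup.unpersist()   // drop the cached copies on the executors; re-broadcast lazily if reused
  lookup.destroy()     // release everything; the broadcast cannot be used after this
}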
Saurav,
We have the same issue. Our application runs fine on 32 nodes with 4 cores
each and 256 partitions but gives an OOM on the driver when run on 64 nodes
with 512 partitions. Did you get to know the reason behind this behavior or
the relation between number of partitions and driver RAM usage?
Cache defaults to MEMORY_ONLY. Can you try different storage levels,
i.e., MEMORY_ONLY_SER or even DISK_ONLY? You may want to use persist()
instead of cache().
Or there is the experimental storage level OFF_HEAP, which might also help.
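A sketch of what that looks like on an RDD (storage level names as defined in
org.apache.spark.storage.StorageLevel):

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

def persistExample(sc: SparkContext): Unit = {
  val rdd = sc.parallelize(1 to 1000000).map(_ * 2)
  // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
  rdd.persist(StorageLevel.MEMORY_ONLY_SER)   // serialized in memory: smaller, but more CPU
  // rdd.persist(StorageLevel.DISK_ONLY)      // keep it entirely on disk
  // rdd.persist(StorageLevel.OFF_HEAP)       // experimental off-heap storage
  println(rdd.count())
}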
On Tue, Jul 19, 2016 at 11:08 PM, Saurav Sinha
wrote:
> H
Hi,
I have set driver memory to 10 GB and the job ran with an intermediate failure
which was recovered by spark.
But I still want to know: if the number of partitions increases, does driver RAM
need to be increased, and what is the ratio of number of partitions to RAM?
@RK: I am using cache on the RDD. Is this the reason for the high RAM utiliz
Just want to see if this helps.
Are you doing heavy collects and persisting that? If so, you might
want to parallelize that collection by converting it to an RDD.
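A sketch of that suggestion (names are illustrative): if a large result has already
been collected and is being persisted on the driver, converting it back into an RDD
moves the data, and the memory cost, onto the executors.

import org.apache.spark.SparkContext

def keepItDistributed(sc: SparkContext, collected: Seq[(String, Long)]): Unit = {
  // The collected sequence lives in the driver JVM; parallelize() redistributes it.
  val asRdd = sc.parallelize(collected).persist()
  println(asRdd.count())
}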
Thanks,
RK
On Tue, Jul 19, 2016 at 12:09 AM, Saurav Sinha
wrote:
> Hi Mich,
>
>1. In what mode are you running the spark stan
Hi Mich,
1. In what mode are you running: spark standalone, yarn-client, yarn
cluster, etc.?
Ans: spark standalone
2. You have 4 nodes with each executor having 10G. How many actual
executors do you see in the UI (port 4040 by default)?
Ans: There are 4 executors, on which I am using 8 cores
Can you please clarify:
1. In what mode are you running: spark standalone, yarn-client, yarn
cluster, etc.?
2. You have 4 nodes with each executor having 10G. How many actual
executors do you see in the UI (port 4040 by default)?
3. What is master memory? Are you referring to driver memo
I have set --driver-memory 5g. I need to understand whether, as the number of
partitions increases, driver memory needs to be increased, and what the best
ratio of number of partitions to driver memory is.
On Mon, Jul 18, 2016 at 4:07 PM, Zhiliang Zhu wrote:
> try to set --driver-memory xg , x would be as large as can be set
Try to set --driver-memory xg, x would be as large as can be set.
On Monday, July 18, 2016 6:31 PM, Saurav Sinha
wrote:
Hi,
I am running a spark job.
Master memory - 5G, executor memory 10G (running on 4 nodes)
My job is getting killed as the number of partitions increases to 20K.
16/07/18 14:53:13 I
Hi,
I am running a spark job.
Master memory - 5G
Executor memory 10G (running on 4 nodes)
My job is getting killed as the number of partitions increases to 20K.
16/07/18 14:53:13 INFO DAGScheduler: Got job 17 (foreachPartition at
WriteToKafka.java:45) with 13524 output partitions (allowLocal=false)
16/07/18
Hi, I have the following issue:
I have zeppelin, which is set up in yarn-client mode. The notebook has been in
Running state for a long period of time with 0% done, and I do not see even an
accepted application in yarn.
To be able to understand what's going on, I need the logs of the spark driver,
which is trying to conne
Hi Ted,
Perhaps this might help? Thanks for your response. I am trying to
access/read binary files stored over a series of servers.
Line used to build RDD:
val BIN_pairRDD: RDD[(BIN_Key, BIN_Value)] =
  spark.newAPIHadoopFile("not.used", classOf[BIN_InputFormat],
    classOf[BIN_Key], classOf[BIN_Value])
can check, or change, to force the driver to send these tasks to
the right workers?
Thanks!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-driver-assigning-splits-to-incorrect-workers-tp27261.html
Sent from the Apache Spark User List mailing list
Hey guys,
I have a problem with memory because over 90% of my spark drivers end up
started on one of my nine spark nodes.
So now I am looking for a possibility to define the node the spark driver
will be started on when using spark-submit, or to set it somewhere in the code.
Is this possible
/create-cluster.html)
doesn’t have a similar argument.
Gerhard
From: Sonal Goyal [mailto:sonalgoy...@gmail.com]
Sent: Wed, Mar 09, 2016 04:28
To: Wang, Daoyuan
Cc: Gerhard Fiedler; user@spark.apache.org
Subject: Re: How to add a custom jar file to the Spark driver?
Hi Gerhard,
I just stumbled upon
updated SparkConf to
instantiate your SparkContext.
Thanks,
Daoyuan
From: Gerhard Fiedler [mailto:gfied...@algebraixdata.com]
Sent: Wednesday, March 09, 2016 5:41 AM
To: user@spark.apache.org
Subject: How to add a custom jar file to the Spark driver?
We're running Spark 1.6.0 on EMR, in YARN c
We're running Spark 1.6.0 on EMR, in YARN client mode. We run Python code, but
we want to add a custom jar file to the driver.
When running on a local one-node standalone cluster, we just use
spark.driver.extraClassPath and everything works:
spark-submit --conf spark.driver.extraClassPath=/path
what ports need to be exposed. With mesos we had a lot of problems with
>> container networking but yes the --net=host is a shortcut.
>>
>> Tamas
>>
>>
>>
>>> On 4 March 2016 at 22:37, yanlin wang wrote:
>>> We would like to run multiple spark d
We would like to run multiple spark drivers in docker containers. Any suggestion
for the port expose and network settings for docker so the driver is reachable by
the worker nodes? --net="host" is the last thing we want to do.
Thx
Yanlin
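One option (my assumption, not something stated in this thread): pin the driver's
ports in the configuration so they can be published from the container explicitly
instead of falling back to --net=host. Host name and port numbers below are
placeholders.

import org.apache.spark.SparkConf

// Fixed, publishable ports for a driver running inside a container.
val conf = new SparkConf()
  .set("spark.driver.host", "driver.example.com")   // address the workers can reach
  .set("spark.driver.port", "7078")                 // driver RPC endpoint
  .set("spark.blockManager.port", "7079")           // block manager
  .set("spark.ui.port", "4040")                     // web UI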
>
> com.sun.jersey.api.core.ScanningResourceConfig.init(ScanningResourceConfig.java:79)
> at
>
> com.sun.jersey.api.core.PackagesResourceConfig.init(PackagesResourceConfig.java:104)
> at
>
> com.sun.jersey.api.core.PackagesResourceConfig.<init>(PackagesResourceConfig.java:78)
>
Please help.
Thanks in advance.
Regards,
Rakesh
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Error-getting-response-from-spark-driver-rest-APIs-java-lang-IncompatibleClassChangeError-Implementis-tp25724.html
Sent from the Apache Spark User List mailing list