Actually, disregard! I forgot that
spark.dynamicAllocation.cachedExecutorIdleTimeout defaults to infinity,
so lowering it should solve the problem :)
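For anyone who hits the same thing, a minimal spark-defaults.conf sketch of the fix described above (the timeout values are illustrative assumptions, not recommendations):

```
# spark-defaults.conf (timeout values are illustrative)
spark.dynamicAllocation.enabled                    true
spark.dynamicAllocation.executorIdleTimeout        60s
# Defaults to infinity, so cached executors are never released:
spark.dynamicAllocation.cachedExecutorIdleTimeout  120s
```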
Mark.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-1-6-0-Streaming-Persistance-Bug-tp161
Calling unpersist on an RDD in a Spark Streaming application does not
actually unpersist the blocks from memory and/or disk. After the RDD has
been processed in a .foreach(rdd) call, I attempt to unpersist the RDD since
it is no longer useful to keep in memory/disk. This mainly causes a problem
wi
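A minimal sketch of the pattern described above, assuming a PySpark streaming job (the input path is hypothetical, and this needs a running Spark deployment, so it's illustrative rather than runnable standalone):

```
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="unpersist-demo")
ssc = StreamingContext(sc, batchDuration=10)

def process(rdd):
    rdd.persist()    # cache the blocks for reuse within this batch
    rdd.count()      # some work that benefits from the cached blocks
    rdd.unpersist()  # should free the blocks, but reportedly does not

ssc.textFileStream("hdfs:///some/input/dir").foreachRDD(process)  # hypothetical path
ssc.start()
ssc.awaitTermination()
```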
I reported this in the 1.6 preview thread, but wouldn't mind if someone could
confirm that ctrl-c is no longer keyboard-interrupting / clearing the current
line of input in the pyspark shell. I saw the change that kills the
currently running job when using ctrl+c, but now the only way to clear
Nice! Built and testing on CentOS 7 on a Hadoop 2.7.1 cluster.
One thing I've noticed is that KeyboardInterrupts are now ignored? Is that
intended? I started typing a line out and then changed my mind and wanted
to issue the good old ctrl+c to interrupt, but that didn't work.
Otherwise haven't s
Unfortunately, using executor memory to prevent multiple executors from the
same framework on a node would inherently mean setting just over half the
available worker memory for each node. So if each node had 32GB of
worker memory, then the application would need to set 17GB to absolutely
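The arithmetic above can be sketched directly (32GB and 17GB are the figures from the message; the helper name is mine):

```python
def executors_per_worker(executor_memory_gb, worker_memory_gb=32):
    """How many executors of a given size fit in one worker's memory."""
    return worker_memory_gb // executor_memory_gb

# At exactly half (16GB), two executors from the same application
# can still land on one 32GB worker:
assert executors_per_worker(16) == 2
# Just over half (17GB) leaves no room for a second executor:
assert executors_per_worker(17) == 1
```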
Hi Richard,
Thanks for the response.
I should have added that the specific case where this becomes a problem is
when one of the executors for that application is lost/killed prematurely,
and the application attempts to spawn a new executor without
consideration as to whether an executor alrea
Regarding the 'spark.executor.cores' config option in a Standalone spark
environment, I'm curious about whether there's a way to enforce the
following logic:
- Max cores per executor = 4
- Max executors PER application PER worker = 1
In order to force better balance across all workers, I want
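For reference, the closest standalone-mode knobs I'm aware of look like this (values are illustrative; as far as I know there is no direct "one executor per application per worker" setting, which is exactly the gap described above):

```
# spark-defaults.conf (values are illustrative)
spark.executor.cores  4    # cap cores per executor
spark.cores.max       16   # cap total cores per application
# Scheduling spread across workers is influenced by the master-side
# spark.deploy.spreadOut setting (true by default in standalone mode).
```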
My apologies for mixing up what was being referred to in that case! :)
Mark.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/If-you-use-Spark-1-5-and-disabled-Tungsten-mode-tp14604p14629.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
Are you referring to spark.shuffle.manager=tungsten-sort? If so, we saw the
default value as still being the regular sort, and since it was only
first introduced in 1.5, we were actually waiting a bit to see if anyone
ENABLED it (as opposed to DISABLING it), since it's disabled by default! :)
I rec
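For anyone wanting to opt in, the setting in question looks like this (a sketch, not a recommendation):

```
# spark-defaults.conf — explicitly opting in to the Tungsten shuffle in 1.5
spark.shuffle.manager  tungsten-sort   # the 1.5 default remains "sort"
```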
Built and tested on CentOS 7, Hadoop 2.7.1 (built for the 2.6 profile),
Standalone, without any problems. Re-tested dynamic allocation specifically.
"Lost executor" messages are still an annoyance, since they're expected to
occur with dynamic allocation and shouldn't WARN/ERROR as they do now,
however
Just a heads up that this RC1 release is still appearing as "1.5.0-SNAPSHOT".
(Not just me, right..?)
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-0-RC1-tp13780p13792.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
Turns out it was a mix of user error as well as a bug in the sbt/sbt build
that has since been fixed in the current 1.5 branch (I built from this
commit: b4f4e91c395cb69ced61d9ff1492d1b814f96828).
I've been testing out the dynamic allocation specifically and it's looking
pretty solid! Haven't come
Has anyone had success using this preview? We were able to build the preview
and start the spark master, but were unable to connect any Spark workers to it.
We kept receiving "AkkaRpcEnv address in use" while attempting to connect the
spark-worker to the master. Also confirmed that the work
We tested this out on our dev cluster (Hadoop 2.7.1 + Spark 1.4.0), and it
looks great! I might also be interested in contributing to it when I get a
chance! Keep up the awesome work! :)
Mark.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spree-Live
Hi Jerry,
Thanks for the quick response! Looks like I'll need to come up with an
alternative solution in the meantime, since I'd like to avoid the other
input streams + WAL approach. :)
Thanks again,
Mark.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble
Hello,
I was interested in creating a StreamingContext textFileStream-based job
which runs for long durations and can also recover from prolonged driver
failure... It seems like StreamingContext checkpointing is mainly used for
the case where the driver dies during the processing of an RDD, and t
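A sketch of the driver-recovery pattern in question, assuming PySpark (the paths and batch interval are hypothetical, and this needs a real Spark deployment to run):

```
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

CHECKPOINT_DIR = "hdfs:///checkpoints/textfile-job"  # hypothetical path

def create_context():
    sc = SparkContext(appName="textfile-stream")
    ssc = StreamingContext(sc, batchDuration=60)
    ssc.checkpoint(CHECKPOINT_DIR)
    ssc.textFileStream("hdfs:///incoming").count().pprint()  # hypothetical path
    return ssc

# After a driver failure, this restores the context from the checkpoint
# instead of calling create_context() again from scratch.
ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_context)
ssc.start()
ssc.awaitTermination()
```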
I've noticed a couple of oddities with the pyspark.daemon processes which are
causing us some memory problems within some of our heavy Spark jobs, especially
when they run at the same time...
It seems that there is typically a 1-to-1 ratio of pyspark.daemon processes to
cores per executor during aggregations. By d
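For reference, the knobs that bound Python worker memory (the values shown are, to my knowledge, the documented defaults, not recommendations):

```
# spark-defaults.conf (values shown are the documented defaults)
spark.python.worker.memory  512m   # per-worker aggregation memory before spilling to disk
spark.python.worker.reuse   true   # reuse Python workers instead of forking one per task
```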