Re: Mesos checkpointing

2017-05-24 Thread Michael Gummelt
then waiting for spark jobs to recover, then rolling another agent is not at all practical. It is a huge benefit if we can just update the agents in bulk (or even sequentially, but only waiting for the mesos agent to recover). On Wed, May 24, 2017 at 11:17 AM Michael

Re: Mesos checkpointing

2017-05-24 Thread Michael Gummelt
sosSchedulerUtils#createSchedulerDriver allows checkpointing, but only org.apache.spark.scheduler.cluster.mesos.MesosClusterScheduler uses it. Is there a reason for that? - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org -- Michael Gummelt Software Engineer Mesosphere

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-26 Thread Michael Gummelt
It seems that CPU usage is just a "label" for an executor on Mesos. Where's this in the code? Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark Follow me at

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-26 Thread Michael Gummelt
On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane wrote: We would be interested in the results if you give Dynamic Allocation with Mesos a try! - Original Mail -

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
Regards Sumit Chawla On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt wrote: >> "I should presume that the number of executors should be less than the number of tasks." No. Each executor runs 0 or more tasks.

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
When the program starts running, the Mesos UI shows 48 tasks and 48 CPUs allocated to the job. As the tasks get done, the number of active tasks starts decreasing. However, the number of CPUs does not decrease proportionally. When the job was about to finish, there was a single remaining task, yet the CPU count was still 20. My question is: why is there no one-to-one mapping between tasks and CPUs in fine-grained mode? How can these CPUs be released when the job is done, so that other jobs can start? Regards Sumit Chawla
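For context, the mode under discussion was selected via the spark.mesos.coarse setting. A minimal sketch, assuming a Spark version that still supports fine-grained mode (it was later deprecated); the master URL and application file are placeholders:

```shell
# Fine-grained mode (one Mesos task per Spark task) -- sketch only;
# "host:5050" and "my_app.py" are placeholders, not from the thread.
spark-submit --master mesos://host:5050 \
  --conf spark.mesos.coarse=false \
  my_app.py
```

In fine-grained mode, CPUs were released per Spark task, which is what the discussion above about CPU counts not shrinking proportionally refers to.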

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
er when there is demand. This feature is particularly useful if multiple applications share resources in your Spark cluster. - Original Mail - From: "Sumit Chawla" To: "Michael Gummelt" Cc: u...@mesos.apache.org, "Dev", "U

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
n Fine grained? How can these CPUs be released when the job is done, so that other jobs can start? Regards Sumit Chawla

Re: driver in queued state and not started

2016-12-06 Thread Michael Gummelt
hould I check? Thanks, Jared (韦煜) Software developer Interested in open source software, big data, Linux

Re: Two questions about running spark on mesos

2016-11-14 Thread Michael Gummelt
submitted a long running job successfully. Then I want to kill the job. How could I do that? Are there any commands similar to those for launching Spark on YARN? Thanks, Jared (韦煜) Software developer Interested in open source software, big data, Linux

Re: [ANNOUNCE] Announcing Spark 2.0.1

2016-10-05 Thread Michael Gummelt
load Apache Spark 2.0.1, visit http://spark.apache.org/downloads.html We would like to acknowledge all community members for contributing patches to this release. -- Cheers, Praj

Re: [VOTE] Release Apache Spark 2.0.1 (RC3)

2016-09-28 Thread Michael Gummelt
fixed. Please shout if you disagree.

Re: Mesos is now a maven module

2016-08-30 Thread Michael Gummelt
e others (e.g. only enabled when the YARN code changes). -- Marcelo

Re: Spark Kerberos proxy user

2016-08-30 Thread Michael Gummelt
- We have to load every RDD that Spark core reads over kerberized HDFS without breaking the Spark API. As you can see, we have a "special" requirement: we need to set the proxy user per job over the same Spark context. Do you have any ideas to cover it?

Re: Mesos is now a maven module

2016-08-26 Thread Michael Gummelt
introduced by YARN). I think Standalone should follow the same steps. WDYT? Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski

Mesos is now a maven module

2016-08-26 Thread Michael Gummelt
Hello devs, Much like YARN, Mesos has been refactored into a Maven module. So when building, you must add "-Pmesos" to enable Mesos support. The pre-built distributions from Apache will continue to enable Mesos. PR: https://github.com/apache/spark/pull/14637 Cheers -- Micha
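With the new profile, a Mesos-enabled build looks roughly like the following sketch. The -Pmesos flag is the one named in the announcement; the other flags shown are just the usual Spark build options, included only as an example:

```shell
# Build Spark with the Mesos profile enabled (sketch; -DskipTests and
# "clean package" are the standard build options, not requirements).
./build/mvn -Pmesos -DskipTests clean package
```

Without -Pmesos, the resulting build simply omits the Mesos scheduler backend.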

SASL Support

2016-08-08 Thread Michael Gummelt
However, it seems that RPC can be SASL encrypted as well: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala#L64 Is this accurate? If so, I'll submit a PR to update the docs.
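For reference, SASL authentication and encryption are toggled through the spark.authenticate settings. A hedged sketch; the secret value and application file are placeholders, not from the thread:

```shell
# Enable SASL authentication and wire encryption (sketch;
# "my-secret" and "my_app.py" are placeholders).
spark-submit \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=my-secret \
  --conf spark.authenticate.enableSaslEncryption=true \
  my_app.py
```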

Re: Build changes after SPARK-13579

2016-07-19 Thread Michael Gummelt
This line: "build/sbt clean assembly" should also be changed, right? On Tue, Jul 19, 2016 at 1:18 AM, Sean Owen wrote: > If the change is just to replace "sbt assembly/assembly" with "sbt package", done. LMK if there are more edits. > On Mon,
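After SPARK-13579, the sbt invocations discussed above would look roughly like this sketch, with "package" replacing the old "assembly/assembly" target:

```shell
# Old (pre-SPARK-13579):
#   build/sbt clean assembly
# New:
build/sbt clean package
```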

Re: Build changes after SPARK-13579

2016-07-18 Thread Michael Gummelt
yarn shuffle service). -- Marcelo

Re: [VOTE] Release Apache Spark 2.0.0 (RC4)

2016-07-14 Thread Michael Gummelt
main release, so we do not need to block the release due to documentation errors either. Note: There was a mistake made during "rc3" preparation, and as a result there is no "rc3", but only "rc4".

Re: Spark performance regression test suite

2016-07-11 Thread Michael Gummelt
a worthwhile contribution? Is anyone working on this? Do we have a Jira issue for it? I cannot commit to taking charge of such a project. I just thought it would be a great contribution for someone who does have the time

Remote JAR download in client mode

2016-05-10 Thread Michael Gummelt
g this is a bug. Can someone confirm that this is in fact a bug? If so, I'm happy to submit a PR.

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
, Apr 28, 2016 at 11:19 AM, Mark Hamstra wrote: So you are only considering the case where your set of HDFS nodes is disjoint from your dynamic set of Spark Worker nodes? That would seem to be a pretty significant sacrifice of data locality. On Thu, Apr 28, 2016 at

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
an attractive idea in theory; in practice I think you are substantially overestimating HDFS' ability to handle a lot of small, ephemeral files. It has never really been optimized for that use case. On Thu, Apr 28, 2016 at 11:15 AM, Michael Gummelt wrote:

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
, 2016 at 1:36 AM, Sean Owen wrote: Why would you run the shuffle service on 10K nodes but Spark executors on just 100 nodes? Wouldn't you also run that service just on the 100 nodes? What does plumbing it through

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
bing it through HDFS buy you in comparison? There's some additional overhead, and if anything you lose some control over locality, in a context where I presume HDFS itself is storing data on much more than the 100 Spark nodes. On Thu, Apr 28, 2016 at 1:34 AM, Michael Gumm

Re: HDFS as Shuffle Service

2016-04-27 Thread Michael Gummelt
If someone did do this in RawLocalFS, it'd be nice if the patch also allowed you to turn off CRC creation and checking. That's not only part of the overhead; it means that flush() doesn't, not until you reach the end of a CRC32 block ... so breaking what few durability guarantees POSIX offers.

HDFS as Shuffle Service

2016-04-26 Thread Michael Gummelt
Has there been any thought or work on this (or any other networked file system)? It would be valuable to support dynamic allocation without depending on the shuffle service.

Re: Accessing Secure Hadoop from Mesos cluster

2016-04-14 Thread Michael Gummelt
(my employer) gets working, we would definitely be interested in contributing it back and would very much want to avoid maintaining a fork of Spark. Tony

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-28 Thread Michael Gummelt
with critical fixes; newer features will require 2.x and so JDK 8. Regards Mridul

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Michael Gummelt
spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64 I will try to do some more debugging on this issue. View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-6-0-RC3-tp15660p15692.html