Andy, doesn't Marathon handle fault tolerance among its apps? I.e., if you say that N instances of an app should be running, and one shuts off, then it spins up another one, no?
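To make the "keep N instances running" idea concrete, here is a toy sketch in Python. It is only a model of the behaviour described above, not Marathon's actual API or implementation; the class `ToyScheduler`, its methods, and the `jobserver-N` task names are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyScheduler:
    """Toy 'keep N instances alive' supervisor (hypothetical, not Marathon's API)."""
    desired: int
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self) -> list:
        """Launch replacements until the running count matches `desired`."""
        launched = []
        while len(self.running) < self.desired:
            task_id = f"jobserver-{next(self._ids)}"
            self.running.add(task_id)
            launched.append(task_id)
        return launched

    def task_lost(self, task_id: str) -> None:
        """Record that an instance died (e.g. its host went away)."""
        self.running.discard(task_id)


sched = ToyScheduler(desired=3)
first = sched.reconcile()          # initial launch of three instances
sched.task_lost(first[0])          # one instance shuts off
replacements = sched.reconcile()   # supervisor notices and starts one more
```

The point is only that the supervisor compares the desired count with what is actually running and fills the gap; it knows nothing about what the app does internally, which is where the overlap with app-level coordination comes in.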
The tricky thing was that I was planning to use Akka Cluster to coordinate, but Mesos itself can be used to coordinate as well, which is an overlap... but I didn't want to make job server HA reliant only on Mesos. Anyway, we can discuss offline if needed.

On Thu, Mar 20, 2014 at 1:35 AM, andy petrella <andy.petre...@gmail.com> wrote:
> Heya,
> That's cool you've already hacked something for this in the scripts!
>
> I have a related question, how would it work actually. I mean, to have this
> Job Server fault tolerant using Marathon, I would guess that it will need
> to be itself a Mesos framework, and able to publish its resource needs.
> And also, for that, the Job Server has to be aware of the resources needed
> by the Spark drivers that it will run, which is not as easy to guess,
> unless it is provided by the job itself?
>
> I didn't check the Job Server deep enough so it might be already the case
> (or I'm expressing something completely dumb ^^).
>
> For sure, we'll try to share it when we reach this point to deploy using
> marathon (should be planned for April)
>
> greetz and again, Nice Work Evan!
>
> Ndi
>
> On Wed, Mar 19, 2014 at 7:27 AM, Evan Chan <e...@ooyala.com> wrote:
>
>> Andy,
>>
>> Yeah, we've thought of deploying this on Marathon ourselves, but we're
>> not sure how much Mesos we're going to use yet. (Indeed if you look
>> at bin/server_start.sh, I think I set up the PORT environment var
>> specifically for Marathon.) This is also why we have deploy scripts
>> which package into .tar.gz, again for Mesos deployment.
>>
>> If you do try this, please let us know. :)
>>
>> -Evan
>>
>>
>> On Tue, Mar 18, 2014 at 3:57 PM, andy petrella <andy.petre...@gmail.com> wrote:
>> > tadaaaa! That's awesome.
>> >
>> > A quick question, does someone have insights regarding having such
>> > JobServers deployed using Marathon on Mesos?
>> >
>> > I'm thinking about an arch where Marathon would deploy and keep the Job
>> > Servers running along with part of the whole set of apps deployed on it
>> > regarding the resources needed (à la Jenkins).
>> >
>> > Any idea is welcome.
>> >
>> > Back to the news, Evan + Ooyala team: Great Job again.
>> >
>> > andy
>> >
>> > On Tue, Mar 18, 2014 at 11:39 PM, Henry Saputra <henry.sapu...@gmail.com> wrote:
>> >
>> >> W00t!
>> >>
>> >> Thanks for releasing this, Evan.
>> >>
>> >> - Henry
>> >>
>> >> On Tue, Mar 18, 2014 at 1:51 PM, Evan Chan <e...@ooyala.com> wrote:
>> >> > Dear Spark developers,
>> >> >
>> >> > Ooyala is happy to announce that we have pushed our official, Spark
>> >> > 0.9.0 / Scala 2.10-compatible, job server as a github repo:
>> >> >
>> >> > https://github.com/ooyala/spark-jobserver
>> >> >
>> >> > Complete with unit tests, deploy scripts, and examples.
>> >> >
>> >> > The original PR (#222) on incubator-spark is now closed.
>> >> >
>> >> > Please have a look; pull requests are very welcome.
>> >> > --
>> >> > --
>> >> > Evan Chan
>> >> > Staff Engineer
>> >> > e...@ooyala.com |
>> >>
>> >
>>
>>
>> --
>> --
>> Evan Chan
>> Staff Engineer
>> e...@ooyala.com |
>

--
--
Evan Chan
Staff Engineer
e...@ooyala.com |