Bill, I'm hoping we could also pull information like host, port, process, state, etc. from the WAL replica logs stored in the scheduler. No?
On Wed, Jul 9, 2014 at 1:35 PM, Bill Farner <wfar...@apache.org> wrote:

> FWIW at Twitter we do something that sounds like a mix between (1) and (2).
> We consume the (otherwise unused) Announcer configuration parameter [1] to
> Job, and extend the executor to publish all allocated {{thermos.port[x]}}s
> to ZooKeeper. This is something we would love to open source, so please
> nudge us to do so if you would like this feature!
>
> [1]
> https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/config/schema/base.py#L69
>
> -=Bill
>
>
> On Wed, Jul 9, 2014 at 10:24 AM, Oliver, James <james.oli...@pegs.com>
> wrote:
>
>> Good morning,
>>
>> My company is in the process of adopting Apache's open source stack, and
>> I've been tasked with building Aurora jobs to deploy a few popular open
>> source technologies on Mesos. Aurora is an elegant scheduler and in our
>> estimation will meet our company's needs. However, we are struggling to meet
>> some of the configuration requirements of some of the tools we wish to
>> deploy.
>>
>> Scenario: When a distributed service is deployed, we need to
>> programmatically determine all hosts selected for deployment and their
>> reserved ports in order to properly configure the service. We've solved
>> this in a few not-so-elegant ways:
>>
>>    1. We wrote a Process to publish host/port information to a distributed
>>    file system, block until {{instances}} were written, read the information
>>    and finally configure the service. This works, but IMO it is not an elegant
>>    solution.
>>    2. Next, we designed a REST API for service registration (based off of
>>    the Aurora job key) and published this information to our ZooKeeper
>>    ensemble. This solution removes the dependency on a pre-configured
>>    distributed file system. However, some overhead is still required (multiple
>>    instances are necessary so as to not introduce a point of failure). Aurora
>>    jobs now require some initial configuration to be able to communicate with
>>    the service. Communication to this API is a little non-trivial because the
>>    REST service doesn't block until all information is communicated – this
>>    would be problematic due to the nature of the HTTP protocol.
>>
>> At this point, we realized that an even better solution might be to
>> communicate directly to Aurora scheduler to get this data. Seeing as Aurora
>> scripts are just Python, we could probably implement it in a reusable
>> fashion…but I'm curious if anyone has already gone down this path?
>>
>> Thank you,
>> James O
>>

--
Regards,
Bhuvan Arumugam
www.livecipher.com
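
For reference, approach (1) from James's message (publish host/port, block until all {{instances}} have written, then read everything back) might look roughly like the sketch below. This is only an illustration, not James's actual Process and not an Aurora or Thermos API: the function names `publish` and `wait_for_peers`, the `instance-N.json` file layout, and the use of a plain directory as a stand-in for the distributed file system are all assumptions made for the example.

```python
import json
import os
import tempfile
import time


def publish(registry_dir, instance_id, host, port):
    """Write this instance's host/port record into the shared directory.

    registry_dir is a hypothetical stand-in for a path on the distributed
    file system that every instance of the job can see.
    """
    os.makedirs(registry_dir, exist_ok=True)
    final_path = os.path.join(registry_dir, "instance-%d.json" % instance_id)
    # Write to a temp file and rename, so readers never see a partial record.
    fd, tmp_path = tempfile.mkstemp(dir=registry_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"instance": instance_id, "host": host, "port": port}, f)
    os.replace(tmp_path, final_path)


def wait_for_peers(registry_dir, instances, timeout=60.0, poll_interval=0.5):
    """Block until `instances` records exist, then return them sorted by id."""
    os.makedirs(registry_dir, exist_ok=True)
    deadline = time.monotonic() + timeout
    while True:
        names = [
            n for n in os.listdir(registry_dir)
            if n.startswith("instance-") and n.endswith(".json")
        ]
        if len(names) >= instances:
            records = []
            for name in names:
                with open(os.path.join(registry_dir, name)) as f:
                    records.append(json.load(f))
            return sorted(records, key=lambda r: r["instance"])
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "only %d of %d instances registered" % (len(names), instances)
            )
        time.sleep(poll_interval)
```

Each instance would call `publish(...)` with its own host and allocated port, then `wait_for_peers(...)` with the job's instance count before writing the service's configuration. The timeout is there so that a task whose peers never come up fails instead of blocking forever.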