Re: Sharded Service Coordination

Bill Farner Wed, 09 Jul 2014 13:36:43 -0700

FWIW at Twitter we do something that sounds like a mix between (1) and (2).
 We consume the (otherwise unused) Announcer configuration parameter [1] to
Job, and extend the executor to publish all allocated {{thermos.port[x]}}s
to ZooKeeper.  This is something we would love to open source, so please
nudge us to do so if you would like this feature!
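
To make the idea concrete, here is a minimal sketch of what "publishing the
allocated ports" could look like. This is not the actual Twitter executor
code: the znode path layout, the JSON field names, and the endpoint_record
helper are all assumptions made for illustration. It only builds the path and
payload; the ZooKeeper write itself (most likely an ephemeral node created
through a client library) is left out.

```python
import json


def endpoint_record(role, env, job, instance, host, ports):
    """Build a (path, payload) pair describing one task instance.

    Hypothetical layout: one node per instance under the job key.
    `ports` maps a port name (as in {{thermos.ports[http]}}) to the
    port number allocated for that instance.
    """
    path = "/aurora/{}/{}/{}/member_{:010d}".format(role, env, job, instance)
    payload = json.dumps(
        {"host": host, "instance": instance, "ports": ports},
        sort_keys=True,
    )
    return path, payload


# Example: instance 3 of www-data/prod/hello with one named port.
path, payload = endpoint_record(
    "www-data", "prod", "hello", 3, "10.0.0.7", {"http": 31337}
)
print(path)
print(payload)
```

A consumer would list the children of the job's path and parse each payload
to discover every instance's host and ports.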


[1]
https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/config/schema/base.py#L69

-=Bill


On Wed, Jul 9, 2014 at 10:24 AM, Oliver, James <james.oli...@pegs.com>
wrote:

> Good morning,
>
> My company is in the process of adopting Apache's open source stack, and
> I've been tasked with building Aurora jobs to deploy a few popular open
> source technologies on Mesos. Aurora is an elegant scheduler and in our
> estimation will meet our company's needs. However, we are struggling to meet
> some of the configuration requirements of some of the tools we wish to
> deploy.
>
> Scenario: When a distributed service is deployed, we need to
> programmatically determine all hosts selected for deployment and their
> reserved ports in order to properly configure the service. We've solved
> this in a few not-so-elegant ways:
>
>  1.  We wrote a Process to publish host/port information to a distributed
> file system, block until {{instances}} were written, read the information
> and finally configure the service. This works, but IMO it is not an elegant
> solution.
>  2.  Next, we designed a REST API for service registration (based off of
> the Aurora job key) and published this information to our ZooKeeper
> ensemble. This solution removes the dependency on a pre-configured
> distributed file system. However, some overhead is still required (multiple
> instances are necessary so as to not introduce a point of failure). Aurora
> jobs now require some initial configuration to be able to communicate with
> the service. Communication to this API is a little non-trivial because the
> REST service doesn't block until all information is communicated – this
> would be problematic due to the nature of the HTTP protocol.
>
> At this point, we realized that an even better solution might be to
> communicate directly with the Aurora scheduler to get this data. Seeing as Aurora
> scripts are just Python, we could probably implement it in a reusable
> fashion…but I'm curious if anyone has already gone down this path?
>
> Thank you,
> James O
>
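
For reference, approach (1) in the quoted message (publish host/port, block
until all instances have written, then read everything back) can be sketched
roughly as below. This is a sketch under assumptions, not James's actual
Process: a local temporary directory stands in for the distributed file
system, and the file naming, the publish and wait_for_all helpers, and the
timeout values are all hypothetical. It also assumes the file system gives
atomic renames, which may not hold on every distributed file system.

```python
import json
import os
import tempfile
import time


def publish(shared_dir, instance, host, port):
    """Write this instance's host/port as one JSON file."""
    final = os.path.join(shared_dir, "instance-%d.json" % instance)
    tmp = final + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"instance": instance, "host": host, "port": port}, f)
    # Rename into place so readers never observe a half-written file.
    os.replace(tmp, final)


def wait_for_all(shared_dir, instances, timeout=60.0, poll=0.5):
    """Block until `instances` records exist, then return them all."""
    deadline = time.monotonic() + timeout
    while True:
        names = [n for n in os.listdir(shared_dir) if n.endswith(".json")]
        if len(names) >= instances:
            members = []
            for name in names:
                with open(os.path.join(shared_dir, name)) as f:
                    members.append(json.load(f))
            return sorted(members, key=lambda m: m["instance"])
        if time.monotonic() >= deadline:
            raise TimeoutError("only %d of %d instances registered"
                               % (len(names), instances))
        time.sleep(poll)


# Usage: two instances register, then either one can read the full set.
shared = tempfile.mkdtemp()
publish(shared, 0, "10.0.0.7", 31337)
publish(shared, 1, "10.0.0.8", 31402)
members = wait_for_all(shared, instances=2, timeout=5.0)
print(members)
```

The timeout matters in practice: without it, a single instance that never
starts would leave every other instance blocked indefinitely.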
