Good morning,

My company is in the process of adopting Apache's open source stack, and I've been tasked with building Aurora jobs to deploy a few popular open source technologies on Mesos. Aurora is an elegant scheduler and, in our estimation, will meet our company's needs. However, we are struggling to satisfy the configuration requirements of some of the tools we wish to deploy.
Scenario: when a distributed service is deployed, we need to programmatically determine all hosts selected for deployment, along with their reserved ports, in order to configure the service properly. We've solved this in a few not-so-elegant ways:

1. We wrote a Process that publishes host/port information to a distributed file system, blocks until {{instances}} entries have been written, reads the information back, and finally configures the service. This works, but in my opinion it is not an elegant solution.

2. Next, we designed a REST API for service registration (keyed on the Aurora job key) and published this information to our ZooKeeper ensemble. This removes the dependency on a pre-configured distributed file system, but some overhead remains: multiple API instances are needed to avoid introducing a single point of failure, and Aurora jobs now require some initial configuration to communicate with the service. Talking to this API is also a little non-trivial, because the REST service doesn't block until all instances have registered; blocking would be problematic given the request/response nature of HTTP.

At this point, we realized that an even better solution might be to talk directly to the Aurora scheduler to get this data. Seeing as Aurora configurations are just Python, we could probably implement it in a reusable fashion…but I'm curious whether anyone has already gone down this path?

Thank you,
James O
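P.S. For concreteness, here is roughly what the blocking logic in approach 1 looks like, as a simplified, self-contained sketch. The helper names (`register`, `wait_for_peers`) and the file layout are illustrative only; in the real job, the instance id and port would come from Aurora template variables such as {{mesos.instance}} and {{thermos.ports[...]}}, and the directory would live on the distributed file system.

```python
# Simplified sketch of approach 1: each instance publishes its host:port
# to a shared directory, then blocks until all expected peers have done
# the same. Names and layout here are illustrative, not our actual code.
import os
import time


def register(shared_dir, instance_id, host, port):
    """Publish this instance's host:port under a well-known file name."""
    os.makedirs(shared_dir, exist_ok=True)
    path = os.path.join(shared_dir, "instance-%d" % instance_id)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("%s:%d" % (host, port))
    # Atomic rename so readers never observe a partially written entry.
    os.rename(tmp, path)


def wait_for_peers(shared_dir, expected, poll_interval=1.0, timeout=300):
    """Block until `expected` instances have registered; return the roster."""
    deadline = time.time() + timeout
    entries = []
    while time.time() < deadline:
        entries = [f for f in os.listdir(shared_dir)
                   if f.startswith("instance-") and not f.endswith(".tmp")]
        if len(entries) >= expected:
            roster = {}
            for name in sorted(entries):
                with open(os.path.join(shared_dir, name)) as f:
                    host, _, port = f.read().strip().partition(":")
                    roster[name] = (host, int(port))
            return roster
        time.sleep(poll_interval)
    raise RuntimeError("only %d of %d instances registered before timeout"
                       % (len(entries), expected))
```

The barrier is just polling the shared directory, which is part of why we find the approach inelegant: it couples every job to the file system and to a timeout we have to tune.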