Hi everyone,

We implement the /health endpoint in our services but do not implement the
unauthenticated lifecycle methods /quitquitquit and /abortabortabort.

As a consequence, stopping a service always incurs a 10-second wait [1]. I
would like to get rid of this unnecessary delay and can think of two solutions:

a) Only perform the escalation wait when the http_signaler reports that the
request was actually delivered to the service. This is a rather simple and
localized fix.
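
To make (a) concrete, here is a rough sketch of the idea. It is not the actual
executor code: `signal` is a hypothetical stand-in for the http_signaler that
returns True when the request reached the service, and `graceful_stop` and
`escalation_wait_secs` are names I made up for illustration.

```python
import time


def graceful_stop(signal, escalation_wait_secs, wait=time.sleep):
    """Send the lifecycle requests, waiting only after a delivered one.

    `signal` is a hypothetical callable: it takes an endpoint path and
    returns True if the HTTP request was delivered, False otherwise.
    Returns the total number of seconds spent waiting.
    """
    waited = 0.0
    for endpoint in ('/quitquitquit', '/abortabortabort'):
        if signal(endpoint):
            # The service received the request, so give it time to react.
            wait(escalation_wait_secs)
            waited += escalation_wait_secs
        # Otherwise skip the wait: nothing is listening on this endpoint.
    return waited
```

A service that implements neither endpoint would then be stopped without any
escalation wait, while services that do implement them keep today's behaviour.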

b) Use a separate port for lifecycle events. This would require an addition to
the task configuration and proper plumbing throughout the rest of the system.
Backward compatibility could be achieved by using 'health' as the default
lifecycle management port.
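
For (b), the backward-compatible default could look roughly like the sketch
below. Everything here is an assumption for illustration: `portmap` stands for
a mapping of port names to allocated port numbers, and
`resolve_lifecycle_port` and `lifecycle_port_name` are hypothetical names, not
existing configuration fields.

```python
DEFAULT_LIFECYCLE_PORT_NAME = 'health'  # default suggested in this proposal


def resolve_lifecycle_port(portmap, lifecycle_port_name=None):
    """Return the port number to use for lifecycle requests.

    Falls back to the 'health' port when no lifecycle port name is
    configured. Returns None if the chosen name has no allocated port.
    """
    name = lifecycle_port_name or DEFAULT_LIFECYCLE_PORT_NAME
    return portmap.get(name)
```

Existing tasks that configure nothing would keep sending lifecycle requests to
the 'health' port, so their behaviour would not change.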

Any thoughts? I would be happy with the simple solution, but in the end it's 
your call :-) 

Best Regards,
Stephan

[1] 
https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/executor/thermos_task_runner.py#L123
 
