On 04/10/2022 at 00:54, Akshaya Annavajhala (AK) wrote:
Thanks for the important clarification - reading up on UCX, it makes sense to implement a health check abstraction. A couple of follow-up musings:

1. "Hosting" vs "data plane" communication protocols. Clearly this isn't something worth bundling in right now, but it seems to me that health checks fall under a class of server behavior that is specific to the hosting environment, as opposed to transporting data to the client efficiently. Going further, a simple non-gRPC HTTP health probe function would provide even broader support for the server across multiple hosting platforms (e.g., K8s without an experimental change) without changing any client-facing semantics. Of course, this adds burden for the team maintaining the implementation, but it is something to consider/clarify as a goal for a reference server implementation.
Transport-level health checks (HTTP being the transport for gRPC here) are often more efficient, but they don't check the health of the actual service. The HTTP server could be functional while the gRPC service itself is unavailable (disabled, misconfigured...).
This means that we might even want a standard Ping or Heartbeat method at the Flight level, or perhaps that is overkill?
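To illustrate the difference, here is a rough sketch using only the Python standard library. It is plain HTTP, not gRPC or Flight, and every name in it (the `/ping` path, the `service_available` flag, the helper functions) is hypothetical and made up for this example. The point is only that a transport-level probe can pass while a service-level one fails:

```python
import socket
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical service-level endpoint: only answers "pong" when the
        # (simulated) service behind the HTTP server is actually available.
        if self.path == "/ping" and self.server.service_available:
            self._reply(200, b"pong")
        else:
            self._reply(503, b"service unavailable")

    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def transport_check(host, port):
    """Transport-level probe: passes if the server accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False


def service_check(host, port):
    """Service-level probe: passes only if the ping endpoint answers 200."""
    try:
        url = f"http://{host}:{port}/ping"
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError (e.g. 503) are OSError subclasses
        return False


def run_probes(service_available):
    """Start a throwaway server and return (transport_ok, service_ok)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    server.service_available = service_available  # assumption: simulated state
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return transport_check(host, port), service_check(host, port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_probes(False)` simulates the case described above: the server still accepts connections, yet the ping reports the service as down. A Flight-level Ping or Heartbeat would play the role of `service_check` here.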
Regards

Antoine.