Re: [Proposal] Add foreign-server health checks infrastructure

Önder Kalacı Wed, 19 Oct 2022 08:18:40 -0700

Hi,

> > As far as I can think of, it should probably be a single background task
> > checking whether the server is down. If so, sending an invalidation message
> > to all the backends such that related backends could act on the
> > invalidation and throw an error. This is to cover the use-case you
> > described on [1].
>
> Indeed, your approach covers the use case I described, but I'm not sure
> whether it is really good.
> In your approach, one background worker process will manage all foreign
> servers.
> It may be OK if there are a few servers, but if there are hundreds of
> servers, the time interval between checks will be longer.
>


I expect users will typically have many more backends than foreign servers.
We could have a threshold for spinning up a new bg worker (e.g., one
additional bg worker for every 10 servers). Still, I think that'd be an
optimization that is probably not necessary for the majority of users.
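
To make the threshold idea concrete, here is a minimal sketch in C. The
constant and the function name are hypothetical (nothing like this exists in
PostgreSQL today); it only shows the rounding-up arithmetic for "one worker
per 10 servers".

```c
#include <assert.h>

/* Hypothetical threshold: one checker worker per this many foreign servers. */
#define SERVERS_PER_WORKER 10

/*
 * Number of background workers needed to cover n_servers.
 * Returns 0 when there is nothing to check; otherwise rounds up, so
 * 1..10 servers need one worker, 11..20 need two, and so on.
 */
static int
health_workers_needed(int n_servers)
{
    assert(n_servers >= 0);
    return (n_servers + SERVERS_PER_WORKER - 1) / SERVERS_PER_WORKER;
}
```

With this rule, a few hundred foreign servers would still need only a few
dozen workers, which is likely far fewer than the number of backends.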


> Currently, each FDW can decide per backend whether to do health checks
> or not.
> For example, we can skip health checks if the foreign server is not
> currently in use.
> A background worker cannot make that kind of decision.
> Based on the above, I do not agree that we should introduce a new
> background worker and make it do the health checks.
>
>

Again, my definition of "health check" is probably different. I'd expect
the health check to happen continuously, ideally keeping track of how many
consecutive times it succeeded and/or when it last failed/succeeded, etc.
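
As an illustration of that kind of bookkeeping, a minimal sketch in C
follows. The struct, its fields, and both functions are hypothetical names
chosen for this example, and the "looks down after N consecutive failures"
rule is an assumption, not something proposed on the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-server health state kept by a continuous checker. */
typedef struct ServerHealth
{
    int     consecutive_successes;
    int     consecutive_failures;
    time_t  last_success;       /* 0 means "never" */
    time_t  last_failure;       /* 0 means "never" */
} ServerHealth;

/* Record the outcome of one check performed at time "now". */
static void
server_health_record(ServerHealth *h, bool ok, time_t now)
{
    assert(h != NULL);
    if (ok)
    {
        h->consecutive_successes++;
        h->consecutive_failures = 0;
        h->last_success = now;
    }
    else
    {
        h->consecutive_failures++;
        h->consecutive_successes = 0;
        h->last_failure = now;
    }
}

/* Assumed policy: treat the server as down after enough failures in a row. */
static bool
server_looks_down(const ServerHealth *h, int failure_threshold)
{
    assert(h != NULL);
    return h->consecutive_failures >= failure_threshold;
}
```

Requiring several consecutive failures before reporting a server as down is
one way to avoid reacting to a single transient network hiccup.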

Preventing a transaction from failing with a bad error message (or from
holding some resources locally until it is committed) doesn't sound
essential to me. Is there a specific workload you have in mind, where
rolling back a transaction earlier when a remote server dies would be a
worthwhile optimization? What kind of workload would benefit from that?
Maybe there is one, but it is not clear to me, and I haven't seen it
discussed on the thread (sorry if I missed it).

I'm trying to understand whether we are trying to solve a problem that does
not really exist. I'm bringing this up because I often deal with
architectures where there is a local node and remote transactions on
different Postgres servers, and I have not encountered many (or any)
patterns that would benefit much from this change. In fact, I think that,
on the contrary, this might add significant overhead for OLTP-type systems
with high query throughput.


> Moreover, the methods to connect to foreign servers and check their health
> differ per FDW.
> In the case of mysql_fdw [1], we must call mysql_init() and
> mysql_real_connect().
> For file_fdw, we do not have to connect, but developers may want to
> calculate a checksum and compare it.
> Therefore, we must provide callback functions anyway.
>

I think providing callback functions is useful in any case. Each FDW (or,
more generally, each extension) should be able to provide its own "health
check" function.
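
A minimal sketch of such a per-FDW callback, in plain C, might look like the
following. Everything here is hypothetical: the registry, the callback
signature, and the two stand-in callbacks are illustrations only and are not
part of the FDW API or of any patch on this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns true if the server looks healthy. */
typedef bool (*health_check_fn) (const char *server_name);

typedef struct HealthCheckEntry
{
    const char     *fdw_name;
    health_check_fn check;
} HealthCheckEntry;

#define MAX_HEALTH_CHECKS 16
static HealthCheckEntry registry[MAX_HEALTH_CHECKS];
static int  n_registered = 0;

/* Each FDW (or extension) registers its own check; false if rejected. */
static bool
register_health_check(const char *fdw_name, health_check_fn check)
{
    assert(fdw_name != NULL);
    if (check == NULL || n_registered >= MAX_HEALTH_CHECKS)
        return false;
    registry[n_registered].fdw_name = fdw_name;
    registry[n_registered].check = check;
    n_registered++;
    return true;
}

/* Run the FDW's callback; an FDW without a callback counts as healthy. */
static bool
run_health_check(const char *fdw_name, const char *server_name)
{
    for (int i = 0; i < n_registered; i++)
    {
        if (strcmp(registry[i].fdw_name, fdw_name) == 0)
            return registry[i].check(server_name);
    }
    return true;
}

/* Stand-in callbacks; a real FDW would connect, verify a checksum, etc. */
static bool
always_healthy(const char *server_name)
{
    (void) server_name;
    return true;
}

static bool
always_down(const char *server_name)
{
    (void) server_name;
    return false;
}
```

The caller (a backend or a background worker) would then only need to know
the FDW name, while the FDW-specific logic stays inside the callback.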

Thanks,
Onder KALACI
