On Thu, Aug 18, 2022 at 9:01 AM Michael Paquier <mich...@paquier.xyz> wrote: > > PANIC is not something you'd care about as the system would go down as > and shared memory would be reset (right?) even if restart_on_crash is > enabled. Perhaps it would help here to use something like a macro to > catch and save the error, in a style similar to what's in hba.c for > example, which is the closest example I can think of, even if on ERROR > we don't really care about the error string anyway as there is nothing > to report back to the SQL views used for the HBA/ident files. > > FATAL may prove to be tricky though, because I'd expect the error to > be saved in shared memory in this case. This is particularly critical > as this takes the WAL receiver process down, actually.
Hm, we can use error callbacks or pg try/catch blocks to save the error message into walreceiver shared memory. > Anyway, outside the potential scope of the proposal, there are more > things that I find strange with the code: > - Why isn't the string reset when the WAL receiver is starting up? > That surely is not OK to keep a past state not referring to what > actually happens with a receiver currently running. I agree that it's not a good way to show some past failure state when things are fine currently. Would naming the column name as last_connectivity_error or something better and describing it in the docs clearly help here? Otherwise, we can have another simple function that just returns the last connection failure of walreceiver and if required PID. > - pg_stat_wal_receiver (system view) reports no rows if pid is NULL, > which would be the state stored in shared memory after a connection. > This means that one would never be able to see last_conn_error except > when calling directly the SQL function pg_stat_get_wal_receiver(). > > One could say that we should report a row for this view all the time, > but this creates a compatibility breakage: existing application > assuming something like (one row <=> WAL receiver running) could > break. -1. We can think of having a separate infrastructure for reporting all backend or process specific errors similar to pg_stat_activity and pg_stat_get_activity, but that needs some shared memory and all of that - IMO, it's an overkill. I'm fine to withdraw this thread, if none of the above thoughts is sensible enough to pursue further. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com