On 2021/03/26 9:25, Masahiro Ikeda wrote:

On 2021/03/25 21:23, Fujii Masao wrote:
On 2021/03/25 9:58, Andres Freund wrote:
Outside of very
narrow circumstances a normal exit shouldn't use _exit(1). Neither 1 no
_exit() (as opposed to exit()) seem appropriate. This seems like a bad
idea.

So you're thinking that the stats collector should do proc_exit(0)
or something even when it receives immediate shutdown request?

One idea to avoid using _exit(1) is to change the SIGQUIT handler
so that it just sets the flag. Then if the stats collector detects that
the flag is set in the main loop, it gets out of the loop,
skips writing the permanent stats file and then exits with exit(0).
That is, normal and immediate shutdown requests are treated
almost the same way in the stats collector. Only the difference of
them is whether it saves the stats to the file or not. Thought?

Although I don't oppose the idea to change from _exit(1) to proc_exit(0), the
reason why I used _exit(1) is that I thought the behavior to skip writing the
stat counters is not normal exit. Actually, other background processes use
_exit(2) instead of _exit(0) or proc_exit(0) in immediate shutdown although
the status code is different because they touch shared memory.

True.




Also, won't this lead to postmaster now starting to complain about
pgstat having crashed in an immediate shutdown? Right now only exit
status 0 is expected:

         if (pid == PgStatPID)
         {
             PgStatPID = 0;
             if (!EXIT_STATUS_0(exitstatus))
                 LogChildExit(LOG, _("statistics collector process"),
                              pid, exitstatus);
             if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
                 PgStatPID = pgstat_start();
             continue;
         }

No. In immediate shutdown case, pmdie() sets "Shutdown" variable to
ImmdiateShutdown and HandleChildCrash() doesn't complain that in that case
because of the following.

     /*
      * We only log messages and send signals if this is the first process
      * crash and we're not doing an immediate shutdown; otherwise, we're only
      * here to update postmaster's idea of live processes.  If we have already
      * signaled children, nonzero exit status is to be expected, so don't
      * clutter log.
      */
     take_action = !FatalError && Shutdown != ImmediateShutdown;

IIUC, in immediate shutdown case (and other cases too?), HandleChildCrash() is
never invoked due to the exit of pgstat. My understanding of Andres-san's
comment is that is it ok to show like the following log message?

```
LOG:  statistics collector process (PID 64991) exited with exit code 1
```

Surely, other processes don't output the log like it. So, I agree to suppress
the log message.

Yes. I was mistakenly thinking that HandleChildCrash() was called
even when the stats collector exits with non-zero code, like other processes.
But that's not true.

How should we do this? HandleChildCrash() calls LogChildExit()
when FatalError = false and Shutdown != ImmediateShutdown.
We should use the same conditions for the stats collector as follows?

                if (pid == PgStatPID)
                {
                        PgStatPID = 0;
-                       if (!EXIT_STATUS_0(exitstatus))
+                       if (!EXIT_STATUS_0(exitstatus) && !FatalError &&
+                               Shutdown != ImmediateShutdown)
                                LogChildExit(LOG, _("statistics collector 
process"),
                                                         pid, exitstatus);

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply via email to