Qingqing Zhou <zhouqq.postg...@gmail.com> writes:
> I got another repro with the shutdown slowness (DEBUG5 with verbosed
> log are attached).
> It gives a finer picture of what's going on:
> 1. Avl ereport("autovacuum launcher shutting down");
> 2. At the end of errfinish(), it honors a pending SIGINT;
> 3. SIGINT handler longjmp to the start of avl error handling;
> 4. The error handling continues and rebuild_database_list() (that's
> why we see begin/commit pair);
> 5. In main loop, it WaitLatch(60 seconds);
> 6. Finally it ereport() again and proc_exit().

> This looks like a general pattern - don't think *nix is immune. Notice
> that this ereport() is special as there is way to go back. So we can
> insert HOLD_INTERRUPTS() just before it.

> Thoughts?

That seems like (a) a hack, and (b) not likely to solve the problem
completely, unless you leave interrupts held throughout proc_exit(),
which would create all sorts of opportunities for corner case bugs
during on_proc_exit hooks.

I think changing the outer "for(;;)" to "while (!got_SIGTERM)" would
be a much safer fix.

It looks like there's a related risk associated with this bit:

        /* in emergency mode, just start a worker and go away */
        if (!AutoVacuumingActive())
        {
            do_start_worker();
            proc_exit(0);       /* done */
        }

If we get SIGHUP and see that autovacuum has been turned off, we exit
the main loop, but we don't set got_SIGTERM.  So if we then get a similar
error at the shutdown report, we'd not merely waste some time, but
actually incorrectly launch a child.

            regards, tom lane
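
For illustration only, here is a rough standalone sketch of the control flow discussed above. It is not PostgreSQL code: wait_latch(), report_shutdown() and run_launcher() are hypothetical stand-ins, the pending interrupt is simulated with a plain flag rather than a real signal, and each "wait" is just a counter bump instead of a 60-second sleep. It only tries to show why re-entering an unguarded "for(;;)" after the longjmp costs one extra timeout, while a loop guarded by got_SIGTERM falls straight through to the shutdown report.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the launcher's state; not PostgreSQL code. */
static bool    got_SIGTERM;        /* set once the shutdown request is seen */
static bool    interrupt_pending;  /* simulated pending SIGINT */
static int     wasted_waits;       /* full timeouts slept after SIGTERM was seen */
static jmp_buf error_context;      /* target of the simulated error longjmp */

/* Stand-in for WaitLatch(): the first call is ended by "SIGTERM arriving";
 * any later call has nothing to wake it and runs to its full timeout. */
static void wait_latch(void)
{
    if (got_SIGTERM)
        wasted_waits++;
    else
        got_SIGTERM = true;
}

/* Stand-in for the shutdown ereport(): at its end a pending interrupt is
 * honored, which jumps back to the start of error handling. */
static void report_shutdown(void)
{
    assert(got_SIGTERM);           /* only reached after a shutdown request */
    if (interrupt_pending)
    {
        interrupt_pending = false;
        longjmp(error_context, 1);
    }
}

/* Runs the simulated launcher and returns the number of wasted timeouts.
 * guard_loop = false models "for(;;)"; true models "while (!got_SIGTERM)". */
static int run_launcher(bool guard_loop)
{
    got_SIGTERM = false;
    interrupt_pending = true;
    wasted_waits = 0;

    if (setjmp(error_context) != 0)
    {
        /* error recovery would run here, then fall into the main loop */
    }

    for (;;)
    {
        if (guard_loop && got_SIGTERM)
            break;                 /* the proposed loop condition */
        wait_latch();
        if (got_SIGTERM)
            break;                 /* normal shutdown check after waking */
    }

    report_shutdown();
    return wasted_waits;
}
```

Under these assumptions, run_launcher(false) reports one wasted wait and run_launcher(true) reports none. The sketch does not model the emergency-mode exit path, where got_SIGTERM is never set.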