At Mon, 2 Sep 2019 15:51:34 +0900, Michael Paquier <mich...@paquier.xyz> wrote 
in <20190902065134.ge1...@paquier.xyz>
> On Mon, Sep 02, 2019 at 12:27:09AM +0000, Tsunakawa, Takayuki wrote:
> > From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> >> After investigation, the mechanism that's causing that is that the
> >> src/test/recovery/t/010_logical_decoding_timelines.pl test shuts
> >> down its replica server with a mode-immediate stop, which causes
> >> that postmaster to shut down all its children with SIGQUIT, and
> >> in particular that signal propagates to a "cp" command that the
> >> archiver process is executing.  The "cp" is unsurprisingly running
> >> with default SIGQUIT handling, which per the signal man page
> >> includes dumping core.
> > 
> > We've experienced this (core dump in the data directory by an
> > archive command) years ago.  Related to this, the example of using
> > cp in the PostgreSQL manual is misleading, because cp doesn't
> > reliably persist the WAL archive file.
> 
> The previous talks about having pg_copy are still where they were a
> couple of years ago as we did not agree on which semantics it should
> have.  If we could move forward with that and update the documentation
> from its insanity that would be great and...  The signal handling is
> something else we could customize in a more favorable way with the
> archiver.  Anyway, switching to something other than SIGQUIT to stop
> the archiver will not prevent other tools from generating core
> dumps with that other signal.

Since we allow operators to use an arbitrary command as
archive_command, providing a replacement with non-standard signal
handling for one specific command doesn't seem like a general solution
to me. Couldn't we have pg_system() (a tentative name), which
intercepts SIGQUIT and then sends SIGINT to its children? It might
need to resend SIGQUIT after some interval, though.
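
To make the idea a bit more concrete, here is a rough sketch of such a
wrapper. It is only an illustration under my own assumptions, not a
proposed patch: the name pg_system_sketch, the grace_secs parameter,
the 10 ms polling interval, and the choice to put the command in its
own process group (so that a SIGQUIT sent to our group does not reach
it directly) are all made up for this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Set by the SIGQUIT handler, polled by the wait loop below. */
static volatile sig_atomic_t got_sigquit = 0;

static void
handle_sigquit(int signo)
{
    (void) signo;
    got_sigquit = 1;
}

/*
 * Hypothetical pg_system(): run "command" through /bin/sh like system(3).
 * If SIGQUIT arrives while the command runs, send SIGINT to the command's
 * process group, and escalate to SIGQUIT after grace_secs seconds.
 * Returns the wait status of the shell, or -1 on failure.
 */
int
pg_system_sketch(const char *command, unsigned int grace_secs)
{
    struct sigaction sa, old_sa;
    struct timespec poll_interval = {0, 10 * 1000 * 1000};  /* 10 ms */
    time_t      deadline = 0;
    int         sent_int = 0;
    int         sent_quit = 0;
    int         status = -1;
    pid_t       pid;

    got_sigquit = 0;
    sa.sa_handler = handle_sigquit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGQUIT, &sa, &old_sa) != 0)
        return -1;

    pid = fork();
    if (pid == 0)
    {
        /* Child: own process group, default signal dispositions. */
        setpgid(0, 0);
        signal(SIGINT, SIG_DFL);
        signal(SIGQUIT, SIG_DFL);
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);
    }
    if (pid > 0)
    {
        /* Also set the group from the parent to avoid a race with exec. */
        setpgid(pid, pid);

        for (;;)
        {
            pid_t       r = waitpid(pid, &status, WNOHANG);

            if (r == pid)
                break;
            if (r < 0 && errno != EINTR)
            {
                status = -1;
                break;
            }
            if (got_sigquit && !sent_int)
            {
                /* First ask politely: SIGINT does not dump core. */
                kill(-pid, SIGINT);
                deadline = time(NULL) + grace_secs;
                sent_int = 1;
            }
            else if (sent_int && !sent_quit && time(NULL) >= deadline)
            {
                /* The command ignored SIGINT; fall back to SIGQUIT. */
                kill(-pid, SIGQUIT);
                sent_quit = 1;
            }
            nanosleep(&poll_interval, NULL);
        }
    }

    sigaction(SIGQUIT, &old_sa, NULL);
    return status;
}
```

A caller would use it like system(3), e.g. pg_system_sketch(cmd, 5),
and inspect the result with WIFEXITED()/WIFSIGNALED(). Note that the
sketch only waits for the shell itself, so grandchildren that outlive
the shell are not tracked.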

> > We enable the core dump in production to help the investigation just in 
> > case.
> 
> So do I in some of the stuff I work on.
> 
> > some_command also catches SIGQUIT and just exits.  It copies and syncs the file.
> > 
> > I proposed something in this line as below, but I couldn't respond to 
> > Peter's review comments due to other tasks.  Does anyone think it's worth 
> > resuming this?
> > 
> > https://www.postgresql.org/message-id/7E37040CF3804EA5B018D7A022822984@maumau
> 
> And I was looking for this thread a couple of lines ago :)
> Thanks.
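
For what it's worth, the behaviour described for some_command above
(copy the file, sync it, and just exit on SIGQUIT) could look roughly
like the following. This is only an illustrative sketch and not the
code from the linked thread; copy_and_sync, quit_without_core and the
dst_dir parameter are names invented for this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Exit quietly on SIGQUIT instead of taking the default action (core dump). */
static void
quit_without_core(int signo)
{
    (void) signo;
    _exit(1);
}

/*
 * Copy src to dst and make the result durable.  dst must not exist yet.
 * dst_dir is the directory containing dst; it is fsync'ed as well so that
 * the new directory entry survives a crash.
 * Returns 0 on success, -1 on failure.
 */
int
copy_and_sync(const char *src, const char *dst, const char *dst_dir)
{
    char        buf[65536];
    ssize_t     nread = 0;
    int         in_fd, out_fd, dir_fd;
    int         ok = 1;

    signal(SIGQUIT, quit_without_core);

    in_fd = open(src, O_RDONLY);
    if (in_fd < 0)
        return -1;
    /* O_EXCL: refuse to overwrite an already archived file. */
    out_fd = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out_fd < 0)
    {
        close(in_fd);
        return -1;
    }

    while (ok && (nread = read(in_fd, buf, sizeof(buf))) > 0)
    {
        ssize_t     off = 0;

        /* write() may be partial, so loop until the chunk is written. */
        while (off < nread)
        {
            ssize_t     n = write(out_fd, buf + off, (size_t) (nread - off));

            if (n < 0)
            {
                ok = 0;
                break;
            }
            off += n;
        }
    }
    if (nread < 0)
        ok = 0;

    /* Flush file contents to stable storage before reporting success. */
    if (ok && fsync(out_fd) != 0)
        ok = 0;
    if (close(out_fd) != 0)
        ok = 0;
    close(in_fd);

    /* Flush the directory entry too. */
    dir_fd = open(dst_dir, O_RDONLY);
    if (dir_fd < 0)
        ok = 0;
    else
    {
        if (fsync(dir_fd) != 0)
            ok = 0;
        close(dir_fd);
    }

    if (!ok)
        unlink(dst);            /* do not leave a partial file behind */
    return ok ? 0 : -1;
}
```

A real tool would probably write to a temporary name and rename it into
place, so that a SIGQUIT in the middle of the copy cannot leave a
partial file under the final name.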

# Is there any way to view a whole thread in the archive?
# I'm kind of reluctant to wander among messages like a rat in
# a maze :p

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

