Re: "pg_ctl: the PID file ... is empty" at end of make check

Tom Lane Tue, 27 Nov 2018 20:29:25 -0800

Thomas Munro <[email protected]> writes:
> Today I saw a one-off case of $SUBJECT, on macOS.  I can't reproduce
> it, but I noticed exactly the same thing on longfin the other day:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25%2005%3A39%3A04


I trawled the buildfarm logs and discovered a second instance of exactly
the same thing:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-19%2018%3A37%3A00

There have not been any other occurrences in the past 3 months, which is
as far back as I went.  (lorikeet has half a dozen occurrences of "could
not stop postmaster", which is what I was grepping for, but they all
are associated with that machine's intermittent postmaster crashes.)

So that lets out the flaky-hardware theory: that occurrence is before
longfin's hardware transplant.

Also, I don't think I believe the OS-bug idea either, given that you
saw it on 10.14.0.  longfin's been running 10.14.something since
2018-09-26, and has accumulated circa 200 runs since then just on HEAD,
never mind the back branches.  It'd be pretty unlikely to see it only
in the past week, and only on HEAD, if it were an OS bug introduced two
months ago.

So my theory is we broke something in HEAD a couple weeks ago.  But what?

The fsync changes you made are suspiciously close to this issue (ie one
could explain it as written data not getting out), and were committed in
the right time frame, but that change didn't affect writes to
postmaster.pid did it?

                        regards, tom lane

Re: "pg_ctl: the PID file ... is empty" at end of make check

Reply via email to