Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-05 Thread Tom Lane
Robert Haas writes: > I have to admit that I care less about the specific issue here than > about the general issue of being open to hearing what the user needs > actually are. I honestly have no idea whether it's sensible to want to > run postgres as init. If people who know about container stuff

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-05 Thread Robert Haas
On Tue, May 4, 2021 at 4:35 PM Tom Lane wrote: > I'm still thinking that we're best off refusing to do > that and making people install one of these shims that's meant > for the job. I have to admit that I care less about the specific issue here than about the general issue of being open to heari

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Justin Pryzby
On Tue, May 04, 2021 at 01:35:50PM -0400, Greg Stark wrote: > Fwiw, I have a suspicion that the right check for being init is > whether `pid == ppid`. pryzbyj@pryzbyj:~$ ps -wwf 1 UID PID PPID C STIME TTY STAT TIME CMD root 1 0 0 2020 ? Ss 10:28 /sbin/init

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Tom Lane
Andrew Dunstan writes: > On 5/3/21 5:13 PM, Dagfinn Ilmari Mannsåker wrote: >> Given that a number of minimal `init`s already exist specifically for >> the case of running a single application in a container, I don't think >> Postgres should reinvent that wheel. A quick eyeball of the output of

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Andrew Dunstan
On 5/3/21 5:13 PM, Dagfinn Ilmari Mannsåker wrote: > Tom Lane writes: > > >> Maybe we should put in a startup-time check, analogous to the >> can't-run-as-root test, that the postmaster mustn't be PID 1. > Given that a number of minimal `init`s already exist specifically for > the case of runnin
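
The startup-time check Tom Lane floats in the quoted text could look roughly like the sketch below. It is written in Python for brevity rather than in the postmaster's C, and the names `StartupRefused` and `check_not_init` are hypothetical, not anything in PostgreSQL. It assumes, in line with the thread's premise, that a process acting as init inside its PID namespace sees its own PID as 1.

```python
import os


class StartupRefused(Exception):
    """Hypothetical error type for a refused startup."""


def check_not_init(pid=None):
    """Refuse to start when running as PID 1 (init of the PID namespace).

    `pid` can be passed explicitly for testing; by default the PID of the
    current process is used.
    """
    if pid is None:
        pid = os.getpid()
    if pid == 1:
        raise StartupRefused(
            "refusing to run as PID 1; start the server under a minimal init"
        )
    return pid
```

A server would call `check_not_init()` once, early in startup, next to its existing privilege checks, and exit with the error message if `StartupRefused` is raised.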

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Robert Haas
On Tue, May 4, 2021 at 2:26 PM Tom Lane wrote: > You are arguing from assumptions not in evidence, specifically that > if we reap a PID that isn't one we recognize, this must be what > happened. I think it's *at least* as likely that the case implies > some bug in the postmaster's child-process b

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Tom Lane
Robert Haas writes: > On Mon, May 3, 2021 at 3:37 PM Tom Lane wrote: >> I think that'd be a net reduction in reliability, not an improvement. >> In most scenarios it'd do little except mask bugs. And who's to say >> that ignoring unexpected child deaths is okay, anyway? We could hardly >> be su

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Joe Conway
On 5/4/21 1:43 PM, Tom Lane wrote: > Greg Stark writes: >> On Mon, 3 May 2021 at 15:44, Tom Lane wrote: >>> BTW, as far as that goes, I think the general recommendation is that >>> the datadir shouldn't be a mount point, because bad things happen if >>> you mount or unmount the drive while the postmaster is u

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Robert Haas
On Mon, May 3, 2021 at 3:37 PM Tom Lane wrote: > > I guess we can do that in older releases, but do we really need it? As > > I understand, the only thing we need to do is verify that the dying PID > > is a backend PID, and not cause a crash cycle if it isn't. > > I think that'd be a net reductio

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Tom Lane
Greg Stark writes: > On Mon, 3 May 2021 at 15:44, Tom Lane wrote: >> BTW, as far as that goes, I think the general recommendation is that >> the datadir shouldn't be a mount point, because bad things happen if >> you mount or unmount the drive while the postmaster is up. I could >> see enforcing

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-04 Thread Greg Stark
On Mon, 3 May 2021 at 15:44, Tom Lane wrote: > > Alvaro Herrera writes: > > I also heard a story where things ran into trouble (I didn't get the > > whole story of *what* was the problem with that) because the datadir is /. > > BTW, as far as that goes, I think the general recommendation is that

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Tom Lane
Andres Freund writes: > On 2021-05-03 16:20:43 -0400, Tom Lane wrote: >> Hmm, by that argument, any unexpected child PID in reaper() ought to be >> grounds for a restart, regardless of its exit code. Which'd be fine by >> me. I'm on board with being more restrictive about this, not less so. > A
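
The position quoted in this message, that any unexpected child PID seen by reaper() should be grounds for a restart regardless of its exit code, comes down to classifying reaped PIDs against the set of children the parent started itself. The Python sketch below illustrates only that classification step. It is not the postmaster's reaper(), and `reap_children` and `known_pids` are hypothetical names chosen for the example.

```python
import os


def reap_children(known_pids):
    """Reap every child that has already exited, without blocking.

    Returns (expected, unexpected): two lists of (pid, raw_wait_status)
    tuples, split by whether the PID is a member of `known_pids`.
    """
    expected, unexpected = [], []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        target = expected if pid in known_pids else unexpected
        target.append((pid, status))
    return expected, unexpected
```

Under the stricter policy argued for here, a caller would start a restart whenever `unexpected` is non-empty. Under the more lenient policy discussed elsewhere in the thread, it would log those entries and carry on. When the process is PID 1, orphaned processes it never started can also show up in `unexpected`, which is the situation the thread is about.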

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Dagfinn Ilmari Mannsåker
Tom Lane writes: > Alvaro Herrera writes: >> On 2021-May-03, Andres Freund wrote: >>> The issue turns out to be that postgres was in a container, with pid >>> namespaces enabled. Because postgres was run directly in the container, >>> without a parent process inside, it thus becomes pid 1. Which

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Andres Freund
Hi, On 2021-05-03 16:20:43 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2021-05-03 15:37:24 -0400, Tom Lane wrote: > >> And who's to say that ignoring unexpected child deaths is okay, > >> anyway? We could hardly be sure that the dead process hadn't been > >> connected to shared memory.

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Andrew Dunstan
On 5/3/21 3:07 PM, Andres Freund wrote: > Hi, > > A colleague debugged an issue where their postgres was occasionally > crash-restarting under load. > > The cause turned out to be that a relatively complex archive_command was > used, which could in some rare circumstances have a bash subshell > p

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Alvaro Herrera
On 2021-May-03, Andres Freund wrote: > Using / for a single statically linked binary that e.g. just serves a > bunch of hardcoded files is one thing. Putting actual data in / for > something like postgres another. Yeah, I just had a word with them and I had misunderstood what they were doing. Th

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Andres Freund
Hi, On 2021-05-03 15:25:53 -0400, Alvaro Herrera wrote: > I also heard a story where things ran into trouble (I didn't get the > whole story of *what* was the problem with that) because the datadir is /. > I know -- nobody in their right mind would put the datadir in / -- but > apparently in the c

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Tom Lane
Andres Freund writes: > On 2021-05-03 15:37:24 -0400, Tom Lane wrote: >> And who's to say that ignoring unexpected child deaths is okay, >> anyway? We could hardly be sure that the dead process hadn't been >> connected to shared memory. > I don't think checking the exit status of unexpected chil

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Andres Freund
Hi, On 2021-05-03 15:37:24 -0400, Tom Lane wrote: > Alvaro Herrera writes: > > On 2021-May-03, Andres Freund wrote: > >> The issue turns out to be that postgres was in a container, with pid > >> namespaces enabled. Because postgres was run directly in the container, > >> without a parent process

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Alvaro Herrera
On 2021-May-03, Tom Lane wrote: > Alvaro Herrera writes: > > I also heard a story where things ran into trouble (I didn't get the > > whole story of *what* was the problem with that) because the datadir is /. > > BTW, as far as that goes, I think the general recommendation is that > the datadir

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Tom Lane
Alvaro Herrera writes: > I also heard a story where things ran into trouble (I didn't get the > whole story of *what* was the problem with that) because the datadir is /. BTW, as far as that goes, I think the general recommendation is that the datadir shouldn't be a mount point, because bad thing
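
Both datadir concerns raised in this exchange (the datadir being / and the datadir being a mount point) can be detected before startup. The Python sketch below shows one way to do so with `os.path.ismount` and `os.path.realpath`. The function name `datadir_problems` and the wording of the messages are hypothetical, and nothing here reflects a check that PostgreSQL itself performs.

```python
import os


def datadir_problems(datadir):
    """Return a list of human-readable concerns about `datadir`.

    An empty list means neither concern applies.
    """
    problems = []
    resolved = os.path.realpath(datadir)
    if resolved == os.sep:
        problems.append("data directory is the filesystem root")
    if os.path.ismount(resolved):
        problems.append("data directory is itself a mount point")
    return problems
```

The filesystem root is normally a mount point as well, so a datadir of / is reported under both headings.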

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Tom Lane
Alvaro Herrera writes: > On 2021-May-03, Andres Freund wrote: >> The issue turns out to be that postgres was in a container, with pid >> namespaces enabled. Because postgres was run directly in the container, >> without a parent process inside, it thus becomes pid 1. Which mostly >> works without

Re: PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Alvaro Herrera
On 2021-May-03, Andres Freund wrote: > The issue turns out to be that postgres was in a container, with pid > namespaces enabled. Because postgres was run directly in the container, > without a parent process inside, it thus becomes pid 1. Which mostly > works without a problem. Until, as the case

PG in container w/ pid namespace is init, process exits cause restart

2021-05-03 Thread Andres Freund
Hi, A colleague debugged an issue where their postgres was occasionally crash-restarting under load. The cause turned out to be that a relatively complex archive_command was used, which could in some rare circumstances have a bash subshell pipeline not succeed. It wasn't at all obvious why that'