Andres Freund writes:
> On 2017-04-27 16:35:29 -0400, Tom Lane wrote:
>> It looks like it might be sufficient to do "#ifdef EPOLL_CLOEXEC"
>> in latch.c, rather than bothering with a full-blown configure check.
> Yea, that sounds worth trying. Wonder if we need to care about kernels
> not suppor
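A minimal sketch of the #ifdef fallback being discussed, with a hypothetical helper create_epoll_fd(); illustrative only, not the actual latch.c code:

#include <sys/epoll.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create an epoll instance with close-on-exec set.  Where epoll_create1()
 * and EPOLL_CLOEXEC are available this is atomic; otherwise fall back to
 * epoll_create() plus fcntl(F_SETFD, FD_CLOEXEC).
 */
static int
create_epoll_fd(void)
{
    int     fd;

#ifdef EPOLL_CLOEXEC
    fd = epoll_create1(EPOLL_CLOEXEC);
#else
    fd = epoll_create(1);       /* size hint, ignored by modern kernels */
    if (fd >= 0 && fcntl(fd, F_SETFD, FD_CLOEXEC) == -1)
    {
        close(fd);
        fd = -1;
    }
#endif
    return fd;
}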
On 2017-04-27 16:35:29 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
> >> I went ahead and changed the call to epoll_create into epoll_create1.
> >> I'm not too concerned about loss of portability there --- it seems
> >> unlikely that many people
Andres Freund writes:
> On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
>> I went ahead and changed the call to epoll_create into epoll_create1.
>> I'm not too concerned about loss of portability there --- it seems
>> unlikely that many people are still using ten-year-old glibc, and
>> even less lik
Robert Haas writes:
> On Wed, Apr 26, 2017 at 8:37 PM, Andres Freund wrote:
>>> 3. Go ahead with converting the postmaster to use WaitEventSet, a la
>>> the draft patch I posted earlier. I'd be happy to do this if we were
>>> at the start of a devel cycle, but right now seems a bit late --- not
On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
> Here's an updated version of that, which makes use of our previous
> conclusion that F_SETFD/FD_CLOEXEC are available everywhere except
> Windows, and fixes some sloppy thinking about the EXEC_BACKEND case.
>
> I went ahead and changed the call to ep
On Wed, Apr 26, 2017 at 8:37 PM, Andres Freund wrote:
>> 3. Go ahead with converting the postmaster to use WaitEventSet, a la
>> the draft patch I posted earlier. I'd be happy to do this if we were
>> at the start of a devel cycle, but right now seems a bit late --- not
>> to mention that we real
On 2017-04-26 11:42:38 -0400, Tom Lane wrote:
> 1. Let HEAD stand as it is. We have a problem with slow response to
> bgworker start requests that arrive while ServerLoop is active, but that's
> a pretty tight window usually (although I believe I've seen it hit at
> least once in testing).
>
> 2.
Andres Freund writes:
> I'd still like to get something like your CLOEXEC patch applied
> independently however.
Here's an updated version of that, which makes use of our previous
conclusion that F_SETFD/FD_CLOEXEC are available everywhere except
Windows, and fixes some sloppy thinking about the
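For reference, the non-Windows idiom the patch relies on is just fcntl(); here is a hedged sketch with a hypothetical helper set_cloexec(), not the patch itself:

#include <fcntl.h>

/*
 * Mark an already-open descriptor close-on-exec.  SUSv2 requires fcntl.h
 * to provide F_SETFD and FD_CLOEXEC, so no configure test is needed on
 * non-Windows platforms.
 */
static int
set_cloexec(int fd)
{
    int     flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}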
I wrote:
> Rémi Zara writes:
>> coypu was not stuck (no buildfarm related process running), but failed to
>> clean-up shared memory and semaphores.
>> I’ve done the clean-up.
> Huh, that's even more interesting.
I installed NetBSD 5.1.5 on an old Mac G4; I believe this is a rea
Rémi Zara writes:
>> On 25 Apr 2017, at 01:47, Tom Lane wrote:
>> It looks like coypu is going to need manual intervention (ie, kill -9
>> on the leftover postmaster) to get unwedged :-(. That's particularly
>> disturbing because it implies that ServerLoop isn't iterating at a
> On 25 Apr 2017, at 01:47, Tom Lane wrote:
>
> I wrote:
>> What I'm inclined to do is to revert the pselect change but not the other,
>> to see if that fixes these two animals. If it does, we could look into
>> blacklisting these particular platforms when choosing pselect.
>
> It looks like
I wrote:
> What I'm inclined to do is to revert the pselect change but not the other,
> to see if that fixes these two animals. If it does, we could look into
> blacklisting these particular platforms when choosing pselect.
It looks like coypu is going to need manual intervention (ie, kill -9
on
Andres Freund writes:
> On 2017-04-24 18:14:41 -0400, Tom Lane wrote:
>> A bit of googling establishes that NetBSD 5.1 has a broken pselect
>> implementation:
>>
>> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=43625
> Yikes. Do I understand correctly that they effectively just mapp
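For readers unfamiliar with the API: pselect() exists to apply a signal mask atomically for the duration of the wait, closing the race where a signal arrives between unblocking it and blocking in select(). A sketch of the intended usage (illustrative, not PostgreSQL code):

#include <signal.h>
#include <sys/select.h>

static void
wait_for_activity(int listen_fd, const sigset_t *unblock_mask)
{
    fd_set      rfds;

    FD_ZERO(&rfds);
    FD_SET(listen_fd, &rfds);

    /*
     * unblock_mask is supposed to apply only while blocked here.  If
     * pselect() is really sigprocmask() + select() under the hood, a
     * signal arriving between those two calls is missed until something
     * else wakes the loop.
     */
    (void) pselect(listen_fd + 1, &rfds, NULL, NULL, NULL, unblock_mask);
}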
On 2017-04-24 18:14:41 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
> >> coypu's problem is unrelated:
>
> > Note I was linking the 9.6 report from coypu, not HEAD. Afaics the 9.6
> > failure is the same as gharial's mode of failure.
>
> [ look
Andres Freund writes:
> On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
>> coypu's problem is unrelated:
> Note I was linking the 9.6 report from coypu, not HEAD. Afaics the 9.6
> failure is the same as gharial's mode of failure.
[ looks closer... ] Oh: the 9.6 run occurred first, and the failure
On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
> >> Unclear if related, but
> >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
> >> has a suspicious timing of failing in a w
Andres Freund writes:
> On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
>> Unclear if related, but
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
>> has a suspicious timing of failing in a weird way.
> Given that gharial is also failing on 9.6 (
On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
> Unclear if related, but
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
> has a suspicious timing of failing in a weird way.
Given that gharial is also failing on 9.6 (same set of commits) and
coypu
On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> I wrote:
> > Attached is a lightly-tested draft patch that converts the postmaster to
> > use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> > about whether this is the direction to proceed, though.
>
> Attached are a couple of
On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> Attached are a couple of patches that represent a plausible Plan B.
> The first one changes the postmaster to run its signal handlers without
> specifying SA_RESTART. I've confirmed that that seems to fix the
> select_parallel-test-takes-a-long-time
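A minimal sketch of what dropping SA_RESTART amounts to (hypothetical helper, not the actual patch): a handler installed without that flag makes a signal interrupt select() with EINTR, so ServerLoop regains control promptly instead of the kernel silently restarting the wait.

#include <signal.h>

static void
install_nonrestarting_handler(int signo, void (*handler) (int))
{
    struct sigaction act;

    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;           /* deliberately no SA_RESTART */
    (void) sigaction(signo, &act, NULL);
}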
I wrote:
> Attached is a lightly-tested draft patch that converts the postmaster to
> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> about whether this is the direction to proceed, though.
Attached are a couple of patches that represent a plausible Plan B.
The first one c
Alvaro Herrera writes:
> Tom Lane wrote:
>> It also appears to me that do_start_bgworker's treatment of fork
>> failure is completely brain dead. Did anyone really think about
>> that case?
> Hmm, I probably modelled it on autovacuum without giving that case much
> additional consideration.
Att
Tom Lane writes:
> Andres Freund writes:
>> On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
>>> but I see that SUSv2
>>> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
>>> coding rules it ought to be okay to assume they're there. I'm tempted to
>>> rip out the quoted bit,
Andres Freund writes:
> On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
>> but I see that SUSv2
>> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
>> coding rules it ought to be okay to assume they're there. I'm tempted to
>> rip out the quoted bit, as well as the #ifdef F_
Tom,
On 2017-04-21 13:49:27 -0400, Tom Lane wrote:
> >> * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
> >> - * any new connections, so we don't call select(), and just sleep.
> >> + * any new connections, so we don't call WaitEventSetWait(), an
On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
> but I see that SUSv2
> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
> coding rules it ought to be okay to assume they're there. I'm tempted to
> rip out the quoted bit, as well as the #ifdef F_SETFD, from libpq and see
> i
I wrote:
> Andres Freund writes:
>> On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
>>> +#ifndef FD_CLOEXEC
>>> +#define FD_CLOEXEC 1
>>> +#endif
>> Hm? Are we sure this is portable? Are there really cases that have
>> F_SETFD, but not CLOEXEC?
> Copied-and-pasted from our only existing use of FD_
Andres Freund writes:
> On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
>> Attached is a lightly-tested draft patch that converts the postmaster to
>> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
>> about whether this is the direction to proceed, though. It adds at least
>
Alvaro Herrera writes:
> Tom Lane wrote:
>> I'm coming back to the idea that at least in the back branches, the
>> thing to do is allow maybe_start_bgworker to start multiple workers.
>>
>> Is there any actual evidence for the claim that that might have
>> bad side effects?
> Well, I ran tests w
Hi,
On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
> Attached is a lightly-tested draft patch that converts the postmaster to
> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> about whether this is the direction to proceed, though. It adds at least
> a couple of kernel cal
Attached is a lightly-tested draft patch that converts the postmaster to
use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
about whether this is the direction to proceed, though. It adds at least
a couple of kernel calls per postmaster signal delivery, and probably to
every po
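Roughly, the shape of such a loop built on the latch.c API would be as follows; the entry points named here are real, but the argument lists are from memory and this is only a sketch, not the draft patch:

#include "postgres.h"
#include "miscadmin.h"
#include "storage/latch.h"

static void
server_loop_sketch(pgsocket listen_sock)
{
    WaitEventSet *wes;
    WaitEvent   event;

    wes = CreateWaitEventSet(CurrentMemoryContext, 2);
    AddWaitEventToSet(wes, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
    AddWaitEventToSet(wes, WL_SOCKET_READABLE, listen_sock, NULL, NULL);

    for (;;)
    {
        int     nevents = WaitEventSetWait(wes, -1L, &event, 1, 0);

        if (nevents > 0 && (event.events & WL_LATCH_SET))
            ResetLatch(MyLatch);    /* then service whatever the signal asked for */

        if (nevents > 0 && (event.events & WL_SOCKET_READABLE))
        {
            /* accept() a new connection on event.fd */
        }
    }
}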
Tom Lane wrote:
> After sleeping and thinking more, I've realized that the
> slow-bgworker-start issue actually exists on *every* platform, it's just
> harder to hit when select() is interruptable. But consider the case
> where multiple bgworker-start requests arrive while ServerLoop is
> activel
Andres Freund writes:
> On 2017-04-20 00:50:13 -0400, Tom Lane wrote:
>> My first reaction was that that sounded like a lot more work than removing
>> two lines from maybe_start_bgworker and adjusting some comments. But on
>> closer inspection, the slow-bgworker-start issue isn't the only problem
On 2017-04-20 20:10:41 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
> >> Also, if it's not there we'd fall back to using plain poll(), which is
> >> not so awful that we need to work hard to avoid it. I'd just as soon
> >> keep the number of com
Andres Freund writes:
> On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
>> Also, if it's not there we'd fall back to using plain poll(), which is
>> not so awful that we need to work hard to avoid it. I'd just as soon
>> keep the number of combinations down.
> Just using fcntl(SET, CLOEXEC) wound'
On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
> >> So ... what would you say to replacing epoll_create() with
> >> epoll_create1(EPOLL_CLOEXEC) ? Then a WaitEventSet would not
> >> represent inheritable-across-exec resource
Andres Freund writes:
> On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
>> So ... what would you say to replacing epoll_create() with
>> epoll_create1(EPOLL_CLOEXEC) ? Then a WaitEventSet would not
>> represent inheritable-across-exec resources on any platform,
>> making it a lot easier to deal wit
On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund writes:
> >> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
> >>> or are the HANDLEs in a Windows WaitEventSet not inheritable
> >>> resources?
>
> >> So that kind of sounds like it should be doable.
>
> > Ah, good. I'll
I wrote:
> Andres Freund writes:
>> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
>>> or are the HANDLEs in a Windows WaitEventSet not inheritable
>>> resources?
>> So that kind of sounds like it should be doable.
> Ah, good. I'll add a comment about that and press on.
So ... what would you sa
On 2017-04-20 00:50:13 -0400, Tom Lane wrote:
> I wrote:
> > Alvaro Herrera writes:
> >> Tom Lane wrote:
> >>> Hm. Do you have a more-portable alternative?
>
> >> I was thinking in a WaitEventSet from latch.c.
>
> My first reaction was that that sounded like a lot more work than removing
> two
Andres Freund writes:
> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
>> or are the HANDLEs in a Windows WaitEventSet not inheritable
>> resources?
> I think we have control over that. According to
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms724466(v=vs.85).aspx
> CreateProcess()
On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
> or are the HANDLEs in a Windows WaitEventSet not inheritable
> resources?
I think we have control over that. According to
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724466(v=vs.85).aspx
CreateProcess() has to be called with bInheritHa
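The MSDN page boils down to a single argument; a hedged sketch with a hypothetical wrapper, not PostgreSQL's actual Windows code:

#include <windows.h>

static BOOL
spawn_child_no_inherit(char *cmdline)
{
    STARTUPINFO si;
    PROCESS_INFORMATION pi;

    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    ZeroMemory(&pi, sizeof(pi));

    /*
     * Passing FALSE for bInheritHandles keeps the parent's open HANDLEs,
     * including any belonging to a WaitEventSet, out of the child.
     */
    return CreateProcess(NULL, cmdline, NULL, NULL,
                         FALSE,     /* bInheritHandles */
                         0, NULL, NULL, &si, &pi);
}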
I wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Hm. Do you have a more-portable alternative?
>> I was thinking in a WaitEventSet from latch.c.
> My first reaction was that that sounded like a lot more work than removing
> two lines from maybe_start_bgworker and adjusting some comments.
I wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Hm. Do you have a more-portable alternative?
>> I was thinking in a WaitEventSet from latch.c.
My first reaction was that that sounded like a lot more work than removing
two lines from maybe_start_bgworker and adjusting some comments. Bu
Andres Freund writes:
> FWIW, I'd wished before that we used something a bit more modern than
> select() if available... It's nice to be able to listen to a larger
> number of sockets without repeated O(sockets) overhead.
[ raised eyebrow... ] Is anyone really running postmasters with enough
lis
On 2017-04-19 18:56:26 -0400, Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> Hm. Do you have a more-portable alternative?
>
> > I was thinking in a WaitEventSet from latch.c.
>
> Yeah, some googling turns up the suggestion that a self-pipe is a portable
> way to get consisten
Alvaro Herrera writes:
> Tom Lane wrote:
>> Hm. Do you have a more-portable alternative?
> I was thinking in a WaitEventSet from latch.c.
Yeah, some googling turns up the suggestion that a self-pipe is a portable
way to get consistent semantics from select(); latch.c has already done
that. I s
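For context, the self-pipe trick in latch.c amounts to roughly this (simplified sketch; the real code also drains the pipe on wakeup and reports errors):

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

static int  selfpipe_fds[2];

static void
init_self_pipe(void)
{
    if (pipe(selfpipe_fds) < 0)
        return;                 /* real code would report the error */
    fcntl(selfpipe_fds[0], F_SETFL, O_NONBLOCK);
    fcntl(selfpipe_fds[1], F_SETFL, O_NONBLOCK);
}

/*
 * Called from signal handlers: write() is async-signal-safe, and the byte
 * makes the read end (which is part of every select()/poll() wait) become
 * readable, so a signal arriving at any time produces a wakeup.
 */
static void
wake_waiters(void)
{
    int     save_errno = errno;

    (void) write(selfpipe_fds[1], "x", 1);
    errno = save_errno;
}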
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> So I'm wondering what the design rationale was for only starting one
> >> bgworker per invocation.
>
> > The rationale was that there may be other tasks waiting for postmaster
> > attention, and if there are many bgworkers needing
Alvaro Herrera writes:
> Tom Lane wrote:
>> So I'm wondering what the design rationale was for only starting one
>> bgworker per invocation.
> The rationale was that there may be other tasks waiting for postmaster
> attention, and if there are many bgworkers needing to be started, the
> other wor
Tom Lane wrote:
> While I haven't yet tested it, it seems like a fix might be as simple
> as deleting these lines in maybe_start_bgworker:
>
> /*
> * Have ServerLoop call us again. Note that there might not
> * actually *be* another runnable worker, but we d
I chanced to notice that on gaur/pademelon, the "select_parallel"
regression test sometimes takes a great deal longer than normal,
for no obvious reason. It does eventually terminate, but sometimes
only after 10 to 20 minutes rather than the couple dozen seconds
that are typical on that slow machi