Andres Freund writes:
> On 2017-04-27 16:35:29 -0400, Tom Lane wrote:
>> It looks like it might be sufficient to do "#ifdef EPOLL_CLOEXEC"
>> in latch.c, rather than bothering with a full-blown configure check.
> Yea, that sounds worth trying. Wonder if we need to care about kernels
> not suppor
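(For reference, the "#ifdef EPOLL_CLOEXEC" approach being floated amounts to something like the following minimal sketch; it is illustrative only, not the actual latch.c code, and the helper name is invented:)

    #include <sys/epoll.h>
    #include <fcntl.h>

    static int
    create_epoll_fd(void)
    {
    #ifdef EPOLL_CLOEXEC
        /* Newer kernel/glibc: create the fd with close-on-exec atomically. */
        return epoll_create1(EPOLL_CLOEXEC);
    #else
        /* Older systems: epoll_create() plus a separate fcntl() call. */
        int     fd = epoll_create(1);

        if (fd >= 0)
            (void) fcntl(fd, F_SETFD, FD_CLOEXEC);
        return fd;
    #endif
    }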
On 2017-04-27 16:35:29 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
> >> I went ahead and changed the call to epoll_create into epoll_create1.
> >> I'm not too concerned about loss of portability there --- it seems
> >> unlikely that many people
Andres Freund writes:
> On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
>> I went ahead and changed the call to epoll_create into epoll_create1.
>> I'm not too concerned about loss of portability there --- it seems
>> unlikely that many people are still using ten-year-old glibc, and
>> even less lik
Robert Haas writes:
> On Wed, Apr 26, 2017 at 8:37 PM, Andres Freund wrote:
>>> 3. Go ahead with converting the postmaster to use WaitEventSet, a la
>>> the draft patch I posted earlier. I'd be happy to do this if we were
>>> at the start of a devel cycle, but right now seems a bit late --- not
On 2017-04-26 17:05:39 -0400, Tom Lane wrote:
> Here's an updated version of that, which makes use of our previous
> conclusion that F_SETFD/FD_CLOEXEC are available everywhere except
> Windows, and fixes some sloppy thinking about the EXEC_BACKEND case.
>
> I went ahead and changed the call to ep
On Wed, Apr 26, 2017 at 8:37 PM, Andres Freund wrote:
>> 3. Go ahead with converting the postmaster to use WaitEventSet, a la
>> the draft patch I posted earlier. I'd be happy to do this if we were
>> at the start of a devel cycle, but right now seems a bit late --- not
>> to mention that we real
On 2017-04-26 11:42:38 -0400, Tom Lane wrote:
> 1. Let HEAD stand as it is. We have a problem with slow response to
> bgworker start requests that arrive while ServerLoop is active, but that's
> a pretty tight window usually (although I believe I've seen it hit at
> least once in testing).
>
> 2.
Andres Freund writes:
> I'd still like to get something like your CLOEXEC patch applied
> independently however.
Here's an updated version of that, which makes use of our previous
conclusion that F_SETFD/FD_CLOEXEC are available everywhere except
Windows, and fixes some sloppy thinking about the
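(The non-Windows part of the CLOEXEC patch boils down to an fcntl(F_SETFD) call; a hedged sketch, with an invented helper name and the fallback FD_CLOEXEC define that comes up later in the thread:)

    #include <fcntl.h>

    #ifndef FD_CLOEXEC
    #define FD_CLOEXEC 1
    #endif

    /* Mark a descriptor close-on-exec; returns -1 on failure. */
    static int
    set_cloexec(int fd)
    {
        int     flags = fcntl(fd, F_GETFD);

        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
    }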
I wrote:
> Rémi Zara writes:
>> coypu was not stuck (no buildfarm-related process running), but failed to
>> clean up shared memory and semaphores.
>> I’ve done the clean-up.
> Huh, that's even more interesting.
I installed NetBSD 5.1.5 on an old Mac G4; I believe this is a rea
Rémi Zara writes:
>> On 25 Apr 2017, at 01:47, Tom Lane wrote:
>> It looks like coypu is going to need manual intervention (ie, kill -9
>> on the leftover postmaster) to get unwedged :-(. That's particularly
>> disturbing because it implies that ServerLoop isn't iterating at a
> On 25 Apr 2017, at 01:47, Tom Lane wrote:
>
> I wrote:
>> What I'm inclined to do is to revert the pselect change but not the other,
>> to see if that fixes these two animals. If it does, we could look into
>> blacklisting these particular platforms when choosing pselect.
>
> It looks like
I wrote:
> What I'm inclined to do is to revert the pselect change but not the other,
> to see if that fixes these two animals. If it does, we could look into
> blacklisting these particular platforms when choosing pselect.
It looks like coypu is going to need manual intervention (ie, kill -9
on
Andres Freund writes:
> On 2017-04-24 18:14:41 -0400, Tom Lane wrote:
>> A bit of googling establishes that NetBSD 5.1 has a broken pselect
>> implementation:
>>
>> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=43625
> Yikes. Do I understand correctly that they effectively just mapp
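(For context, the bug matters because the whole point of pselect() is the atomic signal-mask swap; below is a simplified sketch of the standard race-free pattern that a select()-based emulation breaks. The signal choice and function name are purely illustrative:)

    #include <signal.h>
    #include <sys/select.h>

    static int
    wait_readable_unless_sigchld(int sock)
    {
        sigset_t    blocked,
                    saved;
        fd_set      rfds;
        int         rc;

        /* Block SIGCHLD while deciding whether we need to wait at all. */
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGCHLD);
        sigprocmask(SIG_BLOCK, &blocked, &saved);

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);

        /*
         * pselect() must install 'saved' and start waiting as one atomic
         * step.  If it is emulated with sigprocmask()+select(), a SIGCHLD
         * arriving in the gap is missed and the wait can block forever.
         */
        rc = pselect(sock + 1, &rfds, NULL, NULL, NULL, &saved);

        sigprocmask(SIG_SETMASK, &saved, NULL);
        return rc;
    }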
On 2017-04-24 18:14:41 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
> >> coypu's problem is unrelated:
>
> > Note I was linking the 9.6 report from coypu, not HEAD. Afaics the 9.6
> > failure is the same as gharial's mode of failure.
>
> [ look
Andres Freund writes:
> On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
>> coypu's problem is unrelated:
> Note I was linking the 9.6 report from coypu, not HEAD. Afaics the 9.6
> failure is the same as gharial's mode of failure.
[ looks closer... ] Oh: the 9.6 run occurred first, and the failure
On 2017-04-24 17:33:39 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
> >> Unclear if related, but
> >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
> >> has a suspicious timing of failing in a w
Andres Freund writes:
> On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
>> Unclear if related, but
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
>> has a suspicious timing of failing in a weird way.
> Given that gharial is also failing on 9.6 (
On 2017-04-24 13:16:44 -0700, Andres Freund wrote:
> Unclear if related, but
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
> has a suspicious timing of failing in a weird way.
Given that gharial is also failing on 9.6 (same set of commits) and
coypu
On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> I wrote:
> > Attached is a lightly-tested draft patch that converts the postmaster to
> > use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> > about whether this is the direction to proceed, though.
>
> Attached are a couple of
On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> Attached are a couple of patches that represent a plausible Plan B.
> The first one changes the postmaster to run its signal handlers without
> specifying SA_RESTART. I've confirmed that that seems to fix the
> select_parallel-test-takes-a-long-time
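(Mechanically, the first patch comes down to installing the postmaster's handlers without SA_RESTART, so that select() in ServerLoop returns with EINTR instead of silently resuming; a minimal sketch with an invented handler name:)

    #include <signal.h>
    #include <string.h>

    extern void handle_pmsignal(int signo);     /* placeholder handler */

    static void
    install_handler_no_restart(int signo)
    {
        struct sigaction act;

        memset(&act, 0, sizeof(act));
        act.sa_handler = handle_pmsignal;
        sigfillset(&act.sa_mask);
        act.sa_flags = 0;       /* deliberately no SA_RESTART */
        sigaction(signo, &act, NULL);
    }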
I wrote:
> Attached is a lightly-tested draft patch that converts the postmaster to
> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> about whether this is the direction to proceed, though.
Attached are a couple of patches that represent a plausible Plan B.
The first one c
Alvaro Herrera writes:
> Tom Lane wrote:
>> It also appears to me that do_start_bgworker's treatment of fork
>> failure is completely brain dead. Did anyone really think about
>> that case?
> Hmm, I probably modelled it on autovacuum without giving that case much
> additional consideration.
Att
Tom Lane writes:
> Andres Freund writes:
>> On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
>>> but I see that SUSv2
>>> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
>>> coding rules it ought to be okay to assume they're there. I'm tempted to
>>> rip out the quoted bit,
Andres Freund writes:
> On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
>> but I see that SUSv2
>> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
>> coding rules it ought to be okay to assume they're there. I'm tempted to
>> rip out the quoted bit, as well as the #ifdef F_
Tom,
On 2017-04-21 13:49:27 -0400, Tom Lane wrote:
> >> * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
> >> - * any new connections, so we don't call select(), and just
> >> sleep.
> >> + * any new connections, so we don't call WaitEventSetWait(),
> >> an
On 2017-04-21 14:08:21 -0400, Tom Lane wrote:
> but I see that SUSv2
> mandates that fcntl.h provide both F_SETFD and FD_CLOEXEC, so by our own
> coding rules it ought to be okay to assume they're there. I'm tempted to
> rip out the quoted bit, as well as the #ifdef F_SETFD, from libpq and see
> i
I wrote:
> Andres Freund writes:
>> On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
>>> +#ifndef FD_CLOEXEC
>>> +#define FD_CLOEXEC 1
>>> +#endif
>> Hm? Are we sure this is portable? Is there really cases that have
>> F_SETFD, but not CLOEXEC?
> Copied-and-pasted from our only existing use of FD_
Andres Freund writes:
> On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
>> Attached is a lightly-tested draft patch that converts the postmaster to
>> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
>> about whether this is the direction to proceed, though. It adds at least
>
Alvaro Herrera writes:
> Tom Lane wrote:
>> I'm coming back to the idea that at least in the back branches, the
>> thing to do is allow maybe_start_bgworker to start multiple workers.
>>
>> Is there any actual evidence for the claim that that might have
>> bad side effects?
> Well, I ran tests w
Hi,
On 2017-04-21 12:50:04 -0400, Tom Lane wrote:
> Attached is a lightly-tested draft patch that converts the postmaster to
> use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> about whether this is the direction to proceed, though. It adds at least
> a couple of kernel cal
Attached is a lightly-tested draft patch that converts the postmaster to
use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
about whether this is the direction to proceed, though. It adds at least
a couple of kernel calls per postmaster signal delivery, and probably to
every po
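(Very roughly, the conversion swaps ServerLoop's select() for latch.c's WaitEventSet machinery; the sketch below shows only the general shape, with the listen-socket array, timeout, and error handling as placeholders and the signatures abbreviated from memory, not the actual patch:)

    /* Built once, before entering the server loop. */
    WaitEventSet *set = CreateWaitEventSet(TopMemoryContext, nsockets + 1);

    AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
    for (int i = 0; i < nsockets; i++)
        AddWaitEventToSet(set, WL_SOCKET_READABLE, listen_socket[i], NULL, NULL);

    for (;;)
    {
        WaitEvent   event;
        int         n;

        n = WaitEventSetWait(set, timeout_ms, &event, 1, 0);
        ResetLatch(MyLatch);

        if (n > 0 && (event.events & WL_SOCKET_READABLE))
        {
            /* accept and fork a backend for event.fd */
        }
        /* then service any pending signal work: bgworker starts, etc. */
    }

(The extra kernel calls per signal delivery mentioned above would come from the latch set/reset traffic underlying this loop.)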
Tom Lane wrote:
> After sleeping and thinking more, I've realized that the
> slow-bgworker-start issue actually exists on *every* platform, it's just
> harder to hit when select() is interruptable. But consider the case
> where multiple bgworker-start requests arrive while ServerLoop is
> activel
Andres Freund writes:
> On 2017-04-20 00:50:13 -0400, Tom Lane wrote:
>> My first reaction was that that sounded like a lot more work than removing
>> two lines from maybe_start_bgworker and adjusting some comments. But on
>> closer inspection, the slow-bgworker-start issue isn't the only problem
On 2017-04-20 20:10:41 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
> >> Also, if it's not there we'd fall back to using plain poll(), which is
> >> not so awful that we need to work hard to avoid it. I'd just as soon
> >> keep the number of com
Andres Freund writes:
> On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
>> Also, if it's not there we'd fall back to using plain poll(), which is
>> not so awful that we need to work hard to avoid it. I'd just as soon
>> keep the number of combinations down.
> Just using fcntl(SET, CLOEXEC) wound'
On 2017-04-20 20:05:02 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
> >> So ... what would you say to replacing epoll_create() with
> >> epoll_create1(EPOLL_CLOEXEC) ? Then a WaitEventSet would not
> >> represent inheritable-across-exec resource
Andres Freund writes:
> On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
>> So ... what would you say to replacing epoll_create() with
>> epoll_create1(EPOLL_CLOEXEC) ? Then a WaitEventSet would not
>> represent inheritable-across-exec resources on any platform,
>> making it a lot easier to deal wit
On 2017-04-20 19:53:02 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund writes:
> >> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
> >>> or are the HANDLEs in a Windows WaitEventSet not inheritable
> >>> resources?
>
> >> So that kind of sounds like it should be doable.
>
> > Ah, good. I'll
I wrote:
> Andres Freund writes:
>> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
>>> or are the HANDLEs in a Windows WaitEventSet not inheritable
>>> resources?
>> So that kind of sounds like it should be doable.
> Ah, good. I'll add a comment about that and press on.
So ... what would you sa
On 2017-04-20 00:50:13 -0400, Tom Lane wrote:
> I wrote:
> > Alvaro Herrera writes:
> >> Tom Lane wrote:
> >>> Hm. Do you have a more-portable alternative?
>
> >> I was thinking in a WaitEventSet from latch.c.
>
> My first reaction was that that sounded like a lot more work than removing
> two
Andres Freund writes:
> On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
>> or are the HANDLEs in a Windows WaitEventSet not inheritable
>> resources?
> I think we have control over that. According to
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms724466(v=vs.85).aspx
> CreateProcess()
On 2017-04-20 19:23:28 -0400, Tom Lane wrote:
> or are the HANDLEs in a Windows WaitEventSet not inheritable
> resources?
I think we have control over that. According to
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724466(v=vs.85).aspx
CreateProcess() has to be called with bInheritHa
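(The knob in question is CreateProcess()'s fifth argument; a minimal sketch that only illustrates the parameter, not what postmaster.c actually does, and the function and variable names are invented:)

    #include <windows.h>

    static BOOL
    spawn_child_no_inherit(char *cmdline)
    {
        STARTUPINFOA        si;
        PROCESS_INFORMATION pi;

        ZeroMemory(&si, sizeof(si));
        si.cb = sizeof(si);
        ZeroMemory(&pi, sizeof(pi));

        /*
         * bInheritHandles = FALSE: the child inherits no handles, so any
         * HANDLEs backing a WaitEventSet stay private to the parent.
         */
        return CreateProcessA(NULL, cmdline, NULL, NULL,
                              FALSE,    /* bInheritHandles */
                              0, NULL, NULL, &si, &pi);
    }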
I wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Hm. Do you have a more-portable alternative?
>> I was thinking in a WaitEventSet from latch.c.
> My first reaction was that that sounded like a lot more work than removing
> two lines from maybe_start_bgworker and adjusting some comments.
I wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Hm. Do you have a more-portable alternative?
>> I was thinking in a WaitEventSet from latch.c.
My first reaction was that that sounded like a lot more work than removing
two lines from maybe_start_bgworker and adjusting some comments. Bu
Andres Freund writes:
> FWIW, I'd wished before that we used something a bit more modern than
> select() if available... It's nice to be able to listen to a larger
> number of sockets without repeated O(sockets) overhead.
[ raised eyebrow... ] Is anyone really running postmasters with enough
lis
On 2017-04-19 18:56:26 -0400, Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> Hm. Do you have a more-portable alternative?
>
> > I was thinking in a WaitEventSet from latch.c.
>
> Yeah, some googling turns up the suggestion that a self-pipe is a portable
> way to get consisten
Alvaro Herrera writes:
> Tom Lane wrote:
>> Hm. Do you have a more-portable alternative?
> I was thinking in a WaitEventSet from latch.c.
Yeah, some googling turns up the suggestion that a self-pipe is a portable
way to get consistent semantics from select(); latch.c has already done
that. I s
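(The self-pipe trick latch.c relies on, in miniature; simplified, with no error handling, and the signal choice is illustrative:)

    #include <unistd.h>
    #include <fcntl.h>
    #include <signal.h>

    static int  selfpipe[2];        /* [0] read end, [1] write end */

    static void
    wakeup_handler(int signo)
    {
        /* async-signal-safe: just poke the pipe */
        (void) write(selfpipe[1], "x", 1);
    }

    static void
    init_selfpipe(void)
    {
        (void) pipe(selfpipe);
        (void) fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
        (void) fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);
        (void) signal(SIGUSR1, wakeup_handler);
    }

    /*
     * The wait loop then includes selfpipe[0] in its select()/poll() read
     * set: a signal reliably makes the wait return, and the pending bytes
     * are drained with read() before sleeping again.
     */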
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> So I'm wondering what the design rationale was for only starting one
> >> bgworker per invocation.
>
> > The rationale was that there may be other tasks waiting for postmaster
> > attention, and if there are many bgworkers needing
Alvaro Herrera writes:
> Tom Lane wrote:
>> So I'm wondering what the design rationale was for only starting one
>> bgworker per invocation.
> The rationale was that there may be other tasks waiting for postmaster
> attention, and if there are many bgworkers needing to be started, the
> other wor
Tom Lane wrote:
> While I haven't yet tested it, it seems like a fix might be as simple
> as deleting these lines in maybe_start_bgworker:
>
> /*
> * Have ServerLoop call us again. Note that there might not
> * actually *be* another runnable worker, but we d