Re: buildfarm instance bichir stuck

2023-07-29 Thread Thomas Munro
On Fri, Apr 9, 2021 at 6:11 PM Thomas Munro wrote: > On Wed, Apr 7, 2021 at 7:31 PM Robins Tharakan wrote: > > Correct. This is easily reproducible on this test-instance, so let me know > > if you want me to test a patch. > > From your description it sounds like signals are not arriving at all,

Re: buildfarm instance bichir stuck

2021-04-08 Thread Robins Tharakan
On Fri, 9 Apr 2021 at 16:12, Thomas Munro wrote: > From your description it sounds like signals are not arriving at all, > rather than some more complicated race. Let's go back to basics... > what does the attached program print for you? I see: > > tmunro@x1:~/junk$ cc test-signalfd.c > tmunro@x

Re: buildfarm instance bichir stuck

2021-04-08 Thread Thomas Munro
On Wed, Apr 7, 2021 at 7:31 PM Robins Tharakan wrote: > Correct. This is easily reproducible on this test-instance, so let me know if > you want me to test a patch. From your description it sounds like signals are not arriving at all, rather than some more complicated race. Let's go back to ba
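
For readers without the attachment, here is a minimal sketch of the kind of "back to basics" check described above: block a signal, open a signalfd(2) for it, send the signal to the current process, and see whether the descriptor ever becomes readable. This is not the test-signalfd.c posted to the list; the function name signalfd_roundtrip and its return convention are made up for illustration, and it assumes Linux with glibc and a single-threaded caller.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): send `signo` to ourselves and
 * wait up to `timeout_ms` for it to show up on a signalfd.
 *
 * Returns the signal number read from the descriptor, 0 if nothing arrived
 * before the timeout, or -1 on error.
 */
int
signalfd_roundtrip(int signo, int timeout_ms)
{
    sigset_t        mask;
    sigset_t        oldmask;
    struct signalfd_siginfo info;
    struct pollfd   pfd;
    struct timespec zero = {0, 0};
    int             fd;
    int             rc;
    int             result = -1;

    sigemptyset(&mask);
    sigaddset(&mask, signo);

    /* The signal must be blocked, or it is delivered the normal way. */
    if (sigprocmask(SIG_BLOCK, &mask, &oldmask) < 0)
        return -1;

    fd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
    if (fd < 0)
        goto out_restore;

    if (kill(getpid(), signo) < 0)
        goto out_close;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0)
        result = 0;             /* the signal never became readable */
    else if (rc > 0 && read(fd, &info, sizeof(info)) == (ssize_t) sizeof(info))
        result = (int) info.ssi_signo;

out_close:
    close(fd);
out_restore:
    /* Discard a still-pending signal so that unblocking cannot kill us. */
    (void) sigtimedwait(&mask, NULL, &zero);
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    return result;
}
```

Calling signalfd_roundtrip(SIGUSR1, 1000) from a small main() and printing the result gives a quick yes/no answer: a return of 0 would match the "signals are not arriving at all" theory, while a return equal to the signal number would point back towards something more subtle.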

Re: buildfarm instance bichir stuck

2021-04-07 Thread Tom Lane
Andrew Dunstan writes: > On 4/7/21 4:02 PM, Tom Lane wrote: >> On further thought, that doesn't seem like the place to fix it. >> I'd rather be able to ask the buildfarm server to send me nagmail >> if my animal hasn't sent a report in N days (where N had better >> be owner-configurable). > That

Re: buildfarm instance bichir stuck

2021-04-07 Thread Andrew Dunstan
On 4/7/21 4:02 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 4/7/21 1:07 PM, Tom Lane wrote: >>> I do use it on some of my flakier dinosaurs, and I've noticed that >>> when it does kick in, the buildfarm run just stops dead and no report >>> is sent to the BF server. That has advantages in

Re: buildfarm instance bichir stuck

2021-04-07 Thread Tom Lane
Andrew Dunstan writes: > On 4/7/21 1:07 PM, Tom Lane wrote: >> I do use it on some of my flakier dinosaurs, and I've noticed that >> when it does kick in, the buildfarm run just stops dead and no report >> is sent to the BF server. That has advantages in not cluttering the >> BF status with run-f

Re: buildfarm instance bichir stuck

2021-04-07 Thread Andrew Dunstan
On 4/7/21 1:07 PM, Tom Lane wrote: > Robins Tharakan writes: >> Not sure if many agree but 2 things stood out here: >> 1) Buildfarm never got the message that a commit broke an instance. Ideally >> I'd have expected buildfarm to have an optimistic timeout that could have >> helped - for e.g. rig

Re: buildfarm instance bichir stuck

2021-04-07 Thread Tom Lane
Robins Tharakan writes: > Not sure if many agree but 2 things stood out here: > 1) Buildfarm never got the message that a commit broke an instance. Ideally > I'd have expected buildfarm to have an optimistic timeout that could have > helped - for e.g. right now, the CREATE DATABASE is still stuck

Re: buildfarm instance bichir stuck

2021-04-07 Thread Robins Tharakan
Thanks Andrew. The build's still running but the CPPFLAGS hint does seem to have helped (see below). Unless advised otherwise, I intend to let that option be, so as to get bichir back online. If a future commit 'fixes' things, I could roll back this flag to test things out (or try out other option

Re: buildfarm instance bichir stuck

2021-04-07 Thread Andrew Dunstan
On 4/7/21 2:16 AM, Thomas Munro wrote: > On Wed, Apr 7, 2021 at 5:44 PM Robins Tharakan wrote: >> Bichir's been stuck for the past month and is unable to run regression tests >> since 6a2a70a02018d6362f9841cc2f499cc45405e86b. > Hrmph. That's "Use signalfd(2) for epoll latches." I had a simila

Re: buildfarm instance bichir stuck

2021-04-07 Thread Robins Tharakan
Hi Thomas, Thanks for taking a look at this promptly. On Wed, 7 Apr 2021 at 16:17, Thomas Munro wrote: > On Wed, Apr 7, 2021 at 5:44 PM Robins Tharakan wrote: > > It is interesting that that commit's a month old and probably no other client has complained since, but diving in, I can see that i

Re: buildfarm instance bichir stuck

2021-04-06 Thread Thomas Munro
On Wed, Apr 7, 2021 at 5:44 PM Robins Tharakan wrote: > Bichir's been stuck for the past month and is unable to run regression tests > since 6a2a70a02018d6362f9841cc2f499cc45405e86b. Hrmph. That's "Use signalfd(2) for epoll latches." I had a similar report from an illumos user (but it was inte

buildfarm instance bichir stuck

2021-04-06 Thread Robins Tharakan
Hi, Bichir's been stuck for the past month and is unable to run regression tests since 6a2a70a02018d6362f9841cc2f499cc45405e86b. It is interesting that that commit's a month old and probably no other client has complained since, but diving in, I can see that it's been unable to even start regress