Hi,

On 2021-03-02 21:20:11 -0800, Andres Freund wrote:
> On 2021-03-02 12:57:57 -0800, Andres Freund wrote:
> > t/003_recovery_targets.pl ............ 7/9
> > #   Failed test 'multiple conflicting settings'
> > #   at t/003_recovery_targets.pl line 151.
> > 
> > #   Failed test 'recovery end before target reached is a fatal error'
> > #   at t/003_recovery_targets.pl line 177.
> > t/003_recovery_targets.pl ............ 9/9 # Looks like you failed 2 tests of 9.
> > t/003_recovery_targets.pl ............ Dubious, test returned 2 (wstat 512, 0x200)
> > Failed 2/9 subtests
> 
> This appears to be caused by stderr in Windows Docker containers
> somehow not working quite right. cirrus-ci uses Docker on Windows.
> 
> If you look e.g. at https://cirrus-ci.com/task/6111560255930368, and
> specifically at the relevant log file:
> https://api.cirrus-ci.com/v1/artifact/task/6111560255930368/log/src/test/recovery/tmp_check/log/003_recovery_targets_primary.log
> you can see that it's, uh, less full than we normally expect:
>         1 file(s) copied.
>         1 file(s) copied.
>         1 file(s) copied.
>         1 file(s) copied.
> 
> As that test uses the log file to determine the state of servers:
> > my $logfile = slurp_file($node_standby->logfile());
> > ok($logfile =~ qr/multiple recovery targets specified/,
> >         'multiple conflicting settings');
> 
> that doesn't work.
> 
> 
> I was *very* confused by this for a while. But finally the cluebat hit
> when I discovered that stderr works just fine for *other*
> programs, including the programs that evidently log to
> 003_recovery_targets_primary.log.  The problem is that
> pgwin32_is_service() somehow decides that postgres is running as a
> service, despite that not really being the case (I guess docker
> containers are somehow internally started below a service, and that
> causes the problem).
> 
> I hate everything right now. So much.
> 
> I think it's quite nasty that postgres just silently starts to log to
> the event log. Why on earth wasn't the solution instead to hardcode that
> as a server parameter in pg_ctl register?
> 
> Not sure what a good fix is for this.

FWIW, just forcing pgwin32_is_service() to return false seems to get the
cirrus tests past 003_recovery_targets.pl. It's possible the run won't
finish due to other problems (or the too-tight timeouts I set), but at
least this one can be considered diagnosed, I think.
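
To make the effect of that hack concrete, here is a rough, self-contained
sketch of the decision involved. It is not the actual PostgreSQL code and
not a proposed patch: detected_as_service() is a hypothetical stand-in for
the real check, and the "override" argument stands for a hypothetical
setting (call it PG_FORCE_NOT_SERVICE) that does not exist today. It only
illustrates how a wrong "service" answer silently moves logging away from
stderr, which is what leaves the test's log file empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for the real detection logic.  It always answers
 * "service", mimicking the misdetection seen inside the docker container.
 */
static bool
detected_as_service(void)
{
	return true;
}

/*
 * Decide whether to treat the process as a service.  "override" is the
 * value of the hypothetical PG_FORCE_NOT_SERVICE setting, or NULL when it
 * is unset.  A value of "1" forces the answer to false.
 */
static bool
is_service(const char *override)
{
	if (override != NULL && strcmp(override, "1") == 0)
		return false;
	return detected_as_service();
}

/*
 * Pick the log destination: the event log when running as a service,
 * stderr otherwise.
 */
static const char *
log_destination(const char *override)
{
	return is_service(override) ? "eventlog" : "stderr";
}
```

With the override unset, log_destination(NULL) picks the event log, so
nothing reaches the file the TAP test greps. With it set to "1", output
goes back to stderr and the log-based checks have something to match.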

https://cirrus-ci.com/task/5049764917018624?command=windows_worker_buf#L132

Greetings,

Andres Freund

