On Mon, Jun 15, 2015 at 3:37 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Buildfarm member hamster has failed a pretty significant fraction of
> its recent runs in the BinInstallCheck step:
> http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&br=HEAD
>
> Since other critters aren't equally distressed, it seems likely that
> this is just an out-of-disk-space type of problem. But maybe it's
> trying to tell us that there's a genuine platform-specific bug there.
> In any case, I challenge anyone to figure out what's happening from
> the information available from the buildfarm logs.
>
> I don't know whether this is just that the buildfarm script isn't
> collecting data that it should be. But my experiences with the
> TAP test scripts haven't been very positive. When they fail, it
> takes a lot of digging to find out why. Basically, that entire
> mechanism sucks as far as debuggability is concerned.

Indeed. I think that one step in the right direction would be to replace
all the calls to system and system_or_bail with a wrapper routine that
uses IPC::Run to capture the command output and store those logs in each
test's base path. The same applies to the pg_rewind tests.

> I think there is a good argument for turning this off in the buildfarm
> until there is a better way of identifying and solving problems. It is
> not helping us that hamster is red half the time for undiscoverable
> reasons. That just conditions people to ignore it, and it may well be
> masking real problems that the machine could be finding if it weren't
> failing at this step.

hamster is legendarily slow and has a slow disk, which improves its
chances of catching race conditions. It is also the only slow buildfarm
machine that enables the TAP tests (by comparison, dangomushi has never
failed with the TAP tests), hence I would prefer to think that the
problem is not specific to ArchLinux ARM. In this case the failure seems
to be related to the timing of the test servers stopping and starting,
even though the -w switch is used with pg_ctl, particularly since PGPORT
is set to the same value for all servers...

Still, for the time being I don't mind disabling them, and I have just
done so. I will try to investigate further on the machine itself.
--
Michael