On Fri, May 07, 2021 at 01:18:19PM -0400, Tom Lane wrote:
> Realizing that 9989d37d prevents the assertion failure, I went
> to see if thorntail had shown EIO failures without assertions.
> Looking back 180 days, I found these:
> 
>   sysname  |    branch     |      snapshot       |       stage        | l
> -----------+---------------+---------------------+--------------------+---
>  thorntail | HEAD          | 2021-03-19 21:28:15 | recoveryCheck      | 2021-03-20 00:48:48.117 MSK [4089174:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
>  thorntail | HEAD          | 2021-04-06 16:08:10 | recoveryCheck      | 2021-04-06 19:30:54.103 MSK [3355008:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
>  thorntail | REL9_6_STABLE | 2021-04-12 02:38:04 | pg_basebackupCheck | pg_basebackup: could not fsync file "000000010000000000000013": Input/output error
> 
> So indeed the kernel-or-hardware problem is affecting other branches.

Having a flaky buildfarm member is bad news.  I'll LD_PRELOAD the attached to
prevent fsync from reaching the kernel.  Hopefully, that will make the
hardware-or-kernel trouble unreachable.  (Changing 008_fsm_truncation.pl
wouldn't avoid this, because fsync=off doesn't affect syncs outside the
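backend.)

/*
 * never_sync.c: make fsync() and fdatasync() no-ops when LD_PRELOADed.
 *
 * gcc -fPIC -shared never_sync.c -o never_sync.so
 */

/* Report success without asking the kernel to flush anything. */
int
fsync(int fd)
{
    (void) fd;
    return 0;
}

/* Route fdatasync() through the no-op fsync() above. */
int
fdatasync(int fd)
{
    return fsync(fd);
}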
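
One way to check that the preload is actually taking effect is sketched below. It is only an illustration, not part of the attachment: the function name fsync_reaches_kernel is made up here, and the check assumes a platform where the real fsync() rejects an invalid descriptor with EBADF. A shim that always returns 0 will report success for that call instead.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fsync() appears to reach the real
 * implementation, 0 if something (such as a preloaded shim) answered
 * for it.  The real call on descriptor -1 is assumed to fail with EBADF.
 */
int
fsync_reaches_kernel(void)
{
    errno = 0;
    return (fsync(-1) == -1 && errno == EBADF) ? 1 : 0;
}
```

Called from a small dynamically linked test program, this should return 1 in a normal run and 0 when the program is started with LD_PRELOAD pointing at never_sync.so. A statically linked binary would not be affected by the preload at all.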
