Thomas Munro wrote:
> On Wed, Nov 22, 2017 at 12:27 AM, atorikoshi
> <torikoshi_atsushi...@lab.ntt.co.jp> wrote:
> > [set_final_lsn_2.patch]
>
> Hi Torikoshi-san,
>
> FYI "make check" in contrib/test_decoding fails a couple of isolation
> tests, one with an assertion failure for my automatic patch tester[1].
> Same result on my laptop:
>
> test ondisk_startup ... FAILED (test process exited with exit code 1)
> test concurrent_ddl_dml ... FAILED (test process exited with exit code 1)
>
> TRAP: FailedAssertion("!(!dlist_is_empty(head))", File:
> "../../../../src/include/lib/ilist.h", Line: 458)

I also saw crashes a couple of times while testing this patch, but they
were completely different from yours.  I have not been able to reproduce
the crash you show, though I've run this in 9.4 and master many times.

For example, I got a backtrace that looks like this in 9.6:

    #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1  0x00007f19ccb913fa in __GI_abort () at abort.c:89
    #2  0x000055e7511f451b in errfinish (dummy=<optimized out>)
        at /pgsql/source/REL9_6_STABLE/src/backend/utils/error/elog.c:557
    #3  0x000055e750ed732b in XLogFileInit (logsegno=1, use_existent=use_existent@entry=0x7ffdbc34ab6f "\001\002", use_lock=use_lock@entry=1 '\001')
        at /pgsql/source/REL9_6_STABLE/src/backend/access/transam/xlog.c:3023
    #4  0x000055e750edb227 in XLogWrite (WriteRqst=..., flexible=flexible@entry=0 '\000')
        at /pgsql/source/REL9_6_STABLE/src/backend/access/transam/xlog.c:2258
    #5  0x000055e750ee162d in XLogBackgroundFlush ()
        at /pgsql/source/REL9_6_STABLE/src/backend/access/transam/xlog.c:2894

Then in 9.4 I saw this one:

    creating information schema ... ok
    loading PL/pgSQL server-side language ... ok
    vacuuming database template1 ... ok
    copying template1 to template0 ... FATAL:  could not open directory "pg_logical/snapshots": No such file or directory
    STATEMENT:  CREATE DATABASE template0;
    WARNING:  could not remove file or directory "base/12148": No such file or directory
    WARNING:  some useless files may be left behind in old database directory "base/12148"
    FATAL:  could not access status of transaction 0
    DETAIL:  Could not open file "pg_clog/0000": No such file or directory.
    child process exited with exit code 1

What this indicates to me is that perhaps the test harness is doing
stupid things such as running two servers concurrently in the same
datadir, so that they overwrite one another.  If I take out the "-j2"
from the make invocation, this no longer reproduces.

Therefore, I'm going to push this patch shortly, because clearly this
problem is not its fault.

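One rough way to test the shared-datadir hypothesis is to look for a live
postmaster.pid before a test run initializes or starts a server in a given
directory. The following is only a minimal sketch, not part of the patch or
of the test harness; the helper name running_postmaster_pid is hypothetical,
and it relies only on the fact that a running server keeps a postmaster.pid
file in its data directory whose first line is the postmaster's PID.

```python
import os
from typing import Optional


def running_postmaster_pid(datadir: str) -> Optional[int]:
    """Return the PID recorded in <datadir>/postmaster.pid, or None.

    Hypothetical helper for illustration. A non-None result only means a
    lock file is present; it may be stale if the server crashed.
    """
    pidfile = os.path.join(datadir, "postmaster.pid")
    try:
        with open(pidfile) as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        # No lock file: no server claims this data directory.
        return None
    try:
        return int(first_line)
    except ValueError:
        # File exists but its first line is not a PID (e.g. half-written).
        return None
```

If a second test job finds a PID here for the directory it is about to use,
that would support the idea that parallel make jobs are sharing one datadir.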
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services