On Thu, Feb 16, 2012 at 5:02 AM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
> On 15.02.2012 18:52, Fujii Masao wrote:
>>
>> On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
>> <heikki.linnakan...@enterprisedb.com> wrote:
>>>
>>> Are you still seeing this failure with the latest patch I posted
>>> (http://archives.postgresql.org/message-id/4f38f5e5.8050...@enterprisedb.com)?
>>
>> Yes. Just to be safe, I again applied the latest patch to HEAD,
>> compiled that and tried the same test. Then unfortunately I got
>> the same failure again.
>
> Ok.
>
>> I ran the configure with '--enable-debug' '--enable-cassert'
>> 'CPPFLAGS=-DWAL_DEBUG', and make with -j 2 option.
>>
>> When I ran the test with wal_debug = on, I got the following assertion
>> failure.
>>
>> LOG: INSERT @ 0/17B3F90: prev 0/17B3F10; xid 998; len 31: Heap -
>> insert: rel 1663/12277/16384; tid 0/197
>> STATEMENT: create table t (i int); insert into t
>> values(generate_series(1,10000)); delete from t
>> LOG: INSERT @ 0/17B3FD0: prev 0/17B3F50; xid 998; len 31: Heap -
>> insert: rel 1663/12277/16384; tid 0/198
>> STATEMENT: create table t (i int); insert into t
>> values(generate_series(1,10000)); delete from t
>> TRAP: FailedAssertion("!(((bool) (((void*)(&(target->tid)) != ((void
>> *)0))&& ((&(target->tid))->ip_posid != 0))))", File: "heapam.c",
>> Line: 5578)
>> LOG: xlog bg flush request 0/17B4000; write 0/17A6000; flush 0/179D5C0
>> LOG: xlog bg flush request 0/17B4000; write 0/17B0000; flush 0/17B0000
>> LOG: server process (PID 16806) was terminated by signal 6: Abort trap
>>
>> This might be related to the original problem which Jeff and I saw.
>
> That's strange. I made a fresh checkout, too, and applied the patch, but
> still can't reproduce. I used the attached script to test it.
>
> It's surprising that the crash happens when the records are inserted, not at
> recovery. I don't see anything obviously wrong there, so could you please
> take a look around in gdb and see if you can get a clue what's going on?
> What's the stack trace?
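
For anyone who wants to check a longer wal_debug log, the "INSERT @ ...: prev ..." lines quoted above can be compared mechanically. The following is only a rough sketch, not part of the patch or of PostgreSQL: the regular expression is an assumption based on the two log lines shown above, the function names are hypothetical, and it assumes the records appear in the log in insertion order. It reports every record whose prev value differs from the location of the record logged just before it.

```python
import re

# Assumed shape of a wal_debug insert line, taken from the log quoted above:
#   LOG: INSERT @ 0/17B3F90: prev 0/17B3F10; xid 998; ...
RECORD_RE = re.compile(
    r"INSERT @ ([0-9A-Fa-f]+/[0-9A-Fa-f]+): prev ([0-9A-Fa-f]+/[0-9A-Fa-f]+)"
)


def parse_records(log_text):
    """Return a list of (location, prev) pairs in the order they were logged."""
    return RECORD_RE.findall(log_text)


def find_prev_mismatches(records):
    """Return (location, prev, expected_prev) for each record whose prev
    does not equal the location of the record logged immediately before it."""
    mismatches = []
    for (earlier_loc, _), (loc, prev) in zip(records, records[1:]):
        if prev != earlier_loc:
            mismatches.append((loc, prev, earlier_loc))
    return mismatches


# The two INSERT lines from the log above, with the wrapped lines rejoined.
SAMPLE = (
    "LOG: INSERT @ 0/17B3F90: prev 0/17B3F10; xid 998; len 31: "
    "Heap - insert: rel 1663/12277/16384; tid 0/197\n"
    "LOG: INSERT @ 0/17B3FD0: prev 0/17B3F50; xid 998; len 31: "
    "Heap - insert: rel 1663/12277/16384; tid 0/198\n"
)

if __name__ == "__main__":
    for loc, prev, expected in find_prev_mismatches(parse_records(SAMPLE)):
        print(f"record at {loc}: prev is {prev}, expected {expected}")
```

A log wrapped by a mail client would need its continuation lines rejoined first, as was done by hand for SAMPLE here.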

According to the above log messages, one strange thing is that the
location of the first WAL record (i.e., 0/17B3F90) is not the same as
the prev location recorded in the following WAL record (i.e., 0/17B3F50).
Is this intentional?

BTW, when I ran the test on Ubuntu, I could not reproduce the problem.
I could reproduce it only on MacOS.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center