Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-25 Thread Tom Lane
I wrote: > My present theory about the cause is that the backend lost its CPU > quantum immediately after doing the send() that responded to the last > INSERT, and was interrupted by SIGQUIT before it could continue. > If you look at pq_flush() you'll see that it does not reset > PqSendPointer unti

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-25 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes: > On 9/24/2004 10:24 PM, Tom Lane wrote: >> If you can actually prove that a *different session* was able to see as >> committed data that was not safely committed, then we have another >> problem to look for. I am hoping we have only one nasty bug today ;-)

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes: > I guess nobody ever relied that heavily on data to be persistent at the > microsecond the NOTIFY arrives ... Sure they have. In theory you cannot see a NOTIFY before the sending transaction commits, because the sender is holding a lock on pg_notify and you

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
I said: > Oh, fooey. > exec_simple_query calls EndCommand before it calls finish_xact_command, Fooey again --- that theory is all wrong. Back to the drawing board. I have managed to reproduce the bug on CVS tip, btw. But it's very painful to make it happen. Have you got any tips for making it

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Jan Wieck
On 9/24/2004 10:24 PM, Tom Lane wrote: Jan Wieck <[EMAIL PROTECTED]> writes: Now the scary thing is that not only did this crash roll back a committed transaction. Another session had enough time in between to receive a NOTIFY and select the data that got rolled back later. Different session, or s

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes: > Now the scary thing is that not only did this crash roll back a committed > transaction. Another session had enough time in between to receive a > NOTIFY and select the data that got rolled back later. Different session, or same session? NOTIFY is one of t

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Jan Wieck
On 9/24/2004 6:37 PM, Tom Lane wrote: Can you still reproduce the problem if you take out the ereport call in quickdie()? Will check ... BTW, what led you to develop this test setup ... had you already seen something that made you suspect a data loss problem? Good guess ... what actually happened

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
>> This means either that the server sent a commit message before it had >> xlog'd the commit, or that Pgtcl mistakenly reported the command as >> successful when it was not. Any thoughts? Oh, fooey. exec_simple_query calls EndCommand before it calls finish_xact_command, and of course the latter

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes: > Is it somehow possible that the commit record was still sitting in the > shared WAL buffers (unwritten) when the response got sent to the client? I don't think so. What I see in the two cases I have now are: (1) The backend that was doing the "lost" tran

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Jan Wieck
On 9/24/2004 5:12 PM, Tom Lane wrote: This means either that the server sent a commit message before it had xlog'd the commit, or that Pgtcl mistakenly reported the command as successful when it was not. Any thoughts? Is it somehow possible that the commit record was still sitting in the shared W

Re: [HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes: > But occasionally there will appear a gap in the data. With the given > logic only to increment the counter on a dupkey or after a positive > COMMIT response by the backend, IMHO there can only be one if we lose > transactions after commit on a crash restar

[HACKERS] 7.4.5 losing committed transactions

2004-09-24 Thread Jan Wieck
The attached archive contains a script that I used to reproduce the error multiple times.

Setup:
* create database crashtest
* start 6 instances of testload.tcl as ./testload.tcl tN dbname=crashtest, where N = 1..6
* frequently kill a backend to cause a postmaster restart.

The te
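
For readers following the thread without the attachment, the check the test relies on can be sketched roughly as below. This is not the code from testload.tcl. It is a minimal Python illustration that assumes each client inserts a monotonically increasing counter and advances it only on a duplicate-key error or after the backend acknowledges the COMMIT, as described in the thread. The function names and the list-based inputs are hypothetical.

```python
def find_gaps(stored):
    """Return inclusive (start, end) ranges missing from the stored counters.

    `stored` is assumed to be the counter values read back from the table
    after the postmaster has restarted (hypothetical input shape).
    """
    ids = sorted(set(stored))
    gaps = []
    for prev, cur in zip(ids, ids[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


def find_lost_commits(acknowledged, stored):
    """Return counters whose COMMIT was acknowledged but whose row is absent.

    `acknowledged` is assumed to be the client-side log of counters for which
    the backend reported a successful COMMIT (hypothetical input shape).
    """
    stored_set = set(stored)
    return [i for i in acknowledged if i not in stored_set]
```

If the client only advances after an acknowledged COMMIT, any value returned by find_lost_commits would correspond to a transaction that was reported as committed and then disappeared across the crash restart, which is the symptom this thread is chasing.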