Excerpts from Tom Lane's message of Mon Sep 26 13:26:37 -0300 2011:
> 
> y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) writes:
> >> Maybe, but I'd still like to see a test case, because I can't reproduce
> >> any such problem by preparing ROLLBACK in an aborted transaction.
> 
> > reading GetTransactionSnapshot, it seems that the problem happens
> > only with IsolationUsesXactSnapshot() true.
> 
> Hmm.  I'm inclined to think that this demonstrates a bug in snapshot
> management, not so much in plancache.  We have plancache doing
> 
>         PushActiveSnapshot(GetTransactionSnapshot());
> 
> and then later
> 
>         PopActiveSnapshot();
> 
> and at this point surely it is not plancache's fault if there is any
> remaining refcount for the snapshot.  There is, though, because
> GetTransactionSnapshot saved a refcount in TopTransactionResourceOwner.
> I think it's snapmgr.c's responsibility to make sure that that's cleaned
> up, and it's not doing so.

Agreed.

> The place where that refcount normally gets dropped is
> AtEarlyCommit_Snapshot, but that isn't going to be called at all in
> aborted-transaction cleanup.  Worse, if we just transposed it over to be
> called in a place in AbortTransaction comparable to where it's called
> during commit, that still wouldn't fix the problem, because when the
> ROLLBACK happens, we've already aborted the transaction.

... ouch.

> I think that AtEarlyCommit_Snapshot is misdesigned, and that far from
> being done "early" in commit/abort, it needs to be done "late", like
> somewhere not very long before the
> ResourceOwnerDelete(TopTransactionResourceOwner) calls.  There is no
> very good reason to think that someone might not ask for a snapshot
> during commit processing.
> 
> Alvaro, do you happen to remember why this got designed as an "early"
> transaction shutdown action, rather than delaying it as long as
> possible?

As far as I remember, the only principle was that it had to run before
ResourceOwner cleanup.  Commit 7b640b0345dc4fbd39ff568700985b432f6afa07
introduced that "early" call; ResOwner support had been introduced 10
days before in 6bbef4e5383c99d93aa974e2c79d328cfbd1c4a9.  I probably
just tried it out and noticed that resowner.c complained if I didn't
drop the refcount prior to its own cleanup.

I don't think I ever considered the scenario of calls in aborted
transactions.

Shall I work on a fix?  I expect you are plenty busy with commitfest
stuff, but please let me know otherwise.

-- 
Álvaro Herrera <alvhe...@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
