On 2017-08-30 02:11:10 +0200, Tomas Vondra wrote: > > > On 08/30/2017 01:34 AM, Andres Freund wrote: > > Hi, > > > > On 2017-08-30 01:27:34 +0200, Tomas Vondra wrote: > >> I've been investigating some failures in test_decoding regression tests, > >> and it seems to me the error-handling in ReorderBufferCommit() is > >> somewhat broken, leading to segfault crashes. > >> > >> The problematic part is this: > >> > >> PG_CATCH() > >> { > >> /* > >> * Force cache invalidation to happen outside of a valid transaction > >> * to prevent catalog access as we just caught an error. > >> */ > >> AbortCurrentTransaction(); > >> > >> /* make sure there's no cache pollution */ > >> ReorderBufferExecuteInvalidations(rb, txn); > >> > >> ... > >> > >> } > >> > >> Triggering it trivial - just add elog(ERROR,...) at the beginning of the > >> PG_TRY() block. > > > > That's not really a valid thing to do, you should put it after the > > BeginInternalSubTransaction()/StartTransactionCommand(). It's basically > > assumed that those won't fail - arguably they should be outside of a > > PG_TRY then, but that's a different matter. If you start decoding > > outside of SQL failing before those will lead to rolling back the parent > > tx... > > > > > >> I suppose this is not quite intentional, but rather an example that > >> error-handling code is an order of magnitude more complicated to write > >> and test. I've only noticed as I'm investigating some regression > >> failures on Postgres-XL 10, which does not support subtransactions and > >> so the BeginInternalSubTransaction() call in the try branch always > >> fails, triggering the issue. > > > > So, IIUC, there's no live problem in postgres core, besides some ugly & > > undocumented assumptions? > > > > I'm not really following your reasoning. You may very well be right that > the BeginInternalSubTransaction() example is not quite valid on postgres > core, but I don't see how that implies there can be no other errors in > the PG_TRY block. It was merely an explanation how I noticed this issue. > > To make it absolutely clear, I claim that the PG_CATCH() block is > demonstrably broken as it calls AbortCurrentTransaction() and then > accesses already freed memory.
Yea, but that's not what happens normally. The txn memory isn't allocated in the context created by the started [sub]transaction. I think you're just seeing what you're seeing because an error thrown before the BeginInternalSubTransaction() succeeds, will roll back the *containing* transaction (the one that did the SQL SELECT * FROM pg_*logical*() call), and free all the entire reorderbuffer stuff. > The switch in the PG_TRY() blocks includes multiple elog(ERROR) calls in > the switch, and AFAICS hitting any of them will have exactly the same > effect as failure in BeginInternalSubTransaction(). No, absolutely not. Once the subtransaction starts, the AbortCurrentTransaction() will abort that subtransaction, rather than the containing one. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers