Re: [HACKERS] Repeatable read and serializable transactions see data committed after tx start

Álvaro Hernández Tortosa Tue, 04 Nov 2014 16:12:44 -0800


On 04/11/14 09:07, Craig Ringer wrote:

On 11/04/2014 07:31 AM, Álvaro Hernández Tortosa wrote:

     Thank you for your comment, Tom. However, I think this behavior, as
seen from a user's perspective, is not the expected one.

That may be the case, but I think it's the SQL-standard behaviour, so we
can't really mess with it.


The spec requires SET TRANSACTION ISOLATION, and you can't implement
that if you take a snapshot at BEGIN.

It's true that the standard mandates SET TRANSACTION rather than
setting the isolation level with the BEGIN statement, and in any case
you can raise/lower the isolation level with SET regardless of what the
session or the BEGIN command said. However, is it really a problem to
take a snapshot at BEGIN time, only if the tx is started with BEGIN
... (REPEATABLE READ | SERIALIZABLE)? AFAIK, and I may be missing some
internal details here, the worst that can happen is that you took one
extra, unnecessary snapshot. I don't see that as a huge problem.

The standard (92) says that a transaction is initiated when a
transaction-initiating SQL-statement is executed. To be fair, that
sounds to me more like a "SELECT" than a "BEGIN", but I may be wrong.

     If it is still the intended behavior, I think it should be clearly
documented as such, and a recommendation similar to "issue a 'SELECT 1'
right after BEGIN to freeze the data before running any of your own
queries" should be added. Again, as I said in my email, the documentation
clearly says that a transaction "only sees data committed before the
transaction began". And this is clearly not the real behavior.
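
For readers who want to reproduce the behavior under discussion, here is a minimal two-session sketch. The table name t is hypothetical and exists only for illustration; the statements are meant to be typed alternately into two psql sessions, in the order shown.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE t (id int);

-- Session A
BEGIN ISOLATION LEVEL REPEATABLE READ;   -- no snapshot has been taken yet

-- Session B (autocommit), while session A sits idle after BEGIN
INSERT INTO t VALUES (1);

-- Session A
SELECT count(*) FROM t;   -- first query: the snapshot is taken here,
                          -- so the row committed by B is visible

-- Session B
INSERT INTO t VALUES (2);

-- Session A
SELECT count(*) FROM t;   -- same snapshot as the previous SELECT:
                          -- B's second row is not visible
COMMIT;
```

The row inserted between BEGIN and the first SELECT is visible to session A, while the row inserted after the first SELECT is not, which is the gap between the documentation wording and the observed behavior.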

It's more of a difference in when the transaction "begins".

Arguably, "BEGIN" says "I intend to begin a new transaction with the
next query" rather than "immediately begin executing a new transaction".

This concept could be clearer in the docs.

If this is really how it should behave, I'd +1000 to make it clearer
in the docs, and to explicitly suggest that users who want the state
frozen run a query and discard its result (like a SELECT 1) right after
BEGIN, in case time may pass between BEGIN and the real queries to be
executed.
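
A minimal sketch of that suggestion, assuming the hypothetical table t stands in for the application's real data:

```sql
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT 1;   -- result is discarded; issued only so the snapshot is taken now

-- ... time may pass here; rows committed by other sessions in the
-- meantime are not visible to this transaction ...

SELECT count(*) FROM t;   -- hypothetical table; sees data as of the SELECT 1
COMMIT;
```

The dummy query costs one extra round trip, but it pins the snapshot to a point the application controls rather than to whenever the first real query happens to run.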

     Sure, there are, that was the link I pointed out, but I found no
explicit mention of the fact that I'm raising here.

I'm sure it's documented *somewhere*, in that I remember reading about
this detail in the docs, but I can't find _where_ in the docs.

It doesn't seem to be in:

http://www.postgresql.org/docs/current/static/transaction-iso.html

where I'd expect.


    Yep, there's no mention there.


In any case, we simply cannot take the snapshot at BEGIN time, because
it's permitted to:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

in a DB that has default serializable isolation or has a SET SESSION
CHARACTERISTICS isolation mode of serializable. Note that SET
TRANSACTION is SQL-standard.

As I said, AFAIK it shouldn't matter a lot to take the snapshot at
BEGIN. The worst that can happen is that you end up in read committed
and you need to take more snapshots, one per query.
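
To make the case under discussion concrete, here is a short sketch of a session whose default is serializable and which lowers the level after BEGIN. The table t is hypothetical; the rest is standard PostgreSQL syntax.

```sql
-- Make serializable the default for every transaction in this session.
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN;                                            -- would run as serializable
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;   -- accepted: no query has run yet
SHOW transaction_isolation;                       -- reports the level now in effect
SELECT count(*) FROM t;                           -- hypothetical table; from here on
                                                  -- each statement takes its own snapshot
COMMIT;
```

PostgreSQL accepts the SET TRANSACTION here only because no query has yet run in the transaction; once the first query has executed, the isolation level can no longer be changed.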


AFAIK deferring the snapshot is consistent with other RDBMSes that
use snapshots, too.

I tried Oracle and SQL Server. SQL Server seems to behave like
PostgreSQL, but only because it locks the table when it is accessed in
a serializable transaction, so it definitely waits until the SELECT to
lock it. However, Oracle behaved as I expected: data is frozen at BEGIN
time. I haven't tested others.



The docs of that command allude to, but don't explicitly state, the
behaviour you mention.

http://www.postgresql.org/docs/current/static/sql-set-transaction.html

Should we then improve the docs to state this more clearly? Any
objection to doing this?


    Regards,

    Álvaro


--
Álvaro Hernández Tortosa


-----------
8Kdata



