On Saturday 28 September 2013 09:35:06 Aymeric Augustin wrote:
> Hello,
>
> This is a follow-up to « Is "transaction.atomic" in 1.6 supposed to work
> this way? ». Two options seem viable to me. We must chose one before
> releasing 1.6 RC1.
>
> Other options have been proposed but I couldn't make them work
> satisfactorily. In order to keep this discussion manageable, if you'd like
> to propose a new idea, please make sure it can be implemented and pass
> Django's test suite first, so we can discuss working code.
>
>
> ** The problem **
>
> We're talking about the following scenario:
>
> 1) The program enters an `atomic` block (ie. Django begins a transaction or
> creates a savepoint). 2) At some point an exception is thrown and the
> transaction is marked for rollback. a) This is expected when a
> DatabaseError, IntegrityError, etc. occurs. b) It may also happen in
> scenarios involving ORM-related signals and probably others I haven't
> discovered yet. 3) The program catches the exception.
> 4) The program attempts to execute SQL queries.
> 5) The program exits the `atomic` block with a rollback (ie. Django rolls
> back the transaction or to the savepoint)
>
> Here's the behavior of the database at step 4 :
> - PostgreSQL doesn't allow running queries after an error has occurred in a
> transaction. It raises an exception. - MySQL and Oracle have rolled back
> the faulty statement and allow running queries. This is called
> statement-level atomicity, and it makes it possible to commit a transaction
> even though not all statements have executed. This behavior is…
> interesting… (did someone just shout "foot gun" at the back of the room?)
> but it exists and developers are using it. - SQLite's behavior is
> undefined. It may raise an error like PostgreSQL, roll back the last
> statement like MySQL, or roll back the entire transaction.
>
>
> ** Option 1 — keep the current behavior **
>
> Currently, `atomic` doesn't attempt to normalize the behavior in the
> scenario described above. The actual behavior depends on the underlying
> database. This approach has two drawbacks.
>
> A) If the exception that originally triggered the rollback isn't related to
> the database, it's possible to run queries. However, their effect will be
> lost during the final rollback at step 5. Developers may be surprised by
> this rollback. It doesn't break the fundamental contract of `atomic`
> (atomicity) but it's unexpected.
>
> This is clearly an abstraction leak: `atomic` manages database transactions
> through Python exceptions, but the two can get out of sync. Technically, it
> can be traced to the `savepoint=False` option of `atomic`, but that's a red
> herring, and I don't think we can do much better without either
> complicating the `atomic` API or adding a large performance overhead.
>
> B) If the exception was a database error, but the database allows running
> queries after an error in a transaction (MySQL, Oracle), developers who
> attempt to handle the error properly will be very surprised that a rollback
> still happens.
>
> It looks like both problems can be solved with a simple guideline: catch
> errors "around" an `atomic` block rather than "inside". This ensures that
> the transaction is rolled back to an usable state before you handle the
> exception. One may have to insert an extra `atomic` block for this purpose.
> If we were only concerned with database errors (2a), this would be
> manageable. Unfortunately, there's also (2b); Anssi provided some valid
> examples. Python being a dynamic language, we can't exclude other
> scenarios. So, while the guideline is simple in theory, it isn't realistic
> to apply it to every try/catch block.
>
>
> ** Option 2 — prevent running queries in broken transactions **
>
> `atomic` would normalize the behavior across databases by raising an
> exception at step 4. This approach is implemented in
> https://github.com/django/django/pull/1659 (still needs docs). It also has
> a few drawbacks.
>
> C) It enforces PostgreSQL's behavior across all databases. Django tries to
> be database agnostic, except that when it has to make a choice, it always
> chooses PostgreSQL's behavior, which isn't nice to developers who favor
> other databases.
>
> D) It triggers false positives. I had to change several tests in Django's
> test suite to make them pass:
> https://github.com/aaugustin/django/commit/a08e5dd403c40ae12c77a677e52997a7
> 3bd09dbb Besides, some statements are allowed in broken transactions, even
> under PostgreSQL. For instance ROLLBACK TO SAVEPOINT is allowed! I had to
> resort to a pretty bad hack to allow nesting `constraint_checks_disabled`
> (a MySQL-specific feature) inside `atomic`:
> https://github.com/aaugustin/django/commit/9e50dadf14adac92f1faf0a928df3fac
> 032bddc6
>
> E) Developers will learn `set_rollback(False)` which allows preventing the
> rollback. Misusing it can result in really hard to diagnose errors. It
> don't find it safe for general consumption. (I have nothing against
> `set_rollback(True)` to force a rollback without raising an exception.
>
>
> ** Recommendation **
>
> I'm +0 for option 2 for the following reasons:
>
> - failing loudly is better than failing silently
> - detecting problems at run time is better than requiring code changes
> (forgiveness vs. permission) - solving design problems with documentation
> doesn't fly
>
> Thoughts?
Just a note -- option 2 has the potential (like PostgreSQL's error handling)
to hide the problems and make them very hard to debug. Consider a site using
ATOMIC_REQUESTS (equivalently, TransactionMiddleware) on a statement-atomicity
database, whose code catches exceptions and continues working. While Option 2
will detect the problems, the error report will only point at the query after
the problem, not the place that needs to be fixed. This is a very frustrating
experience: "You have an error somewhere before this point, go find it".
If Option 2 is to be viable, it needs to be modified to store the original
problem and its traceback, and add them to the errors somehow. On Python 3,
exception chaining can be used; on Python 2, some other method needs to be
found.
pseudo test:
def test_real_problem_reported(self):
with transaction.atomic():
try:
# Break the transaction
cursor.execute('KAWABANGA!')
except DatabaseError:
pass
try:
# Perform another query. This should fail...
list(User.objects.all())
self.fail("Transaction management compromised")
except TransactionManagementError:
# ... but the original offending statement should be
mentioned
text =
concat_all_relevant_exc_attrs_including_tracebacks()
self.assertIn("cursor.execute('KAWABANGA!')", text)
Shai.
--
You received this message because you are subscribed to the Google Groups
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.