Hello
The problem was that get_or_create() was called within a transaction, and at
the end of the transaction the INSERT caused an IntegrityError because, in the
meantime, another transaction had finished that inserted the same row but with
a different id. I wrongly concluded that repeating get_or_create() would
help. By coincidence, after inserting a second get_or_create() call I never
saw the problem again.
That said, I do think that doing it in one single atomic query has advantages
(for one, the above is still not atomic, though the failure case in the end is
much less likely). But a complicated postgres-specific query is perhaps not
worth it. If you can think of a generic method that works with READ COMMITTED,
that would be interesting. Some conditions I would personally impose are:
1. Should not increment the primary key sequence in the `get` case. This is not
critical for correctness, I just like to avoid these holes :)
What I proposed does not increment the sequence.
2. The `get` case should remain fast. In the current code, `get_or_create` is
the same as `get` when the row already exists. In many (most?) scenarios this
is the 99% case.
I think my proposal is also quite fast, assuming that WHERE NOT EXISTS (SELECT
* FROM -already fetched table-) is fast, and that UNION ALL of two tables,
having a total of one row, is also fast.
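For illustration, here is a minimal sketch of the conditional-insert idea. It
uses SQLite from Python's standard library, not PostgreSQL, and the table
`tag`, the column `name` and the helper `get_or_create` are hypothetical names
chosen for the example. SQLite has no data-modifying CTE, so the INSERT and the
final SELECT are two statements here rather than one UNION ALL query, and the
sketch says nothing about behaviour under concurrent READ COMMITTED
transactions. It only shows that the `get` case inserts nothing and therefore
does not consume a primary key value.

```python
import sqlite3


def get_or_create(conn, name):
    """Return (id, created) for a row in the hypothetical `tag` table."""
    cur = conn.execute(
        # The SELECT yields a row to insert only when no matching row exists,
        # so the "get" case never allocates a new primary key value.
        "INSERT INTO tag (name) "
        "SELECT ? WHERE NOT EXISTS (SELECT 1 FROM tag WHERE name = ?)",
        (name, name),
    )
    created = cur.rowcount == 1
    row = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
    return row[0], created


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE)"
)
first = get_or_create(conn, "django")     # row is created
second = get_or_create(conn, "django")    # row already exists
third = get_or_create(conn, "postgres")   # next row gets the next id, no hole
```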
Regards
Dilian
On 10/12/17 19:04, Ran Benita wrote:
Have you drilled down to `self._create_object_from_params(params)`? It does
handle this case, as follows:
    try:
        with transaction.atomic(using=self.db):
            params = {k: v() if callable(v) else v for k, v in params.items()}
            obj = self.create(**params)
        return obj, True
    except IntegrityError:
        exc_info = sys.exc_info()
        try:
            return self.get(**lookup), False
        except self.model.DoesNotExist:
            pass
        raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
That said, I do think that doing it in one single atomic query has advantages
(for one, the above is still not atomic, though the failure case in the end is
much less likely). But a complicated postgres-specific query is perhaps not
worth it. If you can think of a generic method that works with READ COMMITTED,
that would be interesting. Some conditions I would personally impose are:
1. Should not increment the primary key sequence in the `get` case. This is not
critical for correctness, I just like to avoid these holes :)
2. The `get` case should remain fast. In the current code, `get_or_create` is
the same as `get` when the row already exists. In many (most?) scenarios this
is the 99% case.
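
The get-first, create-with-fallback pattern discussed above can be sketched
outside Django as follows. This is an illustration only, not Django's
implementation: it uses SQLite from Python's standard library, and the table
`tag` and the helpers `create_or_get` and `get_or_create` are hypothetical
names. The existing-row path costs a single SELECT, and the IntegrityError
branch covers the case where the row appears between the SELECT and the
INSERT.

```python
import sqlite3

SELECT_ID = "SELECT id FROM tag WHERE name = ?"


def create_or_get(conn, name):
    """Try the INSERT first; fall back to a SELECT on a uniqueness conflict."""
    try:
        with conn:  # commits on success, rolls back if the INSERT fails
            cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Someone else inserted the same row first; fetch theirs instead.
        row = conn.execute(SELECT_ID, (name,)).fetchone()
        if row is None:
            raise
        return row[0], False


def get_or_create(conn, name):
    """Fast path: when the row exists, this is just a plain SELECT."""
    row = conn.execute(SELECT_ID, (name,)).fetchone()
    if row is not None:
        return row[0], False
    return create_or_get(conn, name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
created = get_or_create(conn, "django")    # row is inserted
fetched = get_or_create(conn, "django")    # fast path, no INSERT attempted
conflict = create_or_get(conn, "django")   # exercises the IntegrityError branch
```

Calling `create_or_get` directly on an existing row stands in for the race: the
INSERT fails on the unique constraint and the existing row is returned instead.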