On Tue, 2009-05-12 at 10:47 -0700, Phil Mocek wrote:
> On Tue, May 12, 2009 at 02:25:41AM -0700, Daniel Roseman wrote:
> > No, [get_or_create is] not atomic.

By the way, that depends on what granularity of atomicity you're after,
which hasn't been defined in this thread. The guarantee is that after
get_or_create() returns, the object will exist in the database and you
will be returned that object. It always does that (or an error is
raised). Guaranteed. That makes it atomic on that level. Will it always
create a new object if the object doesn't exist at the precise moment
the function is called? No. Not atomic on that granularity, but that's
rarely a problem in practice (in fact, despite claims to the contrary,
the "optimistic locking" approach taken by get_or_create() *helps*
scaling).

> > You can see the code in
> > django.db.models.query - it tries a db lookup, and then creates
> > a new object if one is not found.
> 
> It seems that this creates a potentially-troublesome race
> condition.  Wouldn't the object creation fail if the object is
> created by some other process between the time of the query and
> the time of the attempt at creation? 

Please read the code. This situation is accounted for.
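
For anyone who would rather not dig through django.db.models.query, the
pattern is roughly the following. This is a minimal sketch, not Django's
actual code: it uses the stdlib sqlite3 module, and the "tag" table, the
get_or_create() helper and the _before_insert hook are all made up for
illustration. It also assumes a UNIQUE constraint on the lookup column,
since that is what lets the database arbitrate the race.

```python
import sqlite3


def get_or_create(conn, name, _before_insert=None):
    """Return (row_id, created) for the tag called `name`."""
    select = "SELECT id FROM tag WHERE name = ?"
    row = conn.execute(select, (name,)).fetchone()
    if row is not None:
        return row[0], False

    if _before_insert is not None:
        # Hypothetical hook: lets the demo below play the part of a
        # competing writer that sneaks in between lookup and insert.
        _before_insert()

    try:
        with conn:  # commits on success, rolls back on an exception
            cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Somebody else created the row first. The UNIQUE constraint
        # rejected our insert, so fetch the row that now exists.
        row = conn.execute(select, (name,)).fetchone()
        return row[0], False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)


def rival():
    # Simulates another process winning the race (same connection is
    # reused here only to keep the demo deterministic).
    with conn:
        conn.execute("INSERT INTO tag (name) VALUES ('django')")


first = get_or_create(conn, "django", _before_insert=rival)  # loses the race
second = get_or_create(conn, "django")   # plain lookup, row already there
third = get_or_create(conn, "python")    # no competition, row is created
print(first, second, third)
```

The caller that loses the race still gets the object back and it exists
in the database; it just wasn't the one that created it. No lock is
taken up front, and the only point of contention is the insert itself.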

>  Shouldn't any operation that
> relies upon the results of a database query require that a read
> lock (i.e., a shared or non-exclusive lock) be placed before that
> initial query and held until the operation has successfully
> completed?

No.

get_or_create() is used in the (very common) situation where the thing
you are creating is going to have essentially the same contents no
matter when you create it (having, e.g., a different creation time
doesn't factually alter that situation). So it doesn't matter when you
create it, providing it actually exists.

> 
> This is rather disturbing.  Are there other instances of Django's
> ORM doing things that are not safe for concurrent database access?
> It seems that this would be a serious hindrance to scalability.

It's not. In fact, it helps scalability because multiple threads of
operation can safely call get_or_create() and the guarantee of the
function is fulfilled: it will return the object and it will exist in
the database at that point. This is optimistic locking (lock as late and
as low-level as possible -- in this case at the db server level), vs the
pessimistic locking case of preemptively locking things up for no great
gain.

Regards,
Malcolm

