SELECT FOR UPDATE has been discussed in this group before, but I don't
think it's been resolved whether it should be implemented in Django. I
think it should.

The idea is to avoid update collisions, i.e., the situation in which
one process is attempting to select a row and update it, and while
doing so, another process selects and updates the row in a manner
inconsistent with the what the first process is trying to do.

For example, I have an application in which a user submits a "job"
through the web interface (web app). This requires uploading a file
and queueing the job into a database row with status='ready'. A
separate "worker" application is looking for jobs that are 'ready', so
it queries the DB with SELECT and gets back the same row. The worker
app then updates the status to 'running' and goes off to handle the
job, using the uploaded file in the process.

But then the user decides to cancel the job, which is allowed so long
as the job status is still 'ready' but not yet 'running'. The user
clicks the 'cancel' button which in turn results in a SELECT for rows
with status='ready'. It so happens that the user makes the query after
the worker app SELECT is executed but before it UPDATEs the status to
'running'. Thus, both the web app and the worker app have the same
record and want to do inconsistent things. The web app wants to cancel
the job, so it deletes the file out from under the worker app and
updates status to 'deleted'. This causes grief for the worker app, so
it Exceptions out. No joy.

A better way would be to perform both SELECTS with the FOR UPDATE
clause inside of transactions. Suppose the worker app does this:

BEGIN;
SELECT * FROM <table> WHERE status='ready' FOR UPDATE;
UPDATE <table> SET status='running' WHERE status='ready';
COMMIT;

and concurrently, the web app does this:

BEGIN;
SELECT * FROM <table> WHERE status='ready' FOR UPDATE;
UPDATE <table> SET status='deleted' WHERE status='ready';
COMMIT;

The second SELECT to execute will have to wait until the first
transaction is over, at least with Postgresql, and then it will not
find a row because the status is no longer 'ready'. Thus, no collision
and all is well.

Update collision issues must be fairly common. I've seen discussions
in other forums. I've also done some looking around into the lower
levels of django.db.models and think "SELECT FOR UPDATE" could be
implemented with (1) a filter option that appends "FOR UPDATE" at the
end of the SELECT string, and (2) a save() with force_update=True.
But, I'm not all that familiar with how all those low level parts
interact. Any thoughts on how this could be implemented would be
appreciated.

Thanks,

--Tim

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to