You're using the "old" scheduler. The new one operates on totally different logic and was coded with scalability in mind. You can find the relevant code in trunk: gluon/scheduler.py is the file you are looking for.
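
To make the scalability point concrete: the UPDATE in the quoted PostgreSQL log has no limit, so every worker tries to mark all eligible rows as ASSIGNED at once. That matches what you saw (all tasks ASSIGNED with only 7 workers), and with 35k rows it leaves plenty of room for two workers to lock the same rows in a different order, which is the usual way this kind of deadlock arises. Below is a minimal sketch of the difference between that pattern and claiming a bounded batch. It uses sqlite3 and a cut-down table only so that it runs anywhere; make_db, claim_all and claim_batch are hypothetical names made up for illustration, and this is not the code in gluon/scheduler.py.

```python
import sqlite3


def make_db(n_tasks):
    """Create an in-memory, cut-down scheduler_task table with n_tasks queued rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE scheduler_task ("
        "id INTEGER PRIMARY KEY, status TEXT, assigned_worker_name TEXT)"
    )
    db.executemany(
        "INSERT INTO scheduler_task (status, assigned_worker_name) VALUES (?, ?)",
        [("QUEUED", "")] * n_tasks,
    )
    db.commit()
    return db


def claim_all(db, worker):
    """Unbounded claim, same shape as the logged UPDATE: takes every eligible row."""
    cur = db.execute(
        "UPDATE scheduler_task SET status='ASSIGNED', assigned_worker_name=? "
        "WHERE status='QUEUED' AND assigned_worker_name IN ('', ?)",
        (worker, worker),
    )
    db.commit()
    return cur.rowcount


def claim_batch(db, worker, limit=1):
    """Bounded claim (an assumption, for illustration): takes at most `limit` rows."""
    cur = db.execute(
        "UPDATE scheduler_task SET status='ASSIGNED', assigned_worker_name=? "
        "WHERE id IN (SELECT id FROM scheduler_task "
        "WHERE status='QUEUED' ORDER BY id LIMIT ?)",
        (worker, limit),
    )
    db.commit()
    return cur.rowcount


# One worker running the unbounded claim grabs the whole queue.
print(claim_all(make_db(100), "worker#0"))

# Seven workers running the bounded claim touch one row each.
db = make_db(100)
print([claim_batch(db, "worker#%d" % i) for i in range(7)])
```

The first print shows a single worker assigning all 100 tasks to itself; the second shows each of the 7 workers touching exactly one row, so the set of rows any two workers can fight over stays small.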

PS: there is an app to show off the new scheduler's behaviour (and shortcuts) at http://github.com/niphlod/w2p_scheduler_tests

PS2: you can start 7 workers by launching web2py.py -K milo,milo,milo,milo,milo,milo,milo

PS3: your code to enqueue the task can be simplified to

...
db.scheduler_task.insert(
    task_name='schedule_movie',
    function_name='import_or_update_movie',
    args=json.dumps(args),
    vars=json.dumps(kwargs),
    timeout=3600
)
...

On Wednesday, August 1, 2012 8:21:43 PM UTC+2, Vincenzo Ampolo wrote:
>
> Hi,
>
> I'm writing an application that uses the scheduler heavily. I create new
> tasks using this simple function (there is a commit() because it's in an
> external script):
>
> def schedule_movie(*args, **kwargs):
>     db_scheduler.scheduler_task.insert(
>         status='QUEUED',
>         application_name='milo',
>         task_name='schedule_movie',
>         function_name='import_or_update_movie',
>         args=json.dumps(args),
>         vars=json.dumps(kwargs),
>         enabled=True,
>         # start_time = request.now,
>         # stop_time = request.now+datetime.timedelta(days=10),
>         repeats = 1,
>         timeout = 3600,
>     )
>     db_scheduler.commit()
>
> I'm using 7 workers to run tasks (run as ./web2py.py -K milo). If I have
> fewer than 100 tasks everything seems fine (I tested up to 50), but as
> soon as I schedule 35k tasks the system becomes unstable (high load) due to
> database operations, and a lot of deadlocks begin to happen between
> workers.
>
> In particular it seems that the query that sets the status of a task to
> ASSIGNED is not correct. In fact, doing this query:
>
> db((db_scheduler.scheduler_task.status!='QUEUED')&(db_scheduler.scheduler_task.status!='COMPLETED')).select()
>
> sometimes returns all the tasks as ASSIGNED (even if I've only 7
> workers!).
>
> This produces a deadlock like this:
>
> Traceback (most recent call last):
>   File "/home/goshawk/web2py/gluon/shell.py", line 214, in run
>     exec(python_code, _env)
>   File "<string>", line 1, in <module>
>   File "/home/goshawk/web2py/gluon/scheduler.py", line 365, in loop
>     MetaScheduler.loop(self)
>   File "/home/goshawk/web2py/gluon/scheduler.py", line 257, in loop
>     task = self.pop_task()
>   File "/home/goshawk/web2py/gluon/scheduler.py", line 395, in pop_task
>     grabbed.update(assigned_worker_name='',status=QUEUED)
>   File "/home/goshawk/web2py/gluon/dal.py", line 7591, in update
>     return self.db._adapter.update(tablename,self.query,fields)
>   File "/home/goshawk/web2py/gluon/dal.py", line 1116, in update
>     self.execute(sql)
>   File "/home/goshawk/web2py/gluon/dal.py", line 1392, in execute
>     return self.log_execute(*a, **b)
>   File "/home/goshawk/web2py/gluon/dal.py", line 1386, in log_execute
>     ret = self.cursor.execute(*a, **b)
> TransactionRollbackError: deadlock detected
> DETAIL: Process 7147 waits for ShareLock on transaction 68012578; blocked by process 5038.
> Process 5038 waits for ShareLock on transaction 68012565; blocked by process 7147.
> HINT: See server log for query details.
>
> Looking at the log in the database I found that the deadlock is indeed
> in the query that updates status to ASSIGNED.
>
> 2012-08-01 20:07:03 CEST ERROR: deadlock detected
> 2012-08-01 20:07:03 CEST DETAIL: Process 5724 waits for ShareLock on transaction 68012520; blocked by process 7147.
> Process 7147 waits for ShareLock on transaction 68012547; blocked by process 5724.
>
> Process 5724: UPDATE scheduler_task SET
> status='ASSIGNED',assigned_worker_name='whisperer#6a223526-9337-4447-bad7-3aac5ab3e261'
> WHERE ((((((((scheduler_task.status IN ('QUEUED','RUNNING'))
> AND ((scheduler_task.times_run < scheduler_task.repeats) OR (scheduler_task.repeats = 0)))
> AND (scheduler_task.start_time <= '2012-08-01 20:06:37'))
> AND (scheduler_task.stop_time > '2012-08-01 20:06:37'))
> AND (scheduler_task.next_run_time <= '2012-08-01 20:06:37'))
> AND (scheduler_task.enabled = 'T'))
> AND (scheduler_task.group_name IN ('main')))
> AND (scheduler_task.assigned_worker_name IN (NULL,'','whisperer#6a223526-9337-4447-bad7-3aac5ab3e261')));
>
> Process 7147: UPDATE scheduler_task SET
> status='ASSIGNED',assigned_worker_name='whisperer#9f4bae50-24b1-4613-9724-ecfcbc083100'
> WHERE ((((((((scheduler_task.status IN ('QUEUED','RUNNING'))
> AND ((scheduler_task.times_run < scheduler_task.repeats) OR (scheduler_task.repeats = 0)))
> AND (scheduler_task.start_time <= '2012-08-01 20:06:03'))
> AND (scheduler_task.stop_time > '2012-08-01 20:06:03'))
> AND (scheduler_task.next_run_time <= '2012-08-01 20:06:03'))
> AND (scheduler_task.enabled = 'T'))
> AND (scheduler_task.group_name IN ('main')))
> AND (scheduler_task.assigned_worker_name IN (NULL,'','whisperer#9f4bae50-24b1-4613-9724-ecfcbc083100')));
>
> Can it be a bug of the scheduler? Where can I find the code about it in
> the web2py source tree?
>
> Thank you
>
> --
> Vincenzo Ampolo
> http://vincenzo-ampolo.net
> http://goshawknest.wordpress.com
> --