The problem here started as "I can't ensure that my app inserts only one task 
per function". That is not a scheduler problem "per se": it's a common database 
problem. It would have been the same if someone had defined 
db.define_table('mytable',
     Field('name'),
     Field('uniquecostraint')
)
and had to ensure, without specifying Field('uniquecostraint', 
unique=True), that no two records hold the same value in the column 
uniquecostraint.
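
To make the point concrete, here is a minimal sketch of "check, then insert" done safely without a unique constraint. It uses Python's stdlib sqlite3 rather than the web2py DAL so that it runs on its own; the helper name insert_if_absent is hypothetical, and the table mirrors the one above. The important part is that the check and the insert happen inside one transaction that holds the write lock, otherwise two processes can both pass the check.

```python
import sqlite3


def insert_if_absent(conn, name, value):
    """Insert (name, value) unless a row with the same value already exists.

    Returns True if a row was inserted, False if one was already there.
    Hypothetical helper, not part of web2py.
    """
    # BEGIN IMMEDIATE takes SQLite's write lock up front, so no other
    # writer can slip in between the SELECT and the INSERT.
    conn.execute("BEGIN IMMEDIATE")
    try:
        exists = conn.execute(
            "SELECT 1 FROM mytable WHERE uniquecostraint = ?", (value,)
        ).fetchone()
        if exists is None:
            conn.execute(
                "INSERT INTO mytable (name, uniquecostraint) VALUES (?, ?)",
                (name, value),
            )
        conn.execute("COMMIT")
        return exists is None
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE mytable (name TEXT, uniquecostraint TEXT)")

first = insert_if_absent(conn, "a", "x")
second = insert_if_absent(conn, "b", "x")
```

The same idea applies to queueing tasks: look for an existing task for the function and insert a new one only if none is found, all within one transaction.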

From there to "now I have tasks stuck in RUNNING status, please avoid using 
the scheduler" without any further details, the leap is quite 
"undocumented".

Please also note that the scheduler in trunk has undergone some changes: 
there was a point in time when abnormally killed schedulers (e.g. killing 
the process with kill -SIGKILL) left tasks in RUNNING status, and those 
tasks would not be picked up by subsequent scheduler processes.

That was a design issue: if a task is RUNNING and you kill the scheduler 
while the task is being processed, you have absolutely no way to tell what 
the function did (say, send a batch of 500 emails) before it was killed. 
If the task was not planned properly it could send e.g. 359 mails, be 
killed, and, if picked up again by another scheduler after that first 
killed round, send those 359 recipients a second, identical mail.
It has since been decided to requeue RUNNING tasks that have no active 
worker processing them (i.e. checking what has already been done is left 
to the function), so RUNNING tasks assigned to a dead worker now get requeued.
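
Since the check is left to the function, a task that may be requeued should remember what it has already done. Here is a minimal sketch of that idea; all names (send_batch, Killed, the in-memory set) are hypothetical and only simulate a kill. In a real app the "already sent" record would have to live in a database table and be committed after each mail, and a kill between sending and recording can still cause one duplicate.

```python
class Killed(Exception):
    """Stands in for the worker being killed mid-task (simulation only)."""


def send_batch(recipients, already_sent, send):
    """Send to every recipient not yet recorded in already_sent.

    already_sent is updated right after each successful send, so a
    second run after a crash skips whoever was already served.
    """
    for address in recipients:
        if address in already_sent:
            continue  # handled in a previous, interrupted run
        send(address)
        already_sent.add(address)


recipients = ["user%d@example.com" % i for i in range(5)]
outbox = []            # every mail that actually went out
already_sent = set()   # assumption: persisted storage in a real app


def flaky_send(address):
    # Simulate the scheduler being killed after three mails.
    if len(outbox) >= 3:
        raise Killed()
    outbox.append(address)


try:
    send_batch(recipients, already_sent, flaky_send)
except Killed:
    pass  # the task is now "RUNNING with a dead worker" and gets requeued

# Second round, picked up by another worker: only the rest is sent.
send_batch(recipients, already_sent, outbox.append)
```

After both rounds every recipient appears in outbox exactly once, which is what a requeued task needs to guarantee.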

With other changes (soon in trunk; see the previously attached file) you 
will be able to stop workers, so they can then be killed "ungracefully" 
with the certainty that they are not processing tasks.

If you need more details, as always, I'm happy to help, and I'm sure other 
developers are too :D
