On Aug 8, 11:55 am, pbreit <pbreitenb...@gmail.com> wrote:
> I definitely like the idea of something simpler. Even though Celery is
> pitched as somewhat easy, I could never make heads or tails of it. I look
> forward to giving this a try.
>
> What are the web2py dependencies? Do you foresee bundling DAL and whatever
> to make it standalone?

It only needs dal.py and globals.py. It could be used standalone; it
would need a main() and I may build that later today. Should not take
much.

> Is SQLite a reasonable DB or will this likely need something that works
> better with concurrency?

If running a task takes longer than retrieving it (as it should,
else there is no reason to use this), the db access is not an issue.

> What is the mechanism to start the scheduler, start on reboot, monitor it,
> etc?

You just need to start web2py and start the background process. There
is nothing else to do. There are some differences from Celery.

In Celery the celerybeat daemon pushes tasks to the celeryd services
(workers). In gluon/scheduler.py the background processes (workers)
pull the tasks from the database. There is no daemon dealing with
scheduling.
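The pull model can be sketched roughly like this (a hypothetical
illustration using plain sqlite3 rather than the DAL; the table and
column names, and the claim_next_task helper, are placeholders, not
the actual scheduler code):

```python
import sqlite3

def claim_next_task(conn, worker_name, now):
    """Atomically pick one queued task that is due and mark it running.

    Each worker polls the database itself; there is no central daemon
    handing out work. The UPDATE ... WHERE status='queued' doubles as
    a lock: if two workers race, only one claim succeeds.
    """
    row = conn.execute(
        "SELECT id FROM task_scheduled "
        "WHERE status='queued' AND next_run_time<=? "
        "ORDER BY next_run_time LIMIT 1", (now,)).fetchone()
    if row is None:
        return None  # nothing due; the worker sleeps and polls again
    task_id = row[0]
    updated = conn.execute(
        "UPDATE task_scheduled SET status='running', assigned_worker=? "
        "WHERE id=? AND status='queued'", (worker_name, task_id)).rowcount
    conn.commit()
    return task_id if updated == 1 else None
```

A worker's main loop would just call this in a sleep/poll cycle,
run the claimed task, record a task_run row, and re-queue or
complete the task.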

There are three tables:
* task_scheduled stores the list of tasks: when you want each to run
(next_run_time), how often (period), how many times (repeats,
times_run), within what time frame (start_time, stop_time), the max
timeout, etc.
* task_run stores the output of each task run. One task_scheduled with
repeats=10 will generate 10 task_run records.
* worker_heartbeat stores the heartbeat of the workers, i.e. the time
when they poll for tasks.

Each task_scheduled record can be:
- queued (waiting to be picked up)
- running (task was picked up by a worker)
- completed (was run as many times as requested)
- failed (the task failed and will not be run again)
- overdue (the task has not reported back, probably because a worker
died in the middle of it; should not happen under normal conditions)
A task that does not fail and is scheduled to run 3 times will go
through:
queued -> running -> queued -> running -> queued -> running ->
completed
Tasks only run if they are queued and are due to run.
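As a toy illustration of that lifecycle (plain Python, with a dict
standing in for a task_scheduled record; the field names and the
step helper are illustrative, not the scheduler's API):

```python
def step(task):
    """Advance a task one state: queued -> running -> queued|completed."""
    if task["status"] == "queued":
        # a worker picks the task up
        task["status"] = "running"
    elif task["status"] == "running":
        # the run finished; re-queue until all repeats are done
        task["times_run"] += 1
        task["status"] = ("completed"
                          if task["times_run"] >= task["repeats"]
                          else "queued")
    return task["status"]

task = {"status": "queued", "repeats": 3, "times_run": 0}
history = [task["status"]]
while task["status"] != "completed":
    history.append(step(task))
# history is: queued, running, queued, running, queued, running, completed
```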

This design allows you to do what you normally do with cron, but with
some differences:
- cron is at the web2py level; gluon/scheduler.py is at the app level
(although some apps may share a scheduler)
- cron spawns a process for each task, and this created problems for
some users. gluon/scheduler.py runs tasks sequentially in a fixed
number of processes (one in the example).
- tasks can be managed from the admin interface (schedule, start,
stop, restart, change input, read output, etc.)
- the same task cannot overlap with itself, therefore it is easier to
manage
- tasks are not executed exactly when due, but as close as possible,
in FIFO order based on the requested schedule and the workload and
resources available. More like Celery than cron.

Hope this makes sense.

gluon/scheduler.py is 170 lines of code and you may want to take a
look at what it does.

Massimo
