Re: What is the best way to implement time-based / cronjob actions in a Django app?

ringemup Thu, 14 Oct 2010 06:00:43 -0700

Thank you, Brian and Shawn, for the further explanations!

On Oct 13, 11:45 pm, Brian Bouterse <bmbou...@gmail.com> wrote:
> RabbitMQ implements a standards based protocol called
> AMQP<http://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol>,
> which provides asynchronous, reliable delivery of messages.  The broker
> simply passes messages around.  Celeryd processes join a message broker
> (RabbitMQ) and pull messages from any number of "queues."  A default queue
> is used if no queue names are specified.  A celeryd worker can be configured
> to pull from any number of queues.
>
> The reliable nature of AMQP based message buses/brokers is that when you
> submit a message into the message broker, you are guaranteed it will not be
> lost, and if a worker is available, it will be delivered.  The asynchronous
> part is that if you submit a message into the broker and there are no
> workers to handle the message, it will be queued reliably with other
> messages.  Once a worker comes online, messages will begin being handled.
>
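
A minimal sketch of the queueing behaviour described above, using only the Python standard library. ToyBroker, its method names, and the "default" queue name are hypothetical stand-ins, not RabbitMQ's or Celery's API, and an in-memory deque says nothing about AMQP's durability guarantees. It only shows that messages wait in named queues until a worker pulls them, and that one worker may pull from several queues.

```python
from collections import defaultdict, deque

DEFAULT_QUEUE = "default"  # hypothetical name for the fallback queue


class ToyBroker:
    """In-memory stand-in for a message broker; it never inspects messages."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def publish(self, message, queue=DEFAULT_QUEUE):
        # Messages are held until some worker asks for them.
        self._queues[queue].append(message)

    def pull(self, queue_names):
        # Return the next message from the first non-empty queue, or None.
        for name in queue_names:
            if self._queues[name]:
                return self._queues[name].popleft()
        return None


broker = ToyBroker()

# No worker exists yet, so these simply wait in their queues.
broker.publish("resize image 7", queue="images")
broker.publish("send welcome email")  # no queue named, so it uses the default

# A worker comes online later, configured to pull from two queues.
worker_queues = ["images", DEFAULT_QUEUE]
handled = []
while True:
    message = broker.pull(worker_queues)
    if message is None:
        break
    handled.append(message)

print(handled)
```
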
> There is no reason to require the broker to live on a separate machine.  It
> is common in our environment to run a RabbitMQ server process on a server,
> and on that same server configure celeryd worker processes to use localhost
> to connect to the message broker.
>
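
For the single-server setup described above, the connection settings might look roughly like the fragment below. The setting names are the ones used in Celery 2.x-era documentation, so check them against the version you install; the user, password, and vhost values are placeholders, not anything from this thread.

```python
# settings.py (django-celery) or celeryconfig.py (plain Celery).
# Setting names are from the Celery 2.x era; values are placeholders.
BROKER_HOST = "localhost"        # broker runs on the same server as celeryd
BROKER_PORT = 5672               # standard AMQP port
BROKER_USER = "myuser"           # placeholder RabbitMQ user
BROKER_PASSWORD = "mypassword"   # placeholder password
BROKER_VHOST = "myvhost"         # placeholder RabbitMQ virtual host
```
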
> To create RabbitMQ users I think root permissions are required.  However, I
> do not think root is required for RabbitMQ to run.  As long as celery has
> the right username/password/server information, and has appropriate
> permissions to run your task code, you should be able to run celery as
> any non-root user.
>
> In terms of my explanation of celery, it basically describes the data schema
> that is submitted in the AMQP messages which are reliably and asynchronously
> delivered.  For instance, the task function and all of the submitted
> arguments are contained in the message; celeryd unpacks each message,
> which follows this format, and calls the indicated task function with
> those arguments.  The return results are serialized and stored in the
> results database that celeryd is configured to work with.  In the case of
> django-celery these settings are found in the Django settings.py file, but
> with celery on its own they are in celeryconfig.py.  These results can be
> checked on later by task-id, which can be easily stored in a plain Django
> model.  We have built
> several analytics and research compute clusters using celery and
> django-celery.
>
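
To make the "task name plus arguments in the message" idea concrete, here is a rough standard-library sketch. The field names ("id", "task", "args"), the TASKS registry, and the RESULTS dict are all hypothetical; this is not Celery's actual wire format or result backend, only the general shape of the flow: serialize the call, let a worker look up the function by name, then store the serialized result under the task id.

```python
import json
import uuid

TASKS = {}    # task name -> function; stands in for the worker's registry
RESULTS = {}  # task id -> serialized result; stands in for the results db


def register(func):
    """Record a function under its name so a worker can look it up."""
    TASKS[func.__name__] = func
    return func


@register
def add(x, y):
    return x + y


def submit(name, *args):
    """Client side: build the message body and return it with its task id."""
    task_id = str(uuid.uuid4())
    body = json.dumps({"id": task_id, "task": name, "args": list(args)})
    return task_id, body


def handle(body):
    """Worker side: unpack the message, call the task, store the result."""
    message = json.loads(body)
    result = TASKS[message["task"]](*message["args"])
    RESULTS[message["id"]] = json.dumps(result)


task_id, body = submit("add", 2, 3)
handle(body)
print(json.loads(RESULTS[task_id]))  # prints 5
```

The client only needs to keep task_id to look the result up later, which is why storing it in an ordinary model field is enough.
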
> Best,
> Brian
>
>
>
> On Wed, Oct 13, 2010 at 5:07 PM, ringemup <ringe...@gmail.com> wrote:
> > Thank you for taking the time to explain that, Shawn -- it makes
> > everything a LOT clearer.
>
> > If you could spare the time, I'm curious about a couple of aspects of
> > the architecture:
>
> > - What is the purpose of having a separate broker and daemon?
> > - How does the broker know when to pass the task back to Celery?
> > - Is there a reason other than resource usage for the broker to live
> > on a different machine?
>
> > Also, can this all be run in a shared hosting environment, or are root
> > permissions needed to install Celery and RabbitMQ?
>
> > On Oct 13, 4:43 pm, Shawn Milochik <sh...@milochik.com> wrote:
> > > On Oct 13, 2010, at 4:11 PM, ringemup wrote:
>
> > > >> It's surprisingly easy to get set up with nothing more than the
> > tutorial/intro for django-celery. If anyone has problems with it I'd be happy
> > to try to assist.
>
> > > > Thanks, I might take you up on that.
>
> > > >> Although getting everything working is fairly easy, in my opinion the
> > docs aren't too clear on how the big picture really works for first-timers.
>
> > > > Yeah, that's a big reason I never tried to use it.  Would you be
> > > > willing to share a high-level overview with us?
>
> > > > Thanks!
>
> > > Okay, so here's how it works, as I understand it. I hope Brian will jump
> > in and correct where necessary.
>
> > > So, as I see it there are three moving parts.
>
> > > 1. Your application.
>
> > >     A. Your application will have, somewhere, some configuration
> > information which allows it to connect to the message broker.
> > >     B. It will also have one or more files containing callable code
> > (probably functions), which are decorated with a Celery decorator. These are
> > referred to as "tasks".
> > >     C. It will have other code which will call these decorated functions
> > when you want things to run asynchronously (in your views, for example).
>
> > > 2. The broker (traditionally RabbitMQ).
>
> > >     A. The broker probably lives on another machine, and runs as a
> > service.
> > >     B. The broker knows nothing about your code or applications.
> > >     C. The broker simply receives messages, holds onto them, and passes
> > them on when requested.
>
> > > 3. The Celery Daemon (the simplest use-case)
>
> > >     A. The Celery daemon is a separate process running on the same
> > machine as your application.
> > >     B. The Celery daemon uses the same config info (probably the same
> > config file) as your application.
> > >     C. The Celery daemon polls the broker regularly, looking for tasks.
> > >     D. When the daemon retrieves a task, it runs it, using the code in
> > your application's "tasks" files.
>
> > > Basic working example:
>
> > >         1. You have a function in your tasks.py called update_user. It
> > accepts an integer as its only argument, which should be the primary key of
> > a user in your User table. It is decorated by the Celery decorator "task."
>
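> > >         from datetime import datetime
> > >         from celery.decorators import task  # Celery 2.x-era import path
> > >         from django.contrib.auth.models import User
>
> > >         @task
> > >         def update_user(pk):
> > >             # Trivial sample function: stamp the user's last_login.
> > >             user = User.objects.get(pk=pk)
> > >             user.last_login = datetime.now()
> > >             user.save()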
>
> > >         2. Your application imports your update_user function from your
> > tasks file. One of your views calls it like this:
> > update_user.delay(request.user.pk).
> > >         Note that the delay() method is added by the Celery task decorator.
> > >         This call to update_user.delay() returns a result object whose
> > task ID is a UUID, which you may store for later retrieval of the results.
>
> > >         3. Celery passes a serialized version of this function call to
> > the broker. Something like a plain-text "update_user(123)."
>
> > >         4. The Celery daemon, in its continual polling process, is handed
> > a message containing something like 'update_user(123).' It is aware of the
> > update_user function because it has been configured to use the task files in
> > your application, so it calls your update_user function with the argument
> > 123. At this point your code runs. The celery daemon records the result
> > using whatever method is specified in your Celery config file. This could be in
> > MongoDB, passed back to the broker, or several others. Optionally, if the
> > code execution fails, Celery may e-mail you.
>
> > >        5. (Optional) Your application uses the UUID it received in step 2
> > at a later time to ascertain the status of the job. If the result was stored
> > with the broker, then it may only be retrieved once; it is considered just a
> > plain-old plain-text "message" to the broker, and after being passed on it
> > is no longer stored. If the result was stored in a database (such as
> > PostgreSQL or MongoDB), then you can request it repeatedly.
>
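
The read-once versus read-many difference in step 5 can be sketched with two toy stores. Both classes and the "abc-123" task id are hypothetical illustrations written for this note, not Celery result backends; they only model the behaviour described above.

```python
from collections import deque


class BrokerResultStore:
    """Results travel as messages: reading one consumes it."""

    def __init__(self):
        self._messages = deque()

    def store(self, task_id, result):
        self._messages.append((task_id, result))

    def fetch(self, task_id):
        for item in list(self._messages):
            if item[0] == task_id:
                self._messages.remove(item)  # gone once delivered
                return item[1]
        return None


class DatabaseResultStore:
    """Results are rows keyed by task id: reading leaves them in place."""

    def __init__(self):
        self._rows = {}

    def store(self, task_id, result):
        self._rows[task_id] = result

    def fetch(self, task_id):
        return self._rows.get(task_id)


via_broker = BrokerResultStore()
via_database = DatabaseResultStore()
for backend in (via_broker, via_database):
    backend.store("abc-123", "done")  # placeholder task id and result

broker_reads = [via_broker.fetch("abc-123"), via_broker.fetch("abc-123")]
database_reads = [via_database.fetch("abc-123"), via_database.fetch("abc-123")]
print(broker_reads)    # prints ['done', None]
print(database_reads)  # prints ['done', 'done']
```
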
> > > I hope this helps, and that others will correct me where I'm blatantly
> > wrong. I have intentionally simplified some things so that the basic flow is
> > more understandable; much more complex setups are possible, especially ones
> > which allow multiple servers to run Celery daemons (and individual servers
> > to run multiple daemons). For example, you may have one server handle
> > communication tasks (such as sending e-mail and SMS messages), while another
> > server handles processing of images. It may be beneficial to do one on your
> > application server (where your Django app lives), while doing the more
> > resource-intensive stuff (such as transcoding video uploads) on another
> > machine.
>
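
The multi-server arrangement described above amounts to routing each kind of task to a queue that only certain servers consume. A toy sketch follows; the task names, queue names, and server names are all made up for illustration, and real Celery routing is configured through its own settings rather than code like this.

```python
from collections import defaultdict

# Hypothetical routing table: task name -> queue name.
ROUTES = {
    "send_email": "communication",
    "send_sms": "communication",
    "transcode_video": "media",
}

# Hypothetical servers and the queues their workers consume.
WORKERS = {
    "app-server": ["communication"],
    "media-server": ["media"],
}

queues = defaultdict(list)


def route(task_name, *args):
    """Place a task call on the queue chosen by the routing table."""
    queues[ROUTES[task_name]].append((task_name, args))


def drain(server):
    """Return every task waiting on the queues that a server consumes."""
    taken = []
    for name in WORKERS[server]:
        taken.extend(queues[name])
        queues[name].clear()
    return taken


route("send_email", "user@example.com")
route("transcode_video", "upload-42.mov")
route("send_sms", "+15550100")

app_work = drain("app-server")
media_work = drain("media-server")
print(app_work)
print(media_work)
```

The light communication tasks end up on the application server, while the video transcoding lands on the separate machine.
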
> > > Shawn
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Django users" group.
> > To post to this group, send email to django-us...@googlegroups.com.
> > To unsubscribe from this group, send email to
> > django-users+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/django-users?hl=en.
>
> --
> Brian Bouterse
> ITng Services


