ok, my responses are inline with your post

On Friday, June 6, 2014 9:34:00 PM UTC+2, DeanK wrote:
>
> I have a few things that need clarification and am also experiencing 
> some odd behavior with the scheduler. I'm using my app's db instance 
> (mysql) for the scheduler.
>

mysql is, personally, my number one disliked backend (the most 
non-standard-behaving one out there), but we'll manage ^_^
 

>
> at the bottom of scheduler.py:
>
>
> from gluon.scheduler import Scheduler
>
> scheduler = Scheduler(db,heartbeat=3)
>
>
>
> I start my workers like this:
>
> head node:
>
> python web2py.py -K myapp:upload,myapp:upload,myapp:upload,myapp:upload,
> myapp:upload,myapp:download,myapp:download,myapp:download,myapp:download,
> myapp:download,myapp:head_monitorQ 
>
>
5 upload, 5 download, 1 head_monitorQ: 11 workers in total.
 

> 5 compute nodes:
>
> GROUP0="myapp:"$HOSTNAME"_comp_0:compQ"
> GROUP1="myapp:"$HOSTNAME"_comp_1:compQ"
> GROUP2="myapp:"$HOSTNAME"_comp_2:compQ"
> GROUP3="myapp:"$HOSTNAME"_comp_3:compQ"
> GROUP4="myapp:"$HOSTNAME"_comp_4:compQ"
> GROUP5="myapp:"$HOSTNAME"_comp_5:compQ"
> GROUP6="myapp:"$HOSTNAME"_comp_6:compQ"
> MON="myapp:"$HOSTNAME"_monitorQ"
>
> python web2py.py -K 
> $GROUP0,$GROUP1,$GROUP2,$GROUP3,$GROUP4,$GROUP5,$GROUP6,$MON
>
>
> The head node has 4 "upload" and 4 "download" processes.  Each compute 
> node has 7 "compQ" processes that do the actual work.  The hostname based 
> groups are unique so I can remotely manage the workers.  The monitorQ's run 
> a task every 30s to provide hw monitoring to my application.
>

and if by "node" you mean a completely different server, you have 7*5 = 35 
additional workers on top of the 11 on the "head". That's quite a number of 
workers, I hope they are there because you need to process at least 46 
tasks in parallel, otherwise, it's just a waste of processes and groups. 
Don't know about the sentence "hostname based groups are unique so I can 
remotely manage the workers" because by default scheduler workers names 
are  "hostname#pid" tagged, so unique by default. On top of that, the 
default heartbeat of 3 seconds means that even when there are no tasks to 
process, you have a potential of 46 concurrent processes hitting the 
database every 3 seconds...is that necessary ?
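If the 46 heartbeats every 3 seconds turn out to be too much for mysql, the 
heartbeat can be raised when the Scheduler is instantiated, and the 
scheduler_worker table already exposes the unique "hostname#pid" names if 
you want to list or target individual workers. A minimal sketch (assuming 
the db passed to Scheduler is your app's db):

# models/scheduler.py: a longer heartbeat eases the pressure on the db
# (each worker checks in every 10 seconds instead of every 3)
from gluon.scheduler import Scheduler
scheduler = Scheduler(db, heartbeat=10)

# list the workers: names are already unique, no hostname-based groups needed
for w in db(db.scheduler_worker).select():
    print("%s  %s  %s" % (w.worker_name, w.status, w.group_names))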
 

>
> 1) I have the need to dynamically enable/disable workers to match 
> available hardware.  I was hoping to do this with the disable/resume 
> commands but the behavior isn't what I had hoped (but I think what is 
> intended).  I would like to send a command that will stop a worker from 
> getting assigned/picking up jobs until a resume is issued.  From the docs 
> and experimenting, it looks like all disable does is simply sleep the 
> worker for a little bit and then it gets right back to work.  To get my 
> current desired behavior I issue a terminate command, but then i need to 
> ssh into each compute node and restart workers when i want to scale back 
> up...which works but is less than ideal.
>
> *Is there any way to "toggle" a worker into a disabled state?*
>
funny you say that, I'm actually working on an "autoscaling" management 
feature that spawns additional workers (and kills them) when certain 
criteria are met, to deal with spikes of queued tasks. Let's forget about 
that for a second and deal with the current version of the scheduler... 
there are a few things in your statements that I'd like to "verify"...
1) if you set the status of a worker to "DISABLED", it won't die
2) once DISABLED, it sleeps progressively longer, up to 10 times the 
heartbeat. This means that once set to DISABLED, it waits progressively 
more seconds before checking with the database for a "resume" command, 
topping out at ~30 seconds. So a DISABLED worker, in addition to NOT 
receiving any tasks, will only "touch" the db every 30 seconds at most. 
It's basically doing nothing, and I don't see a reason to kill a DISABLED 
worker, because it doesn't consume any resources. It stays ready to resume 
processing and you won't need to ssh into the server to restart the worker 
processes.
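A minimal sketch of the "toggle" (assuming the scheduler tables are defined 
on the same db your app uses): pausing and resuming boils down to flipping 
the status column of scheduler_worker, and the worker picks the change up 
on its next heartbeat.

# hypothetical helper: 'DISABLED' pauses assignment, 'ACTIVE' resumes it
def set_worker_status(worker_name, status):
    db(db.scheduler_worker.worker_name == worker_name).update(status=status)
    db.commit()

# e.g. pause one compute worker, then bring it back later
# (the worker name here is just an example)
# set_worker_status('node03#12345', 'DISABLED')
# set_worker_status('node03#12345', 'ACTIVE')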


> 2) A previous post from Niphlod explains the worker assignment:
>
> A QUEUED task is not picked up by a worker, it is first ASSIGNED to a 
>> worker that can pick up only the ones ASSIGNED to him. The "assignment" 
>> phase is important because:
>> - the group_name parameter is honored (task queued with the group_name 
>> 'foo' gets assigned only to workers that process 'foo' tasks (the 
>> group_names column in scheduler_workers))
>> - DISABLED, KILL and TERMINATE workers are "removed" from the assignment 
>> altogether 
>> - in multiple workers situations the QUEUED tasks are split amongst 
>> workers evenly, and workers "know in advance" what tasks they are allowed 
>> to execute (the assignment allows the scheduler to set up n "independent" 
>> queues for the n ACTIVE workers)
>
>
> This is an issue for me, because my tasks do not have a uniform run time. 
>  Some jobs can take 4 minutes while some can take 4 hours.  I keep getting 
> into situations where a node is sitting there with plenty of idle workers 
> available, but they apparently don't have tasks to pick up.  Another node 
> is chugging along with a bunch of backlogged assigned tasks.  Also 
> sometimes a single worker on a node is left with all the assigned tasks 
> while the other workers are sitting idle.
>
> *Is there any built-in way to periodically force a reassignment of tasks 
> to deal with this type of situation?*
>
>
Howdy.... 4 minutes to 4 hours!!!! ok, we are flexible, but hey, 4 hours 
isn't a task, it's a nightmare. That being said, idle workers should not go 
more than a short period of time (e.g., 60 seconds) without picking up 
tasks that are ready to be processed. 
Long story: only the TICKER process assigns tasks, to avoid concurrency 
issues, and it assigns them roughly every 5 cycles (that is, every 15 
seconds), unless "immediate" is used when a task gets queued. Consider the 
assignment a "meta-task" that only the TICKER does. When a worker is 
processing a task (e.g., one of the ones that last 4 hours), it's internally 
marked as "RUNNING" ("instead" of ACTIVE). When the TICKER is also RUNNING, 
there could be new tasks ready to be processed, but they won't be assigned, 
because the assignment is a "meta-task". There's a specific section of code 
that deals with these situations and lets the TICKER relinquish its powers 
so that ACTIVE (not RUNNING) workers pick up the assignment process (lines 
#944 and following). 
Finally, to answer your question... if needed you can either:
- truncate the workers table (in this case, workers will simply re-insert 
their records and elect a new TICKER)
- set the TICKER status to "PICK" (see the sketch just below). This forces 
a reassignment within at most 3 seconds instead of waiting the usual 15 
seconds
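A minimal sketch of the second option (again assuming the scheduler tables 
sit on your app's db); is_ticker marks the worker currently holding the 
TICKER role:

# force an assignment round on the TICKER's next heartbeat
db(db.scheduler_worker.is_ticker == True).update(status='PICK')
db.commit()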


> 3) I had been using "immediate=True" on all of my tasks.  I started to see 
> db deadlock errors occasionally when scheduling jobs using queue_task(). 
>  Removing "immediate=True" seemed to fix this problem.
>
> *Is there any reason why immediate could be causing deadlocks?*
>

I don't see why, for tasks that take 4 minutes to 4 hours, you should use 
"immediate". 
Immediate just sets the TICKER status to "PICK" in order to assign tasks on 
the next round instead of waiting the usual 5 "loops". 
This means that immediate can, and should, be used only for very (very) 
fast-executing tasks that need a result within LESS than 15 seconds: that 
is the WORST scenario that can happen, i.e. the task gets queued the 
instant after an assignment round has happened.
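In practice the difference is a single keyword argument to queue_task (the 
task names below are just placeholders for your upload/download/compute 
functions):

# long-running jobs: plain queuing is fine, the TICKER assigns them
# within ~15 seconds at worst
scheduler.queue_task('long_compute', group_name='compQ')

# only for very short tasks whose result is needed right away:
# immediate=True just flips the TICKER status to "PICK"
scheduler.queue_task('quick_check', immediate=True)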
Let's get the "general" picture here, because I see many users getting a 
wrong idea... web2py's scheduler is fast, but it's not meant to process 
millions of tasks distributed on hundreds of workers (there are far better 
tools for that job). If you feel the need to use "immediate", it's because 
you queued a task that needs to return a result fast. Here "fast" means 
that there is a noticeable change between the time you queue a task and the 
time you get the result back using "immediate" vs not using it. 
Given that "immediate" allows to "gain", on average, 8 seconds, in my POV 
it should only be used with tasks whose execution time is less than 20-30 
seconds. For anything higher, you're basically gaining less than the 20%. 
For less than 20 seconds, if other limitations are not around, you'd better 
process the task within the webserver, e.g. via ajax, or look at celery 
(good luck :D)
To answer the "deadlock" question, if you see the code, all that 
"immediate" does is an additional update on the status of the TICKER. 

This makes a ring bell because - also if "immediate" is not needed in my 
POV as explained before - points out that your backend can't sustain the db 
pressure of 46 workers. Do you see any "ERROR" lines in the log of the 
workers ?
