On Thursday, June 12, 2014 7:37:26 PM UTC+2, DeanK wrote:
>
> Thanks for the detailed response! Lots to cover, so here we go... haha
>
> > mysql is the uttermost/personal top 1 dislike/non-standard behaving
> > backend out there, but we'll manage ^_^
>
> Interesting. What do you like more?
In "real life" I'm an MSSQL DBA. I enjoy using it, but as far as concurrency goes (which is more likely your problem), I've observed that more than 20 active workers pose a problem. To be fair, I didn't test the new shiny in-memory tables, but that's a completely unfair comparison (and not free). For my OSS project (and by measurements on my rig), PostgreSQL handles the kind of queries the scheduler needs to execute a lot better, so if you're not "locked in" with MySQL, I'd definitely give it a try (at least for the scheduler_* tables).

> > and if by "node" you mean a completely different server, you have 7*5 = 35
> > additional workers on top of the 11 on the "head". That's quite a number of
> > workers, I hope they are there because you need to process at least 46
> > tasks in parallel
>
> So I am *certainly* using the scheduler in a way it wasn't intended, but
> that's part of the fun right? I'm in an interesting situation where I have
> access to a "cluster" of 5 computers that each have 7 GPUs. There
> currently isn't a proper task scheduler (e.g. SGE, LSF, slurm, etc.)
> installed yet...so it's not really much of a cluster beyond a shared
> filesystem...but I want to use the system now instead of waiting for
> everything to get set up. I don't have sudo access...so I thought:
> hey, in less than a day's work I can set up web2py + the built-in
> scheduler + the comfort scheduler monitor and be able to run distributed
> GPU processing with a shiny web2py frontend! That is why I need 7 "compQ"
> workers per machine (1 per GPU). It is also why I include a unique group
> name for *each* worker (hostname_compXX). This lets me issue
> terminate/disable commands to a group and be able to stop specific workers.
> I need this to control which GPUs will pick up work, since I can't use all
> of them all the time.

Ok, seems fair. But if you're using the comfort scheduler monitor, you can cherry-pick the exact worker to disable (or kill, or terminate) without fiddling with group_names. This is valid only if comp_01 can process the exact same tasks as comp_02, and so on, and only if the hostnames are different, because the worker_name has the hostname in it.

> From your description, it is still my understanding that after ~30 seconds
> a disabled worker will go back to work.

Nope. A DISABLED worker will SLEEP for 30 seconds and then simply mark itself as "beating", while not processing tasks. This alleviates the db pressure: if you don't need to process tasks, there's no need to hit the db every 3 seconds looking for new ones (but the worker still needs to "check in" every 30 seconds, updating the last_heartbeat column). This is the relevant excerpt that makes the loop "jump" without doing anything https://github.com/web2py/web2py/blob/master/gluon/scheduler.py#L635 and this https://github.com/web2py/web2py/blob/master/gluon/scheduler.py#L973 shows that no tasks can be assigned to a worker that is NOT active.
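If you ever want to script that instead of clicking in the monitor, here's a minimal sketch of the same idea. It only assumes the default scheduler tables on `db` and a hypothetical host called node03; run it from a web2py shell (python web2py.py -S yourapp -M):

```python
# Rough sketch, not the "official" API: it just flips the same status column the
# comfort monitor flips. Assumes the scheduler tables live on `db` and that the
# workers on the (hypothetical) host "node03" are named "node03#<pid>".
paused = db.scheduler_worker.worker_name.startswith('node03#')

# Pause them: they keep heartbeating every 30 seconds but stop picking up tasks.
db(paused).update(status='DISABLED')
db.commit()

# ...and when those GPUs are free again, put the workers back to work.
db(paused).update(status='ACTIVE')
db.commit()
```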
> > Howdy....4 minutes to 4 hours!!!! ok, we are flexible but hey, 4 hours
> > isn't a task, it's a nightmare
>
> Haha...again...obviously stretching things here, but it is pretty much
> working which is cool. This more or less makes sense, and I'm definitely
> seeing the impact of a long running task being the TICKER. When this
> happens nothing moves out of the queued state for a LONG time. Based on
> what you've said, forcing a new TICKER should make this go away I think. So
> I may need a simple script I can run to clear the worker table when I see
> this happen. This won't re-assign already assigned tasks though, correct?
> For example I see stuff like this:
>
> 2 workers: A and B
> 4 tasks: 1,2,3,4 - tasks 1 and 2 take 5 minutes, tasks 3 and 4 take 1 hour.
>
> Worker A gets assigned tasks 1 and 2, B gets 3 and 4. Tasks 1 and 2
> finish in 10 minutes. Worker A sits idle while worker B runs for 2 hours.
> Is this a correct understanding of how things work, or if I force the ticker
> to PICK will it actually reassign these tasks to an idle worker?

Well, this should definitely NOT happen (https://github.com/web2py/web2py/blob/master/gluon/scheduler.py#L990). All QUEUED or ASSIGNED tasks are shuffled among active workers every ~15 seconds, to counteract exactly what you describe. To be fair, workers are only assigned tasks that they can process, and that is determined by the group_name(s). Assuming A and B have a common group_name and 1,2,3,4 are tasks with that group_name:

- Beginning: the ticker assigns tasks. Wait a few seconds and...
  1 is RUNNING (A got it)
  2 is ASSIGNED to A (A will eventually process it)
  3 is RUNNING (B got it)
  4 is ASSIGNED to B (B will eventually process it)
  Everybody is working, so no one reassigns tasks, because there are no workers available to process them ^_^
- 5 minutes later:
  1 is COMPLETED by A
  2 is RUNNING (because it was assigned to A)
  3 is still RUNNING on B
  4 is still ASSIGNED to B
  Again, everybody is working, so nothing gets reassigned.
- 5 minutes later:
  2 is COMPLETED by A
  3 is RUNNING on B
  4 is still ASSIGNED to B
- At most 15 seconds later, given that A is now "free", it gets "elected" as the TICKER and does an assignment:
  3 is RUNNING on B
  4 was ASSIGNED to B, but B is busy, so 4 gets "passed" to A
- 3 seconds later:
  3 is RUNNING on B
  4 is RUNNING on A

So, "lots of QUEUED tasks not moving around" is only a problem if there actually are workers free to do something. If nothing is moving, there's either a conflict due to db pressure (but that should be logged) or simply nothing to move around, because everyone is already busy processing something.
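As for that "simple script to clear the worker table": a rough sketch of what I'd run from a web2py shell (python web2py.py -S yourapp -M). The table and field names are the defaults, the 5-minute threshold is arbitrary, and the last step is the PICK trick you mentioned, which just asks for a fresh assignment round:

```python
import datetime

# 1) See who is (supposedly) doing what right now.
busy = db(db.scheduler_task.status.belongs(('ASSIGNED', 'RUNNING'))).select(
    db.scheduler_task.id, db.scheduler_task.status,
    db.scheduler_task.assigned_worker_name)
for row in busy:
    print('%s %s %s' % (row.id, row.status, row.assigned_worker_name))

# 2) Drop workers that stopped heartbeating a while ago (e.g. a crashed node),
#    so they can't keep the TICKER role or hold tasks assigned to them.
stale = datetime.datetime.now() - datetime.timedelta(minutes=5)
db(db.scheduler_worker.last_heartbeat < stale).delete()

# 3) Ask the (new) ticker for a fresh round of assignments.
db(db.scheduler_worker.status == 'ACTIVE').update(status='PICK')
db.commit()
```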
> > I don't see why for tasks that take 4 minutes to 4 hours, you should use
> > "immediate".
>
> I totally agree. It kind of got copy/paste carried over from other code
> for a web app where it did make sense to use immediate. I'm not doing it
> anymore. I did go back and check the output from the workers and I do see
> some errors. There are some application-specific things from my code, but
> also two others of this flavor:
>
> 2014-05-23 13:18:10,544 - web2py.scheduler.XXXXXX#16361 - ERROR - Error
> cleaning up
>
> Traceback (most recent call last):
>   File "/home/xxxxx/anaconda/lib/python2.7/logging/handlers.py", line 76,
> in emit
>     if self.shouldRollover(record):
>   File "/home/xxxxx/anaconda/lib/python2.7/logging/handlers.py", line 157,
> in shouldRollover
>     self.stream.seek(0, 2)  #due to non-posix-compliant Windows feature
> IOError: [Errno 116] Stale file handle
> Logged from file scheduler.py, line 822
>
> Note: I do have the web2py logging set up, but I'm not using it for
> anything anymore so I could delete the config file. It looks like all the
> output from the workers is getting put into the web2py log file. Maybe one
> worker is causing the log file to roll over while another is trying to
> write to it?

Multiprocessing logging to the same file has always been a pain (for everybody, not just Python). That's why the only "safe" backend to log to (at least in the standard lib) is syslog (it's properly documented: https://github.com/web2py/web2py/blob/master/examples/logging.example.conf#L9).
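If you'd rather not maintain a logging.conf at all, a quick (and rough) alternative is to hang a SysLogHandler on the scheduler's logger from one of your model files. This is only a sketch under a couple of assumptions: you're on Linux (the /dev/log socket) and the workers keep the web2py.scheduler.* logger naming you can see in that traceback:

```python
import logging
from logging.handlers import SysLogHandler

# The per-worker logger in your traceback (web2py.scheduler.XXXXXX#16361)
# propagates up to "web2py.scheduler", so one handler here catches all workers.
sched_log = logging.getLogger('web2py.scheduler')
sched_log.setLevel(logging.INFO)
if not sched_log.handlers:  # models run on every request: don't stack handlers
    handler = SysLogHandler(address='/dev/log')  # local syslog socket on Linux
    handler.setFormatter(logging.Formatter(
        'web2py-scheduler: %(name)s %(levelname)s %(message)s'))
    sched_log.addHandler(handler)
sched_log.propagate = False  # keep scheduler chatter out of the shared web2py.log
```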
> Finally, looking at my notes I've seen some other weird behavior. I'm not
> sure this is the place for it to go since this post is ridiculously dense
> to begin with, so let me know if you want me to repost it somewhere else.

I think it's fine for it to stay here, also for future prying eyes.

> - If the ticker is a worker who is running a long task, nothing gets
>   assigned for a very long time (I think until the job completes). I think
>   we've covered this behavior above and it makes sense. Forcing a new ticker
>   should fix it.

As explained before, "nothing gets assigned for a long time" can only happen if ALL "active" workers are busy processing tasks. This is expected: even if tasks were assigned to some other worker, that worker would still be busy and wouldn't process the newly assigned tasks. When a TICKER is busy, it does its best to relinquish the TICKER status to one "free" active worker. As soon as that happens, tasks get assigned and the whole cycle begins again. With long-running tasks you may observe that only the "new" ticker itself processes a task as soon as the assignment happens, but that's because it got elected precisely because it was free to do work in the first place. You can still force a reassignment, but it ONLY helps IF there's a free worker, and if there is one, it will soon become the ticker itself and process at least one new task.

> - Sometimes I see tasks that complete successfully, but get re-run for
>   some reason (I've only seen it with my long running 3-4 hr tasks). Looking
>   in the comfy monitor, the task has a complete run and I see the output, but
>   it gets scheduled and run again. Since my code does cleanup after the
>   first run, the input data is missing so the second run fails (which is how
>   I noticed this). Not sure why this is happening and may need to try to
>   figure out how to reproduce reliably for debugging.

That's fairly strange. If no "repeats" are required, and the task didn't raise exceptions or go into timeout, it gets marked as COMPLETED: https://github.com/web2py/web2py/blob/master/gluon/scheduler.py#L804 (to rule out the scheduling parameters, I've tacked a queue_task sketch onto the end of this message).

> - I've seen situations where I know a task is running, but things are
>   still listed as assigned. I know this because I can see how many tasks are
>   physically running on the worker nodes and can compare that to what the
>   scheduler is reporting. I would assume tasks I know to be running should
>   equal tasks listed as running.

Again, this is strange. Just before a task gets executed by a worker, it gets marked as RUNNING: https://github.com/web2py/web2py/blob/master/gluon/scheduler.py#L715 . I can only assume that at this point your backend is SERIOUSLY compromised (i.e. it can't accommodate the updates, but then again those failures would be logged, because every "heavy" operation is trapped), or that your task is somehow swamping the process by misusing standard output/standard error. This is an issue in itself on Unix, and it gets kind of worse on Windows. Tasks are expected to return results, not to print zillions of lines to stdout/stderr, because that makes it difficult for the inner workings of multiprocessing to pipe the streams back and forth.

> Thanks again for the help and making all this easy, awesome, and free.

Stay tuned for an enhanced version of the scheduler running on redis. Now that it runs fairly well on Windows too, I'm giving it a serious whirl. It'll definitely wipe out any concurrency issue whatsoever.
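PS: the queue_task sketch I promised above. It's only a rough example - `process_on_gpu`, the group name and the payload are placeholders for your own code, and it assumes the usual `scheduler = Scheduler(db, ...)` instance defined in a model - but it spells out the parameters that would otherwise cause a "legitimate" re-run:

```python
scheduler.queue_task(
    process_on_gpu,               # your task function (placeholder name)
    pvars=dict(dataset='run42'),  # made-up payload
    group_name='node03_comp01',   # one group per GPU, as you're already doing
    timeout=5 * 3600,             # default is 60 seconds: far too short for 3-4 hr tasks
    repeats=1,                    # run exactly once...
    retry_failed=0,               # ...and don't re-queue it if it fails
    immediate=False,              # no need for immediate here, as discussed
)
```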