Hi, there is an issue driving me crazy with the web2py scheduler:

If you return something with a huge size, the task will always end up in TIMEOUT, 
even if it actually finishes correctly. Let me explain with an 
example:

def small_test():
    # build a ~1.2 MB string and return it twice in the result dict
    s = 's' * 1256018
    another_s = s
    #print s
    #print another_s
    #print 'FINISHED PROCESS'
    return dict(s=s, another_s=another_s, f='finished')

small_test is the function to execute; as you can see, it just builds a string 
of 's' repeated 1256018 times. Simple.

So when you enqueue the task, the output is the same every time: 
http://prnt.sc/a9iarj (screenshot of the TIMEOUT)
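
For completeness, this is roughly how the task gets queued on my side (a 
simplified sketch; it assumes the standard Scheduler(db) setup in a model 
file, and the timeout value is just an example):

# in a model file, e.g. models/scheduler.py
from gluon.scheduler import Scheduler
scheduler = Scheduler(db)

# in a controller: queue the task with a generous timeout
scheduler.queue_task('small_test', timeout=300)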

As you can see from the screenshot, the process actually finished; meanwhile, 
the scheduler's debug log shows the following:

DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:   work to do 1405
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    new scheduler_run 
record
INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:new task 1405 "small_test" 
portal/default.small_test
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: new task allocated: 
portal/default.small_test
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:   task starting
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    task started
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    new task report: 
COMPLETED
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:   result: {"s": 
"ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss$
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    freeing workers that 
have not sent heartbeat
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    freeing workers that 
have not sent heartbeat
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    freeing workers that 
have not sent heartbeat
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording 
heartbeat (RUNNING)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:    task timeout
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: recording task report in 
db (TIMEOUT)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: status TIMEOUT, stop_time 
2016-02-29 11:56:52.393706, run_result {"s": "sssssssssssssssssssssssssss$
INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:task completed (TIMEOUT)
DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:looping...
INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:nothing to do



As you can see, there is a TaskReport object in the queue with a COMPLETED 
status (I know this because I read web2py's scheduler.py code), so I'm 
pretty sure the task finishes quite fast but then it hangs.

So I did another test that doesn't use the scheduler directly, but only 
calls the executor method from the scheduler and uses a multiprocessing 
Process, just like the scheduler would:

from gluon.scheduler import Task
from gluon.scheduler import executor
t = Task(app='portal', function='small_test', timeout=120)
import logging
logging.getLogger().setLevel(logging.DEBUG)
import multiprocessing
queue = multiprocessing.Queue(maxsize=1)  # result queue, as the scheduler creates it
out = multiprocessing.Queue()             # output queue
t.task_id = 123
t.uuid = 'asdfasdf'
p = multiprocessing.Process(target=executor, args=(queue, t, out))
p.start()
p.join(timeout=120)
p.is_alive()


When the join finishes waiting (2 minutes), p.is_alive() always returns True; 
but if you do a queue.get() and then immediately check p.is_alive() again, 
the process has finished!

So I noticed the problem comes from the multiprocessing library: a process 
that has put a lot of data on a Queue cannot terminate until that data has 
been consumed (which seemed strange for my case, but I don't know how it is 
implemented). Anyway, I found these bugs: 
http://bugs.python.org/issue8237 and http://bugs.python.org/issue8426
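
You can reproduce the same behaviour with a tiny script that has nothing to 
do with web2py (a minimal sketch of the pattern described in those bug 
reports): a child that puts a large object on a Queue cannot be joined until 
the parent drains the queue.

import multiprocessing

def worker(q):
    # put ~1.2 MB on the queue, same size as small_test's string
    q.put('s' * 1256018)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    p.join(timeout=10)
    print(p.is_alive())   # True: the child's feeder thread is still blocked on the pipe
    data = q.get()        # drain the queue...
    p.join()
    print(p.is_alive())   # ...and now the child has exited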

The interesting part is that it is actually documented (I didn't know that):
https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming

But in my current implementation this will happen quite often. I'll work on 
a work-around (rough sketch below), but I would really like the web2py 
scheduler to handle large data output from my processes for me. That is my 
wish; I would appreciate some guidance on this issue so I can avoid a 
work-around if possible.
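
The work-around I have in mind (just a sketch, untested; the /tmp path and 
the pickle format are placeholders): persist the big payload from inside the 
task and return only a small reference to it, so the result that travels 
through the multiprocessing queue stays tiny:

import os, uuid, pickle

def small_test():
    s = 's' * 1256018
    another_s = s
    # dump the large payload to a file (a db blob field would work too)
    result_path = os.path.join('/tmp', uuid.uuid4().hex + '.pkl')
    with open(result_path, 'wb') as f:
        pickle.dump(dict(s=s, another_s=another_s), f)
    # return only a tiny dict, so the scheduler's result queue never chokes
    return dict(result_path=result_path, f='finished')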

Anyway, this should be documented somewhere in web2py too (that probably 
could have saved me a week of code reading and debugging), or it should be 
managed somehow (I wouldn't naturally expect an output size limit beyond 
whatever the database implementation imposes).
