@Niphlod: I decided a year ago to use web2py for our projects. Now I have a problem that I have to solve soon, with or without the group. You are very focused on the scheduler code itself, while I am looking for any hint that helps me understand the problem. It doesn't help me to know that nobody has ever had a problem with the scheduler in all these wonderful years. ;)
In the meantime I have found a way to handle "my" problematic situations well (unfortunately only a workaround). The cause is still unexplained.

Let me try again: please set aside long-lived tasks and timeouts that occur during a worker's normal task processing. The reason for "my" TIMEOUT is different (please trust me, I have read the entire code of scheduler.py and know the general concept).

scheduler.py (worker): infinite loop -> pop a task -> call async function -> create process environment -> start the process [1] -> wait for the process to complete, or show no mercy and terminate it if the timeout is hit.

So far so good. Now a more detailed look at [1]. The process environment has an entry function as its target (starting point) for the subprocess. In scheduler.py this is the function 'executor'. In the case of the 'zombie' candidates, the entry point of this function is never reached. With pstack I saw the reason: the newly created process is waiting for a semaphore in sem_wait(). At this point nothing inside the 'executor' function has run in the subprocess, and therefore nothing of my actual task has been processed, because 'executor' never called "my" task function.

So the scheduler hits the timeout (the subprocess is still waiting for the semaphore) and calls terminate() on the subprocess. The subprocess keeps waiting, again and again and again ... Meanwhile scheduler.py registers the task as STOPPED and goes on to pick up the next task.
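To make the flow at [1] concrete, here is a minimal sketch of the "start a child process, wait, terminate on timeout" pattern. It is not the code from scheduler.py: the names run_with_timeout, quick_task and slow_task, the executor signature, the status strings and the grace period are all assumptions made for illustration. It is written for Python 3 and assumes a POSIX system where the "fork" start method is available.

```python
import multiprocessing
import queue
import time

# Assumption: POSIX system; "fork" is chosen explicitly for this sketch.
ctx = multiprocessing.get_context("fork")


def executor(result_queue, func, args):
    """Hypothetical child entry point: run the task and report its result."""
    result_queue.put(func(*args))


def run_with_timeout(func, args=(), timeout=1.0, grace=2.0):
    """Run func(*args) in a child process and give up after `timeout` seconds.

    Returns (status, result, still_alive). still_alive is True only if the
    child survived terminate() for `grace` seconds.
    """
    result_queue = ctx.Queue()
    proc = ctx.Process(target=executor, args=(result_queue, func, args))
    proc.start()
    try:
        # Wait for the child's result; queue.Empty signals the timeout.
        result = result_queue.get(timeout=timeout)
    except queue.Empty:
        proc.terminate()      # sends SIGTERM on POSIX
        proc.join(grace)      # give the child a moment to exit
        return "TIMEOUT", None, proc.is_alive()
    proc.join()
    return "COMPLETED", result, False


def quick_task(x):
    """Illustrative task that finishes immediately."""
    return x * 2


def slow_task(seconds):
    """Illustrative task that outlives the timeout."""
    time.sleep(seconds)


if __name__ == "__main__":
    print(run_with_timeout(quick_task, (21,), timeout=5.0))
    print(run_with_timeout(slow_task, (30,), timeout=0.5))
```

In this sketch, a child that is still alive after terminate() and the grace period would show up as still_alive == True, which corresponds to the leftover processes described above. Checking is_alive() after terminate() is one way to notice such a child instead of silently moving on to the next task.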
Back to my pstree output, with additional comments added as an example:

bash(16731)    // my shell
\---python2.7(24545)    // scheduler.py (-K)
\-+-python2.7(24564)---{python2.7}(24565)    // idling worker
|-python2.7(24572)    // worker with picked task
\-python2.7(1110)    // still waiting for semaphore (TIMEOUT)
\-python2.7(8647)    // still waiting for semaphore (TIMEOUT)
\-python2.7(11747)    // still waiting for semaphore (TIMEOUT)
\-python2.7(14117)    // runs the actual task (RUNNING)
\-python2.7(14302)    // still waiting for semaphore (TIMEOUT)

The actual reason is "waiting for a semaphore". But why? And of course, in all probability it is not a problem of the scheduler.py code itself ;)

Thanks again for your patience.

Erwn