Re: [web2py] Re: Scheduler: help us test it while learning

Yarin Mon, 20 Aug 2012 14:41:19 -0700

OK, I've come around. I agree this is the right setup; let's just make sure 
it's clear in the eventual documentation, as it wasn't obvious to me (not 
much is these days..): both retries and repeats respect the period. Cool, 
I like it.
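
To make "both retries and repeats respect the period" concrete, here is a toy model of the timing. It is not the scheduler's code: the function name `simulate` and the stop rules (repeats=0 meaning unlimited runs, retry_failed=-1 meaning unlimited retries, a task stopping once its failures exceed retry_failed) are assumptions based on how the parameters are used in this thread.

```python
def simulate(outcomes, start, period, repeats=0, retry_failed=0):
    """Toy model: return a (time, status) pair for each attempt.

    outcomes: iterable of booleans, True meaning the run succeeded.
    Assumed stop rules (not taken from the scheduler source):
      - repeats > 0: stop after that many successful runs (0 = unlimited)
      - retry_failed >= 0: stop once failures exceed it (-1 = unlimited)
    """
    timeline = []
    now, completed, failed = start, 0, 0
    for ok in outcomes:
        if ok:
            completed += 1
            timeline.append((now, "COMPLETED"))
            if repeats and completed >= repeats:
                break
        else:
            failed += 1
            timeline.append((now, "FAILED"))
            if retry_failed >= 0 and failed > retry_failed:
                break
        # Whether the last attempt failed or completed, the next one
        # is never started sooner than `period` seconds later.
        now += period
    return timeline


# repeats=2, retry_failed=3, period=3600, first attempt at t=0
for when, status in simulate([False, False, True, False, True],
                             start=0, period=3600,
                             repeats=2, retry_failed=3):
    print(when, status)
```

In this model a failed attempt and a completed one are followed by the same wait, so a retry never runs a few seconds after the failure.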



On Sunday, August 19, 2012 7:13:15 AM UTC-4, Niphlod wrote:
>
> I didn't say that you have to handle exceptions exclusively in your 
> functions, but that if you want functionality of the kind "execute this 
> for the next 2 minutes and retry ASAP 3 times at most" and you still want 
> to have a single scheduler_task record, it's the way to go. Sometimes your 
> functions rely on third-party services whose failures can't be resolved 
> inside the function: you can catch the exception but you still want to 
> execute that function (e.g. you want to send an email but your email server 
> doesn't reply). The mail should be sent anyway, possibly as soon as the 
> email server is available again... that's where retry_failed comes in 
> handy. Of course, if it fails e.g. 10 times, it's better to stop trying and 
> inspect the email server :P
>
> On Saturday, August 18, 2012 11:48:41 PM UTC+2, Yarin wrote:
>>
>> OK, I didn't understand that retries happened periodically. I indeed 
>> thought that it would retry right away, though I agree with you that that 
>> should be handled at the function level. But if we're handling failures 
>> within the scheduled function, then now I'm wondering what the value is in 
>> having retries at all. Just because the scheduler is running asynchronously 
>> does not mean it should necessarily be responsible for the scheduled 
>> functions' unhandled exceptions (which is what the failures are, right?). In 
>> other words, since our scheduler is scheduling web2py functions in a known 
>> environment (unlike environment-agnostic task-queue systems, which don't 
>> know how their operations will resolve), shouldn't the onus be on the 
>> scheduled function to handle failures and reschedule if necessary? Maybe we 
>> should clarify this before discussing the rest; I may be missing something.
>>
>> BTW, I'm leaving for the night but am interested in finishing the 
>> discussion. I'll be back in the morning if you don't hear from me.
>>
>> On Saturday, August 18, 2012 5:14:46 PM UTC-4, Niphlod wrote:
>>>
>>> OK, got the example (but not the "the last go-round is forgotten" part).
>>> Let's start by saying that your requirements can be fulfilled (simply) by 
>>> wrapping your function in a loop that breaks after the first successful 
>>> attempt (with repeats=0, retry_failed=-1). Given that, the current 
>>> behaviour doesn't really limit what you are trying to achieve; it's only a 
>>> matter of how to implement the requeue facilities in the scheduler.
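
A minimal sketch of the loop-and-break approach described above, in plain Python. The helper name `run_until_success` and its `max_attempts` and `delay` parameters are hypothetical and not part of web2py; the scheduled function would call the helper around the part that can fail.

```python
import time


def run_until_success(func, max_attempts=3, delay=0.0):
    """Call func until it returns without raising, at most max_attempts times.

    Hypothetical helper for illustration; not a web2py API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()  # first success: leave the loop immediately
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay)  # optional pause before trying again
    # Every attempt failed: re-raise so the caller still sees a failure.
    raise last_error
```

Re-raising after the last attempt means the task is still recorded as failed when nothing worked, so retry_failed and period continue to apply on top of the inner loop.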
>>>
>>> Let's keep the discussion open... if I got it correctly, you're basically 
>>> asking to ignore period for failed tasks (requeue them and execute ASAP) 
>>> and reset counters accordingly... right? Period right now ensures that no 
>>> more than one task gets executed every period seconds (it protects you 
>>> from "flapping", i.e. a continuously failing function, and is somewhat 
>>> required e.g. for respecting web service API limits, avoiding "db 
>>> pressure" if you're doing heavy operations, etc.). 
>>> Respecting period in every case is "consistency" for me (because I 
>>> decided that I can "afford" (or "consume" resources for) executing that 
>>> function only once every hour). 
>>> You are suggesting altering this for repeating tasks... what I didn't 
>>> get is whether that is required always or only when repeats=0 (which is, 
>>> incidentally, not consistent :P)?!
>>>  
>>> I.e., what behaviour would you expect from (repeats=2, retry_failed=3, 
>>> period=3600)? 
>>> 2.00 am, failed
>>> 2.00 am, failed
>>> 2.00 am, completed
>>> 3.00 am, failed
>>> 3.00 am, failed
>>> 3.00 am, failed 
>>> ? 
>>> This is basically what I'm missing. What could possibly be wrong at 
>>> 2.00 am and be right a few seconds later?
>>>
>>> On Saturday, August 18, 2012 10:32:14 PM UTC+2, Yarin wrote:
>>>>
>>>> I think retry_failed and repeats are two distinct concepts and 
>>>> shouldn't be mixed.
>>>>
>>>> For example, a task set to (repeats=0, retry_failed=0, period=3600) 
>>>> should be able to fail at 2:00pm, but will try again at 3:00pm regardless 
>>>> of what happened at 2:00. Likewise, if it was set to (repeats=0, 
>>>> retry_failed=2, period=3600), and failed all three times at 2:00pm, the 
>>>> retry count should be reset on the next go-round. 
>>>>
>>>> I think it's safer to presume that if a task is set up for indefinite 
>>>> repetition, a failure on one repeat should not bring down the whole task; 
>>>> rather, the transactional unit that constitutes a failure should be 
>>>> limited to any given attempt, repeated or not.
>>>>
>>>> This was one of the reasons I pressed for renaming repeats_failed to 
>>>> retry_failed: they are distinct concepts.
>>>>
>>>>
