OK, I didn't understand that retries happen periodically. I did think 
that it would retry right away, though I agree with you that that should be 
handled at the function level. But if we're handling failures within the 
scheduled function, then now I'm wondering what the value is in having 
retries at all. Just because the scheduler runs asynchronously does 
not mean it should necessarily be responsible for the scheduled functions' 
unhandled exceptions (which is what the failures are, right?). Maybe we 
should clarify this before discussing the rest. I may be missing something.
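
To make sure we mean the same thing by "handled at the function level", 
here is a rough sketch of how I picture it: a plain Python decorator that 
retries right away and stops at the first success. The names 
(retry_in_place, attempts, delay, flaky_task) are made up for illustration 
and are not part of the scheduler.

```python
import functools
import time


def retry_in_place(attempts=3, delay=0):
    """Retry the wrapped function immediately, up to `attempts` times.

    Hypothetical helper, not part of web2py. The task itself would be
    queued with repeats=0, retry_failed=-1 as suggested in the thread.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    # The first successful call ends the loop.
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        # Out of attempts: let the exception reach the
                        # scheduler, which then records the run as failed.
                        raise
                    if delay:
                        time.sleep(delay)
        return wrapper
    return decorator


# Toy task that fails twice and then succeeds.
calls = {"n": 0}


@retry_in_place(attempts=3)
def flaky_task():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"
```

With something like this in place, the scheduler only ever sees an 
exception once every immediate attempt has failed, which is why I'm 
unsure what retry_failed would add on top.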

By the way, I'm leaving for the night but am interested in finishing the 
discussion. I'll be back in the morning if you don't hear from me before then.

On Saturday, August 18, 2012 5:14:46 PM UTC-4, Niphlod wrote:
>
> Ok, got the example (but not the "the last go-round is forgotten").
> Let's start by saying that your requirements can be fulfilled (simply) by 
> decorating your function with a loop that breaks after the first successful 
> attempt (with repeats=0, retry_failed=-1). Given that, the current behaviour 
> is not really a limit on what you are trying to achieve; it's only a 
> matter of how to implement the requeue facilities in the scheduler.
>
> Let's keep the discussion open... if I got it correctly you're basically 
> asking to ignore period for failed tasks (requeue them and execute ASAP) 
> and reset counters accordingly... right? Period right now ensures that no 
> more than one task gets executed every period seconds (it protects you from 
> "flapping", i.e. a continuously failing function, and is somewhat required 
> e.g. for respecting webservice API limits, avoiding "db pressure" if you're 
> doing heavy operations, etc.). 
> Respecting period in every case is "consistency" for me (because I decided 
> that I can "afford" (or "consume" resources for) executing that function only 
> one time every hour). 
> You are suggesting to alter this for repeating tasks... what I didn't get 
> is whether that is required always or only when repeats=0 (which is, 
> incidentally, not consistent :P) ?!
>  
> I.e., what behaviour would you expect from (repeats=2, retry_failed=3, 
> period=3600) ? 
> 2.00 am, failed
> 2.00 am, failed
> 2.00 am, completed
> 3.00 am, failed
> 3.00 am, failed
> 3.00 am, failed 
> ? 
> This is basically what I'm missing. What could possibly be wrong at 2.00am 
> and be right a few seconds later ?
>
> On Saturday, August 18, 2012 10:32:14 PM UTC+2, Yarin wrote:
>>
>> I think retry_failed and repeats are two distinct concepts and shouldn't 
>> be mixed.
>>
>> For example, a task set to (repeats=0, retry_failed=0, period=3600) 
>> should be able to fail at 2:00pm and still try again at 3:00pm regardless 
>> of what happened at 2:00. Likewise, if it was set to (repeats=0, 
>> retry_failed=2, period=3600), and failed all three times at 2:00pm, the 
>> retry count should be reset on the next go-round. 
>>
>> I think it's safer to presume that if a task is set up for indefinite 
>> repetition, a failure on one repeat should not bring down the whole task. 
>> Rather, the transactional unit that constitutes a failure should be limited 
>> to any given attempt, repeated or not.
>>
>> This was one of the reasons I pressed for renaming repeats_failed to 
>> retry_failed: they are distinct concepts.
>>
>>
