I didn't say that you must handle exceptions exclusively in your functions, but that if you want functionality of the kind "execute this for the next 2 minutes and retry ASAP 3 times at most", while still keeping a single scheduler_task record, it's the way to go. Sometimes your functions rely on third-party services whose failures can't really be handled inside the function: you can catch the exception, but you still want that function to be executed (e.g. you want to send an email but your email server doesn't reply). The mail should be sent anyway, possibly as soon as the email server is available again... here's where retry_failed comes in handy. Of course, if it fails e.g. 10 times, it's better to stop trying and inspect the email server :P
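To make the mail-server scenario concrete, here is a minimal standalone sketch of the retry semantics in plain Python. This is not the web2py scheduler itself; `send_with_retries` and `flaky_send` are hypothetical stand-ins for a task being re-attempted against a server that recovers after a couple of failures:

```python
import time

def send_with_retries(send, max_retries=3, delay=0):
    """Try `send` up to max_retries + 1 times; return True on first success.

    Mimics retry_failed=3: the task runs once and is re-attempted up to
    three more times if it raises, before giving up for good.
    """
    for attempt in range(max_retries + 1):
        try:
            send()
            return True
        except Exception:
            if attempt == max_retries:
                return False
            time.sleep(delay)  # in the scheduler this gap is the requeue wait

# Hypothetical flaky mail server: fails twice, then succeeds.
calls = {"n": 0}

def flaky_send():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("mail server not responding")

assert send_with_retries(flaky_send) is True
assert calls["n"] == 3  # two failures absorbed, third attempt delivered
```

The point of the sketch is only the counting: with retry_failed the extra attempts happen at the scheduler level, so the function itself stays a single plain "send one mail" unit.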
On Saturday, August 18, 2012 11:48:41 PM UTC+2, Yarin wrote:
>
> OK, I didn't understand that retries happened periodically - I indeed
> thought that it would retry right away, though I agree with you that that
> should be handled at the function level. But if we're handling failures
> within the scheduled function, then now I'm wondering what the value is in
> having retries at all? Just because the scheduler is running asynchronously
> does not mean it should necessarily be responsible for the scheduled
> functions' unhandled exceptions (which is what the failures are, right)? In
> other words, since our scheduler is scheduling web2py functions in a known
> environment (unlike environment-agnostic task-queue systems, which don't
> know how their operations will resolve), shouldn't the onus be on the
> scheduled function to handle failures and reschedule if necessary? Maybe we
> should clarify this before discussing the rest - I may be missing something.
>
> BTW, I'm leaving for the night but am interested in finishing the
> discussion - I'll be back in the morning if you don't hear from me.
>
> On Saturday, August 18, 2012 5:14:46 PM UTC-4, Niphlod wrote:
>>
>> OK, got the example (but not the "the last go-round is forgotten" part).
>> Let's start by saying that your requirements can be fulfilled (simply) by
>> decorating your function in a loop and breaking after the first successful
>> attempt (with repeats=0, retry_failed=-1). Given that, the current behaviour
>> is not properly a limit to what you are trying to achieve; it's only a
>> matter of how to implement the requeue facilities in the scheduler.
>>
>> Let's keep the discussion open... if I got it correctly, you're basically
>> asking to ignore period for failed tasks (requeue them and execute ASAP)
>> and reset counters accordingly... right? Period right now ensures that no
>> more than one task gets executed every period seconds (and protects you
>> from "flapping", i.e. a continuously failing function, and is somewhat
>> required, e.g. for respecting webservice API limits, avoiding "db
>> pressure" if you're doing heavy operations, etc.).
>> Respecting period in every case is "consistency" for me (because I
>> decided that I can "afford" (or "consume" resources) executing that
>> function only once every hour).
>> You are suggesting to alter this for repeating tasks... what I didn't get
>> is whether that is required always or only when repeats=0 (which is,
>> incidentally, not consistent :P)?!
>>
>> i.e. what behaviour should you expect from (repeats=2, retry_failed=3,
>> period=3600)?
>> 2.00 am, failed
>> 2.00 am, failed
>> 2.00 am, completed
>> 3.00 am, failed
>> 3.00 am, failed
>> 3.00 am, failed
>> ?
>> This is basically what I'm missing. What could possibly be wrong at
>> 2.00 am and be right a few seconds later?
>>
>> On Saturday, August 18, 2012 10:32:14 PM UTC+2, Yarin wrote:
>>>
>>> I think retry_failed and repeats are two distinct concepts and shouldn't
>>> be mixed.
>>>
>>> For example, a task set to (repeats=0, retry_failed=0, period=3600)
>>> should be able to fail at 2:00 pm but still try again at 3:00 pm,
>>> regardless of what happened at 2:00. Likewise, if it was set to
>>> (repeats=0, retry_failed=2, period=3600) and failed all three times at
>>> 2:00 pm, the retry count should be reset on the next go-around.
>>>
>>> I think it's safer to presume that if a task is set up for indefinite
>>> repetition, a failure on one repeat should not bring down the whole
>>> task - rather, the transactional unit that constitutes a failure should
>>> be limited to any given attempt, repeated or not.
>>>
>>> This was one of the reasons I pressed for renaming repeats_failed to
>>> retry_failed - distinct concepts.
>>>
>>> --
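Niphlod's "decorate your function in a loop and break after the first successful attempt" suggestion can be sketched like this. This is a hypothetical helper, not part of the scheduler API: the retries live inside the task itself, so the scheduler only ever sees one success or one failure per run:

```python
import functools

def retry_until_success(max_attempts=3):
    """Decorator sketch: loop inside the task and return on the first
    successful attempt, so a single scheduler run absorbs transient
    failures. If every attempt fails, re-raise the last exception so
    the scheduler records exactly one failed run."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
            raise last_exc
        return wrapper
    return deco

# Hypothetical task that fails once, then succeeds.
attempts = []

@retry_until_success(max_attempts=3)
def fetch():
    attempts.append(1)
    if len(attempts) < 2:
        raise RuntimeError("transient error")
    return "ok"

assert fetch() == "ok"
assert len(attempts) == 2  # one transient failure absorbed in-function
```

With this in place the scheduler-side counters can stay at repeats=0 and retry_failed=-1 as Niphlod suggests, and period is respected because each scheduled run is still a single task execution from the scheduler's point of view.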