Hey, I don't have a problem with that, I'm just saying the lock-and-read mechanism can increase the overhead/latency significantly in certain setups. If a two-file approach is a problem, would you consider avoiding the cPickle read if I can find a way to do it by setting ctime/mtime (we set an mtime older than ctime when the cron starts, and when we finish, we set mtime to now())?
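
To make the timestamp idea concrete, here is a minimal sketch. It is not web2py's cron code: the function names and the sentinel value are hypothetical. It also assumes a variant of the idea, since os.utime() only sets atime and mtime and st_ctime means creation time on Windows, so the check below compares mtime against a fixed sentinel instead of against ctime.

```python
import os
import time

# Hypothetical sentinel: an mtime this old means "crondance still running".
RUNNING_MTIME = 0.0


def mark_started(path):
    """Create the marker if needed and give it a deliberately old mtime."""
    with open(path, "a"):
        pass
    # os.utime takes (atime, mtime); ctime cannot be set this way.
    os.utime(path, (time.time(), RUNNING_MTIME))


def mark_finished(path):
    """Set atime and mtime to the current time."""
    os.utime(path, None)


def is_running(path):
    """A single stat() call: no locking and no unpickling."""
    try:
        return os.stat(path).st_mtime <= RUNNING_MTIME
    except FileNotFoundError:
        return False
```

On POSIX systems the utime call itself updates st_ctime, so st_ctime would roughly give the start of the run and st_mtime the completion time, but that part would need checking on every platform we support.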
Alternatively, if it turns out that can't be done on all platforms and we only need to check when it started and whether it is running, we could simplify the running checks by using file size. When the cron starts, we create the file with zero size, and when it finishes we just write a space to it. That way we can check whether it's running or not just by looking at the file size (a lot cheaper than reading/unpickling). The bottom line is that I would like to avoid having to do locking for the read-checks (we write only once a minute, but read quite a bit more as we scale upwards, like on a shared volume or on a busy site with softcron).

On Apr 12, 11:27 pm, mdipierro <mdipie...@cs.depaul.edu> wrote:
> On Apr 12, 1:19 pm, AchipA <attila.cs...@gmail.com> wrote:
>
> > Why do we need the time range ? If the tasks are overlapping it's
> > their responsibility to handle that (I know this is arguable, but
> > that's how 'standard' cron works).
>
> This is also as it works in newcron. The problem is that if for any
> reason, the main process that loops and spawns the tasks gets stuck, it
> may give rise to a proliferation of processes that may crash the os.
> The current mechanism is similar to the one you originally implemented
> but you used an additional file to determine if the cron was
> completed. I use the completion date.
>
> > Also, we can easily store two
> > timestamps (slightly hackish, but mtime and ctime can be set
> > separately), would have to check whether that is supported on all
> > platforms. Of course there are many other ways of reading data without
> > opening files, I'm just pondering about alternatives as the current
> > locking mechanism causes some problems on my shared-volume based
> > multi-server setup (that's why I used 'move' originally as it's atomic
> > and works well with network shares).
>
> > On Apr 12, 5:12 pm, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
> > > Because the os timestamp can only tell you when a task has
> > > started (or stopped, depending on when it was created) it does not
> > > contain enough information to give you a time range (start and stop).
> > > Cron needs to know when the previous crondance started and whether it
> > > was completed or not. The original implementation was doing the check
> > > using locks and that resulted in a large number of try... except...
> > > The current implementation removes most of the try... except... (people
> > > complained about that) and just stores start_time, stop_time
> > > explicitly in a pickle.
>
> > > On Apr 12, 8:00 am, AchipA <attila.cs...@gmail.com> wrote:
>
> > > > To correct myself, it seems the cron in web2py no longer uses the
> > > > filesystem timestamps, but cPickles timestamps from/to the lock file.
> > > > I'm not sure why Massimo changed it, but this *is* a bigger overhead
> > > > than it was previously (as it needs to do file locking and
> > > > cPickle.load() on every single request - as opposed to a simple cached
> > > > non-locking filesystem call).
>
> > > > On Apr 1, 8:20 pm, AchipA <attila.cs...@gmail.com> wrote:
>
> > > > > Exactly, hardcron checks once a minute, softcron checks on each page
> > > > > load. The 'check' is calling a function or two and comparing a file's
> > > > > timestamp, so not *that* much more expensive.
>
> > > > > On Apr 1, 7:51 pm, Jonathan Lundell <jlund...@pobox.com> wrote:
>
> > > > > > On Apr 1, 2010, at 10:37 AM, AchipA wrote:
>
> > > > > > > There is some overhead, but efficiency is a disputable term -
> > > > > > > there is certainly more overhead than hardcron, but IMO not in
> > > > > > > a way that would affect overall performance unless you're
> > > > > > > running it on a site that has hundreds of thousands of hits
> > > > > > > per day...
>
> > > > > > Perhaps we could change (or eliminate) the wording. How about
> > > > > > simply 'Using softcron'?
>
> > > > > > I'm curious: what is the extra overhead of soft vs hardcron? Just
> > > > > > that it does a test on each page access? I'm guessing that's pretty
> > > > > > cheap.
>
> > > > > > > On Apr 1, 5:40 pm, Jonathan Lundell <jlund...@pobox.com> wrote:
> > > > > > >> Section 4.17 (cron) mentions hard vs softcron defaults, but
> > > > > > >> doesn't say how to override them.
>
> > > > > > >> Section 4.1 (cli) doesn't list --softcron
>
> > > > > > >> The startup message for softcron says: 'Using softcron (but this
> > > > > > >> is not very efficient)'
>
> > > > > > >> In what sense "not efficient"? I understand that the timing is
> > > > > > >> less consistent, but is there really more overhead? softcron
> > > > > > >> seems like a pretty reasonable choice if all you're doing is
> > > > > > >> deleting expired sessions.
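
For reference, the file-size check proposed at the top of this message could be sketched roughly as follows. This is only an illustration under my own assumptions: the function names are hypothetical and nothing here is taken from the existing web2py cron code.

```python
import os


def cron_started(path):
    """Create or truncate the marker: size 0 means "running"."""
    with open(path, "wb"):
        pass


def cron_finished(path):
    """Write a single space: size 1 means "completed"."""
    with open(path, "wb") as marker:
        marker.write(b" ")


def cron_is_running(path):
    """One stat() call, no locking, no cPickle.load()."""
    try:
        return os.stat(path).st_size == 0
    except FileNotFoundError:
        # No marker at all: treat it as "not running".
        return False
```

The same stat() result also carries st_mtime, so a reader could still get a timestamp from it without opening the file.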