On 01-Jun-11 05:46, Benny Lofgren wrote:
On 2011-05-31 14.45, Artur Grabowski wrote:
The load average is a decaying average of the number of processes in
the runnable state or currently running on a cpu or in the process of
being forked or that have spent less than a second in a sleep state
with sleep priority lower than PZERO, which includes waiting for
memory resources, disk I/O, filesystem locks and a bunch of other
things. You could say it's a very vague estimate of how much work the
cpu might need to be doing soon, maybe. Or it could be completely
wrong because of sampling bias. It's not very important so it's not
really critical for the system to do a good job guessing this number,
so the system doesn't really try too hard.

This number may tell you something useful, or it might be totally
misleading. Or both.
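
To make the "decaying average" concrete, here is a rough sketch of how such a number can be maintained. It assumes the kernel samples the count of runnable processes every 5 seconds and folds it into a 1-minute exponentially decaying average; the interval, the window and the function names are illustrative assumptions, not a copy of any kernel's code.

```python
import math

SAMPLE_INTERVAL = 5.0   # seconds between samples (assumed)
WINDOW = 60.0           # averaging window for the 1-minute figure


def update_load(load, nrunnable, interval=SAMPLE_INTERVAL, window=WINDOW):
    """Fold one sample of the runnable-process count into the average."""
    decay = math.exp(-interval / window)
    # Old value fades by `decay`; the new sample fills in the rest.
    return load * decay + nrunnable * (1.0 - decay)


def simulate(samples, load=0.0):
    """Feed a sequence of sampled counts through the decaying average."""
    for n in samples:
        load = update_load(load, n)
    return load
```

Only what is visible at the sampling instant counts. A burst of work that starts and finishes between two samples never shows up, which is one way the sampling bias mentioned above creeps in.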

One thing that often bites me in the butt is that cron relies on the
load average to decide if it should let batch(1) jobs run or not.

The default is if cron sees a load average > 1.5 it keeps the batch job
enqueued until it drops below that value. As I often see much, much
higher loads on my systems, invariably I find myself wondering why my
batch jobs never finish, just to discover that they have yet to run.
*duh*
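
The gating described here boils down to a single comparison. This is a minimal sketch, not cron's actual code: it assumes the check uses the 1-minute figure, and `batch_may_run` and `DEFAULT_MAX_LOAD` are made-up names for illustration.

```python
import os

DEFAULT_MAX_LOAD = 1.5  # the default threshold mentioned above


def batch_may_run(max_load=DEFAULT_MAX_LOAD, load=None):
    """Return True if a queued batch job would be released."""
    if load is None:
        # os.getloadavg() returns the 1-, 5- and 15-minute averages.
        load = os.getloadavg()[0]
    # The job stays queued while the load is above the threshold.
    return load <= max_load
```

On a machine that routinely sits at a load of 4 or more, `batch_may_run()` with the default threshold never returns True, which is exactly the "jobs have yet to run" surprise.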

So whenever I remember to, on every new system I set up I configure a
different load threshold value for cron. But I tend to forget, so...
:-)

I have no really good suggestion for how else cron should handle this,
otherwise I would have submitted a patch ages ago...


I had tinkered with a solution for this:
Cron wakes up a minute before the batch job is scheduled to run. It then copies a random 4 KB sector from the hard disk into RAM and runs either an MD5 or SHA hash against it. The whole process is timed, and if it completes within a reasonable amount of time for the system, cron kicks off the batch job.

This was the easiest way I could think of to measure the actual performance of the system at any given time, since it exercises the disk, memory and CPU together and closely emulates actual work.

While this isn't really the right thing to do, I found it to be the most effective on my systems.
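
For anyone curious, the probe looks roughly like the sketch below. It is a reconstruction under assumptions, not the code I ran: it reads from a regular file rather than the raw disk, and the names `timed_probe`, `ok_to_run_batch` and `max_seconds` are invented here. What counts as "reasonable" has to be tuned per system.

```python
import hashlib
import os
import random
import time

BLOCK_SIZE = 4096  # the 4 KB block mentioned above


def timed_probe(path, algorithm="sha256", block_size=BLOCK_SIZE):
    """Read one random block from `path`, hash it, return elapsed seconds."""
    size = os.path.getsize(path)  # assumes a regular file, not a device node
    nblocks = max(size // block_size, 1)
    offset = random.randrange(nblocks) * block_size
    start = time.perf_counter()
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(block_size)
    hashlib.new(algorithm, data).hexdigest()
    return time.perf_counter() - start


def ok_to_run_batch(path, max_seconds, algorithm="sha256"):
    """Release the batch job only if the probe finished quickly enough."""
    return timed_probe(path, algorithm) <= max_seconds
```

One caveat: reads from a regular file are often served from the buffer cache, so the disk part of the measurement only means something if the probe goes to a large file or to the raw device.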
