Hi,

If it is not the processes, maybe it is the hard disk or the filesystem?

Several users writing log output to many different files at the same time can
badly hurt the performance of the whole operating system if the hard disk
isn't up to it.
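
One rough way to test the disk theory is to compare two samples of the
aggregate "cpu" line in /proc/stat and see how much of the elapsed CPU time
was spent waiting on I/O. Below is a minimal Python sketch, assuming a Linux
/proc/stat where iowait is the fifth counter on that line (as on 2.6 kernels).
The function names and the 5 second default interval are my own choices, not
anything Cadence or Red Hat ships.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    return [int(x) for x in parts[1:]]


def iowait_fraction(before, after):
    """Share of elapsed CPU time spent in iowait between two samples.

    Only the first eight counters (user, nice, system, idle, iowait, irq,
    softirq, steal) are summed, because newer kernels already include the
    guest counters in user/nice.
    """
    deltas = [b - a for a, b in zip(before, after)][:8]
    if len(deltas) < 5:
        raise ValueError("cpu line has no iowait counter")
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return float(deltas[4]) / total  # index 4 is iowait


def read_cpu_sample(path="/proc/stat"):
    """Read the current aggregate counters (assumes a Linux /proc)."""
    f = open(path)
    try:
        return parse_cpu_line(f.readline())
    finally:
        f.close()


def measure(interval=5):
    """Sample twice, `interval` seconds apart, and return the iowait share."""
    before = read_cpu_sample()
    time.sleep(interval)
    return iowait_fraction(before, read_cpu_sample())
```

Calling measure() while the machine is choking gives a number between 0 and 1.
If it stays near zero, the disk is probably not the bottleneck; a consistently
high value would point at storage rather than the scheduler.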

As I recall from my university sessions :) Cadence tends to write hundreds of
small files and then bind them all into one big simulation file. I'm not sure
whether that is tweakable - I wasn't the admin back then :)

On Tuesday 10 June 2008 20:14:27 Ira Abramov wrote:
> still at the client with the VLSI tools. Some of the users here are
> running heavy simulations (all userspace, almost 0 kernel time), at
> times a single process can hog the entire system. I have no idea how
> that happens, as this is a fairly modern kernel (the slightly older
> scheduler of RHEL4's 2.6.9) and the Cadence tools are not using
> lightweight procs, so all the load is on a single core (on a quad Xeon)
> and yet once it starts the whole machine is choked, and I can only hit
> the reset.
>
> step 1: I asked them all to nice down the jobs, but they are not very
> happy to. I'm trying to educate them and make them use wrappers (I'm
> introducing condor here anyway)
>
> step2: I have set up the root's .bashrc to renice me up to -4 and so I
> can keep a session active for the next time this happens and at least be
> able to run "top" and "kill"
>
> step3: I need a monitor to alert and maybe kill or renice such processes
> when they pop up and drag the machine down to a halt. till I find out
> who the culprit is, I don't have a procname and so "monit" is not a good
> choice.  any other good ideas?
>
> step4: how do I log this without overlogging? some sort of a smart
> process auditing daemon? I don't want to improvise with shell scripts
> and cron, grepping from PS, because when the excrement impacts the venta
> it may not be able to run (unless I hike the crond's priority to a
> negative nice). I need a small reliable C proggy to do the right thing.
>
> the obvious is maybe to set some ulimits on the users, but I don't want
> to limit heavy processes that do NOT choke the system.



-- 
Noam Rathaus
CTO
[EMAIL PROTECTED]
http://www.beyondsecurity.com

"Know that you are safe."

Beyond Security Finalist for the "Red Herring 100 Global" Awards 2007
