> I actually find that running Wusage 8.0 a few times, even at nice 19,
> may be implicated in getting the system to spiral downwards. I hesitate
> to mention this as it seems to be working fine on another 7.X server. I
> believe that Wusage is tied to 6.X libraries and I wonder if somehow
> this
>> last pid: 46013;  load averages: 105.30, 67.67, 34.45  up 4+23:59:42  19:08:40
>> 629 processes: 89 running, 540 sleeping
>> CPU: 21.9% user, 0.0% nice, 74.5% system, 3.1% interrupt, 0.4% idle
>> Mem: 1538M Active, 11G Inact, 898M Wired, 303M Cache
> Is the high load average simply a function of processes blocking on
> network I/O? Our av/spam scanners, for example, show a high load
> average because many processes are waiting on network I/O to complete
> (e.g. talking to RBL lists, waiting for DCC servers to respond, etc.).
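(One quick way to tell -- a sketch, using only the stock FreeBSD tools;
adjust to taste:

  # include system processes/threads and watch the STATE column:
  # RUN/CPU means runnable, sbwait/select means blocked on the network,
  # ufs means stuck waiting in the filesystem
  top -SH

  # or dump every process with its state and wait channel
  ps -axo pid,state,wchan,command
)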
Just to confirm: we see something similar on the box which runs our stats.
We have updated from 5.4 -> 6.0 -> 6.2 -> 7.0; none of these updates has
had any effect on the lockups which happen when the stats run.
This box is also on an Areca controller, but it was on an Adaptec before
and we saw pretty much the same thing, so the controller does not seem
to be the cause.
> > [ns8]# vmstat -i
> > interrupt                          total     rate
> > irq4: sio0                         57065        0
> > irq17: em1                 3989494045554
> > irq18: arcmsr0                 558098657       77
> > cpu0: timer                  14381393929       20
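(Those totals are cumulative since boot; if an interrupt storm on em1 is
suspected, the live rates are more telling -- a sketch:

  # refresh interrupt (and other) rates once per second
  systat -vmstat 1

  # or just diff two snapshots taken ten seconds apart
  vmstat -i; sleep 10; vmstat -i
)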
> What does top -S show? Most of the load is in system. Does the
> machine in question have a rather large master.passwd file by chance?
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=75855)
> ---Mike
>
Thanks for your quick reply:
master.passwd is only 9467 bytes (per ls -l).
top -ISM at t
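(A cheap way to check whether passwd lookups themselves are slow, per
the PR above -- a sketch, nothing Wusage-specific:

  # enumerate the whole user database and time it
  time pw usershow -a > /dev/null

  # a single lookup, for comparison
  time id root > /dev/null
)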
At 05:29 PM 12/15/2008, Paul MacKenzie wrote:
>> The next thing I am doing is going to be removing the QUOTA feature
>> to see if this has any bearing on the problem. It does not appear to
>> be writing at a heavy load, as you can see (almost nothing), but the
>> processes are mostly in UFS when it spirals out of control.
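(For anyone following along, backing quotas out is usually -- a sketch,
assuming they were enabled the stock way via rc.conf and fstab:

  # in /etc/rc.conf
  enable_quotas="NO"
  check_quotas="NO"

  # drop userquota/groupquota from the /etc/fstab entries, then:
  quotaoff -a

Removing 'options QUOTA' from the kernel config needs a rebuild.)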
>
> I would also try disabling polling. Is your scheduler ULE or 4BSD?
> For an 8-core box, it should be ULE.
>
> ---Mike
Hi Mike,
Thanks, I will try this now as I have not tried it yet.
Here is the current custom kernel and it is using ULE:
cpu HAMMER
ident MYCOMP
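(The quick checks here -- a sketch; em0/em1 stand in for whatever the
lagg is built from:

  # confirm which scheduler the running kernel uses
  sysctl kern.sched.name

  # polling is per-interface on 7.x; turn it off like so
  ifconfig em0 -polling
  ifconfig em1 -polling

and make sure 'options DEVICE_POLLING' isn't still compiled in if it
turns out to be implicated.)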
> I would try the change to /etc/nsswitch.conf so that group and passwd
> read
>
> group: files
> passwd: files
>
> At that file size, it sounds like you only have about 200 entries? I
> doubt it's the issue, but it's worth a try. I know at around 9,000
> files anything to do with UID lookups (e.g
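(If master.passwd does get touched while testing, remember that lookups
actually hit the hashed databases, not the flat file -- a sketch:

  # rebuild /etc/passwd plus pwd.db/spwd.db from master.passwd
  pwd_mkdb -p /etc/master.passwd
)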
At 12:50 PM 12/15/2008, Paul MacKenzie wrote:
I have polling, quotas, and lagg enabled on both of the systems and have
tried to make them as similar as possible in their setup.
At 02:58 PM 12/15/2008, Paul MacKenzie wrote:
This used to be on a 4.11x system with 1 CPU and only 1 GB of RAM, and
it ran flawlessly for a long time with far fewer resources and the same
web site code. I do not have this problem on the other 7.0 machine. I
originally thought it was just a CPU issue
At 12:50 PM 12/15/2008, Paul MacKenzie wrote:
> Any suggestions on where to look next? Are there obvious candidates?
After weeks of working on this I now believe that anything that taxes
writes to the hard drives causes the system CPU numbers to spike
through the roof (approx. 80% usage).
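(When it is in that state, watching the disks next to top is a useful
cross-check -- a sketch:

  # per-device GEOM statistics: busy%, queue length, latency
  gstat

  # simple per-device throughput, one-second samples
  iostat 1

  # and system processes sorted by CPU, to see who owns the system time
  top -S
)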
David Kelly wrote:
On Dec 1, 2008, at 11:45 PM, Jan Mikkelsen wrote:
> Replying to my own post ...
>
> I have done a test on the same machine comparing 6.3-p1 to 7.1-PRE.
> The performance is the expected ~6MB/s (because of the lack of cache)
> on 6.3-p1, so the BIOS change doesn't seem to be at fault.
>
> This seems to be a regression somewhere between 6.3 and 7.1. The Ar
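(For anyone wanting to reproduce the comparison, the usual quick
sequential-write test -- a sketch; file name and size are arbitrary:

  # write 1 GB of zeros and note the throughput dd reports
  dd if=/dev/zero of=/tmp/ddtest bs=1m count=1024
  rm /tmp/ddtest
)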