Re: deadlock or bad disk ? RELENG_8

Mike Tancsa Sun, 18 Jul 2010 20:02:12 -0700

At 10:34 PM 7/18/2010, Jeremy Chadwick wrote:

On Sun, Jul 18, 2010 at 05:42:14PM -0400, Mike Tancsa wrote:
> At 05:14 PM 7/18/2010, Jeremy Chadwick wrote:
>
> >Where exactly is your swap partition?
>
> On one of the areca raidsets.
>
> # swapctl -l
> Device:       1024-blocks     Used:
> /dev/da0s1b    10485760       108


So is da0 actually a RAID volume "behind the scenes" on the Areca
controller?  How many disks are involved in that set?


yes, da0 is a RAID volume with 4 disks behind the scenes.

Well, the thread I linked you stated that the problem has to do with a
controller or disk "taking too long".  I have no idea what the threshold
is.  I suppose it could also indicate that your system is (possibly)
running low on resources (RAM); I would imagine swap_pager would get
called if a process needed to be offloaded to swap.  So maybe this is
a system tuning thing more than a hardware thing.

Prior to someone rebooting it, it had been stuck in this state for a
good 90 min. Apart from upgrading to a later RELENG_8 to get the
security patches, the machine had been running a few versions of
RELENG_8 doing the same workloads every week without issue.

/boot/loader.conf has

ahci_load="YES"
siis_load="YES"

sysctl.conf has

net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=131072
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.sendspace=32768
net.inet.udp.recvspace=65536
kern.ipc.somaxconn=1024
kern.ipc.maxsockbuf=4194304
net.inet.ip.redirect=0
net.inet.ip.intr_queue_maxlen=4096
net.route.netisr_maxqlen=1024
kern.ipc.nmbclusters=131072
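
One way to rule out tuning drift is to confirm that the running kernel actually holds the values listed in sysctl.conf. The following is only a minimal sketch: the helper names (parse_sysctl_conf, live_value, find_mismatches) are made up for illustration, and it assumes `sysctl -n <name>` prints the bare value in the same form as it is written in the file.

```python
import subprocess


def parse_sysctl_conf(text):
    """Parse name=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def live_value(name):
    """Ask the running kernel for one value via `sysctl -n` (hypothetical helper)."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def find_mismatches(settings, lookup=live_value):
    """Return {name: (configured, running)} for every value that differs."""
    mismatches = {}
    for name, want in settings.items():
        have = lookup(name)
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


# Demo with a stand-in lookup so the sketch runs without touching sysctl.
sample = parse_sysctl_conf("kern.ipc.somaxconn=1024\nkern.ipc.maxsockbuf=4194304\n")
pretend_kernel = {"kern.ipc.somaxconn": "128", "kern.ipc.maxsockbuf": "4194304"}
print(find_mismatches(sample, lookup=pretend_kernel.get))
```

On the box itself one would pass the contents of /etc/sysctl.conf to parse_sysctl_conf and call find_mismatches with the default lookup.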

I do track some basic mem stats via rrd. Looking at the graphs up to
that period, nothing unusual was happening.


CPU: 16.6% user,  0.0% nice,  4.3% system,  0.2% interrupt, 78.8% idle
Mem: 443M Active, 5707M Inact, 1462M Wired, 147M Cache, 828M Buf, 166M Free
Swap: 10G Total, 124K Used, 10G Free
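
To put numbers on the "low on RAM" question, the Mem line above can be totalled up. This is a rough sketch only: parse_mem_line is a hypothetical helper, and treating Inact + Cache + Free as memory the system could hand out without swapping is a simplification of how the VM really behaves.

```python
import re

# Convert top(1) size suffixes to MiB.
UNITS = {"K": 1 / 1024, "M": 1, "G": 1024}


def parse_mem_line(line):
    """Return e.g. {'Active': 443, ...} in MiB from a top(1) Mem: or Swap: line."""
    fields = {}
    for amount, unit, label in re.findall(r"(\d+)([KMG])\s+(\w+)", line):
        fields[label] = int(amount) * UNITS[unit]
    return fields


mem = parse_mem_line(
    "Mem: 443M Active, 5707M Inact, 1462M Wired, 147M Cache, 828M Buf, 166M Free"
)
swap = parse_mem_line("Swap: 10G Total, 124K Used, 10G Free")

# Simplifying assumption: these three buckets are available without swapping.
reclaimable = mem["Inact"] + mem["Cache"] + mem["Free"]
print(f"about {reclaimable}M inactive, cached or free")
print(f"swap used: {swap['Used'] * 1024:.0f}K of {swap['Total']}M")
```

Under that assumption the snapshot shows several gigabytes available and next to no swap in use.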

>  smartctl -a -d 3ware,1 /dev/twa0

Now I'm confused -- this indicates twa(4) is involved, not arcmsr(4).

The other controllers (3ware and onboard ICH in AHCI mode) provide
other storage on the same box. I only noted them in that I checked
all their disks for errors, of which there were none either. The dmesg
from the original post enumerates all the devices on the box.


        ---Mike


--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            m...@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
