Hi.

   v440, S10U2 + patches

OS and Kernel Version: SunOS XXXXX 5.10 Generic_118833-20 sun4u sparc 
SUNW,Sun-Fire-V440

NFS server with ZFS as local storage.


We were rsyncing a UFS filesystem to a ZFS filesystem exported over NFS. After 
some time the server exporting ZFS over NFS became unresponsive. The operator 
decided to force a panic and reboot the server. Further examination showed that 
the system was paging heavily, probably due to ZFS, as no other services are 
running there.

I have just had another problem that looks similar to the last one.
I decided to put nfsd into the RT (real-time) class.

My guess is that ZFS uses all memory for its caches, after some time fails to 
free it, and forces the system into paging. This is BAD, really BAD.


More details on the previous problem:

bash-3.00# savecore /f3-1/
System dump time: Sat Sep 2 03:31:18 2006
Constructing namelist /f3-1//unix.0
Constructing corefile /f3-1//vmcore.0
100% done: 1043993 of 1043993 pages saved
bash-3.00# cd /f3-1/
bash-3.00#
bash-3.00# mdb 0
Loading modules: [ unix krtld genunix dtrace specfs ufs sd md ip sctp usba fcp 
fctl qlc ssd lofs zfs random logindmux ptm
cpc nfs ipc ]
> ::status
debugging crash dump vmcore.0 (64-bit) from XXXXXX
operating system: 5.10 Generic_118833-20 (sun4u)
panic message: sync initiated
dump content: kernel pages only
>
> ::spa
ADDR STATE NAME
0000060001271680 ACTIVE f3-1
0000060003bd4dc0 ACTIVE f3-2
>
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1016199              7939   98%
Anon                         4420                34    0%
Exec and libs                 736                 5    0%
Page cache                     36                 0    0%
Free (cachelist)             1962                15    0%
Free (freelist)             18338               143    2%

Total                     1041691              8138
Physical                  1024836              8006
>
> ::swapinfo
ADDR VNODE PAGES FREE NAME
00000600034ab5a0 600012ff8c0 1048763 1028489 /dev/md/dsk/d15
>


We were synchronizing a lot of small files over NFS and writing to f3-1/d611. I 
would say that with ZFS it is expected to be low on memory most of the time, 
but not to the point where the host starts paging.
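
As a rough cross-check of the ::memstat figures above, here is a minimal Python 
sketch that converts the page counts to MB and to a share of the total. It 
assumes an 8 KB base page size (usual on sun4u, but worth confirming with 
pagesize(1)); the names PAGE_SIZE, pages and to_mb are mine and purely 
illustrative.

```python
# Page counts copied from the ::memstat output above.
PAGE_SIZE = 8192  # bytes; assumed base page size on sun4u

pages = {
    "Kernel": 1016199,
    "Anon": 4420,
    "Exec and libs": 736,
    "Page cache": 36,
    "Free (cachelist)": 1962,
    "Free (freelist)": 18338,
}


def to_mb(npages, page_size=PAGE_SIZE):
    """Convert a page count to whole megabytes (truncated, not rounded)."""
    return npages * page_size // (1024 * 1024)


total = sum(pages.values())

# Map each category to (size in MB, percentage of all pages).
summary = {name: (to_mb(n), 100.0 * n / total) for name, n in pages.items()}

for name, (mb, pct) in summary.items():
    print(f"{name:<18}{mb:>8} MB {pct:6.1f}%")
```

With these inputs the kernel accounts for 7939 MB, all but about 2.4% of the 
pages, which is consistent with the 98% shown by mdb.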

bash-3.00# sar -g

SunOS XXXXX 5.10 Generic_118833-20 sun4u 09/02/2006

00:00:00 pgout/s ppgout/s pgfree/s pgscan/s %ufs_ipf
[...]
02:15:01 0.03 0.04 0.02 0.00 0.00
02:20:00 0.04 0.04 0.02 0.00 0.00
02:25:00 0.02 0.03 0.01 0.00 0.00
02:30:00 0.02 0.03 0.01 0.00 0.00
02:35:00 0.03 0.03 0.01 0.00 0.00
02:40:01 0.03 0.04 0.03 0.00 0.00
02:45:02 5.98 82.77 93.20 65115.59 0.00
03:39:28 unix restarts
03:40:00 0.35 0.61 0.61 0.00 60.00
03:45:00 0.03 0.06 0.06 0.00 0.00
03:50:00 0.02 0.03 0.02 0.00 0.00
03:55:00 0.02 0.02 0.02 0.00 0.00
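
To pick the interval where the page scanner took off out of a longer sar -g 
log, a small Python sketch along these lines may help. The sample rows are 
copied from the output above; the 200 pages/s threshold and the function name 
scan_spikes are my own assumptions, not an official limit or tool.

```python
# A few rows copied from the sar -g output above.
SAR_G = """\
00:00:00 pgout/s ppgout/s pgfree/s pgscan/s %ufs_ipf
02:35:00 0.03 0.03 0.01 0.00 0.00
02:40:01 0.03 0.04 0.03 0.00 0.00
02:45:02 5.98 82.77 93.20 65115.59 0.00
03:39:28 unix restarts
03:40:00 0.35 0.61 0.61 0.00 60.00
"""

SCAN_THRESHOLD = 200.0  # pgscan/s; assumed cut-off for "scanner is busy"


def scan_spikes(text, threshold=SCAN_THRESHOLD):
    """Return (time, pgscan/s) for rows whose scan rate exceeds threshold."""
    spikes = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows are a timestamp followed by five numeric columns.
        if len(fields) != 6:
            continue  # blank lines, "unix restarts", banners
        try:
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # column header row
        pgscan = values[3]
        if pgscan > threshold:
            spikes.append((fields[0], pgscan))
    return spikes


spikes = scan_spikes(SAR_G)
print(spikes)
```

On the rows above this reports only the 02:45:02 sample, the last one recorded 
before the forced panic.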

bash-3.00# sar -u

SunOS XXXX 5.10 Generic_118833-20 sun4u 09/02/2006

00:00:00 %usr %sys %wio %idle
[...]
02:00:00 0 1 0 99
02:05:00 0 1 0 99
02:10:00 0 1 0 99
02:15:01 0 1 0 99
02:20:00 0 15 0 85
02:25:00 0 34 0 66
02:30:00 0 20 0 80
02:35:00 0 22 0 78
02:40:01 0 45 0 55
02:45:02 0 61 0 38
03:39:28 unix restarts
03:40:00 5 10 0 84
03:45:00 1 1 0 98
03:50:00 0 0 0 100

bash-3.00# sar -q

SunOS xxx 5.10 Generic_118833-20 sun4u 09/02/2006

00:00:00 runq-sz %runocc swpq-sz %swpocc
[...]
02:00:00 0.0 0 0.0 0
02:05:00 1.0 0 0.0 0
02:10:00 0.0 0 0.0 0
02:15:01 0.0 0 0.0 0
02:20:00 1.1 5 0.0 0
02:25:00 1.4 12 0.0 0
02:30:00 2.1 6 0.0 0
02:35:00 3.4 9 0.0 0
02:40:01 2.8 25 0.0 0
02:45:02 4.0 44 116.6 12
03:39:28 unix restarts
03:40:00 1.0 3 0.0 0
03:45:00 0.0 0 0.0 0
03:50:00 0.0 0 0.0 0


The crash dump can be provided off-list, but not for public eyes.
 
 