Tomas,
There are a couple of things going on here:
1. There is a lot of fragmentation in your metadata caches (znode,
dnode, dbuf, etc.). This is burning up about 300MB of space in your
hung kernel. This is a known problem that we are currently working
on.
2. While the ARC has set its desired size down to c_min (64MB), it's
actually still consuming ~800MB in the hung kernel. This is odd.
The bulk of this space is in the 32K and 64K data caches. Could
you print out the contents of ARC_anon, ARC_mru, ARC_mfu, ARC_mru_ghost,
and ARC_mfu_ghost?
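Something along these lines in mdb against the dump should do it (just a
sketch; it assumes CTF data for the zfs module is available and that the
state structures are of type arc_state_t, which may differ slightly by
build):

ARC_anon::print struct arc_state
ARC_mru::print struct arc_state
ARC_mru_ghost::print struct arc_state
ARC_mfu::print struct arc_state
ARC_mfu_ghost::print struct arc_state

The per-state size fields should tell us where that ~800MB is actually
sitting.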
-Mark
Tomas Ögren wrote:
Hello.
Having some hangs on an snv53 machine which is quite probably ZFS+NFS
related, since that's all the machine does ;)
The machine is a 2x750MHz Blade1000 with 2GB ram, using a SysKonnect
9821 GigE card (with their 8.19.1.3 skge driver) and two HP branded MPT
SCSI cards. Normal load is pretty much "read all you can" with misc
tarballs and isos since it's an NFS backend to our caching http/ftp
cluster delivering free software to the world.
What happens is that the machine just stops responding. It keeps answering
ping for a while (while userland, including the console, is unresponsive),
but after a while that stops too.
Produced a panic to get a dump and tried ::memstat;
unterweser:/scratch/070103# mdb unix.0 vmcore.0
Loading modules: [ unix krtld genunix specfs dtrace ufs scsi_vhci pcisch
ssd fcp fctl qlc md ip hook neti sctp arp usba s1394 nca lofs zfs random
sd nfs ptm cpc ]
::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     250919              1960   98%
Anon                          888                 6    0%
Exec and libs                 247                 1    0%
Page cache                     38                 0    0%
Free (cachelist)              405                 3    0%
Free (freelist)              4370                34    2%

Total                      256867              2006
Physical                   253028              1976
That doesn't seem too healthy to me; something in the kernel is probably
eating up everything and the machine is just swapping to death or something.
A dump from the live kernel with mdb -k after 1.5h uptime;
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     212310              1658   83%
Anon                        11307                88    4%
Exec and libs                2418                18    1%
Page cache                  18400               143    7%
Free (cachelist)             4383                34    2%
Free (freelist)              8049                62    3%
The tweaks I have are:
set ncsize = 500000
set nfs:nrnode = 50
set zfs:zil_disable=1
set zfs:zfs_vdev_cache_bshift=14
set zfs:zfs_vdev_cache_size=0
According to ::kmem_cache, the relevant caches look roughly like this
(cache address, name, flags, buffer size, total buffers):
0000030002e30008 dmu_buf_impl_t    0000 000000     328   487728
0000030002e30288 dnode_t           0000 000000     640   453204
0000030002e30508 arc_buf_hdr_t     0000 000000     144   103544
0000030002e30788 arc_buf_t         0000 000000      40    36743
0000030002e30a08 zil_lwb_cache     0000 000000     200        0
0000030002e30c88 zfs_znode_cache   0000 000000     200   453200
but those buffers alone add up to about 550MB.
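Rough math, buffer size times buffer count for each cache (slab and
fragmentation overhead not included):

  dmu_buf_impl_t     328 * 487728 = 159974784
  dnode_t            640 * 453204 = 290050560
  arc_buf_hdr_t      144 * 103544 =  14910336
  arc_buf_t           40 *  36743 =   1469720
  zfs_znode_cache    200 * 453200 =  90640000
                            total = 557045400 bytes, i.e. roughly the
                                    550MB above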
dnlc_nentries on the hung kernel has gone down to 15000.. (where are the
rest of the ~450k-15k dnodes/znodes hanging out?)
Hung kernel:
arc::print
{
anon = ARC_anon
mru = ARC_mru
mru_ghost = ARC_mru_ghost
mfu = ARC_mfu
mfu_ghost = ARC_mfu_ghost
size = 0x358a0600
p = 0x4000000
c = 0x4000000
c_min = 0x4000000
c_max = 0x5e114800
hits = 0xbc860fd
misses = 0x2f296e1
deleted = 0x1d88739
recycle_miss = 0xf7f30c
mutex_miss = 0x24b13d
evict_skip = 0x21501d02
hash_elements = 0x27f97
hash_elements_max = 0x27f97
hash_collisions = 0x1651b43
hash_chains = 0x7ac3
hash_chain_max = 0x12
no_grow = 0x1
}
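(Converting those: size 0x358a0600 is about 856MB, while c = c_min =
0x4000000 = 64MB and c_max = 0x5e114800 is about 1.5GB, so the ARC is
sitting at more than thirteen times its target.)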
Live kernel:
arc::print
{
anon = ARC_anon
mru = ARC_mru
mru_ghost = ARC_mru_ghost
mfu = ARC_mfu
mfu_ghost = ARC_mfu_ghost
size = 0x1b279400
p = 0x1a1dcaa4
c = 0x1a1dcaa4
c_min = 0x4000000
c_max = 0x5e114800
hits = 0xef7c96
misses = 0x25efa8
deleted = 0x1db537
recycle_miss = 0xa6221
mutex_miss = 0x12b59
evict_skip = 0x70d62b
hash_elements = 0xcda1
hash_elements_max = 0x1b589
hash_collisions = 0x18e58a
hash_chains = 0x3d16
hash_chain_max = 0xf
no_grow = 0x1
}
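(Here, by contrast, size 0x1b279400 is about 434MB against c = p =
0x1a1dcaa4, about 418MB, so on the live kernel the ARC is at least close
to its target.)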
Should I post the full ::kmem_cache and/or ::kmastat output somewhere? It's
about 2*(20+30)kB.
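If it helps, I can capture them non-interactively with something like this
(file names are just examples):

echo ::kmem_cache | mdb -k > kmem_cache.live.txt
echo ::kmastat | mdb unix.0 vmcore.0 > kmastat.crash.txt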
/Tomas