Tomas,
comments inline...
Tomas Ögren wrote:
On 10 November, 2006 - Sanjeev Bagewadi sent me these 3,5K bytes:
1. DNLC-through-ZFS doesn't seem to listen to ncsize.
The filesystem currently has ~550k inodes and large portions of it are
frequently looked over with rsync (over NFS). mdb said ncsize was about
68k and vmstat -s said we had a hit rate of ~30%, so I set ncsize to
600k and rebooted. Didn't seem to change much; still seeing hit rates
at about the same level, and a manual find(1) doesn't seem to be cached
much (according to vmstat and dnlcsnoop.d).
When booting, the following messages came up; not sure if they matter
or not:
NOTICE: setting nrnode to max value of 351642
NOTICE: setting nrnode to max value of 235577
Is there a separate ZFS DNLC knob to adjust for this? A wild guess is
that it has its own implementation, integrated with the rest of the
ZFS cache, which throws out metadata cache in favour of data cache..
or something..
Current memory usage (for some values of usage ;):
# echo ::memstat|mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                      95584               746   75%
Anon                        20868               163   16%
Exec and libs                1703                13    1%
Page cache                   1007                 7    1%
Free (cachelist)               97                 0    0%
Free (freelist)              7745                60    6%
Total                      127004               992
Physical                   125192               978
/Tomas
This memory usage shows nearly all of memory consumed by the kernel,
and probably by ZFS. ZFS can't add any more DNLC entries, for lack of
memory, without purging others. This can be seen from dnlc_nentries
being far smaller than ncsize.
I don't know if there is a DMU or ARC bug filed to reduce the memory
footprint of their internal structures for situations like this, but
we are aware of the issue.
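For reference, a quick way to compare those two counters directly
(both are plain kernel variables, so mdb can print them):

# echo "ncsize/D" | mdb -k
# echo "dnlc_nentries/D" | mdb -k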
Can you please check the zio buffers and the ARC status? Here is how
you can do it:
- Start mdb: mdb -k
- Run: ::kmem_cache
- In the output, check the amount consumed by the zio_buf_*, arc_buf_t
and arc_buf_hdr_t caches (a quick way to total the zio_buf_* ones is
sketched after the output below).
ADDR NAME FLAG CFLAG BUFSIZE BUFTOTL
0000030002640a08 zio_buf_512 0000 020000 512 102675
0000030002640c88 zio_buf_1024 0200 020000 1024 48
0000030002640f08 zio_buf_1536 0200 020000 1536 70
0000030002641188 zio_buf_2048 0200 020000 2048 16
0000030002641408 zio_buf_2560 0200 020000 2560 9
0000030002641688 zio_buf_3072 0200 020000 3072 16
0000030002641908 zio_buf_3584 0200 020000 3584 18
0000030002641b88 zio_buf_4096 0200 020000 4096 12
0000030002668008 zio_buf_5120 0200 020000 5120 32
0000030002668288 zio_buf_6144 0200 020000 6144 8
0000030002668508 zio_buf_7168 0200 020000 7168 1032
0000030002668788 zio_buf_8192 0200 020000 8192 8
0000030002668a08 zio_buf_10240 0200 020000 10240 8
0000030002668c88 zio_buf_12288 0200 020000 12288 4
0000030002668f08 zio_buf_14336 0200 020000 14336 468
0000030002669188 zio_buf_16384 0200 020000 16384 3326
0000030002669408 zio_buf_20480 0200 020000 20480 16
0000030002669688 zio_buf_24576 0200 020000 24576 3
0000030002669908 zio_buf_28672 0200 020000 28672 12
0000030002669b88 zio_buf_32768 0200 020000 32768 1935
000003000266c008 zio_buf_40960 0200 020000 40960 13
000003000266c288 zio_buf_49152 0200 020000 49152 9
000003000266c508 zio_buf_57344 0200 020000 57344 7
000003000266c788 zio_buf_65536 0200 020000 65536 3272
000003000266ca08 zio_buf_73728 0200 020000 73728 10
000003000266cc88 zio_buf_81920 0200 020000 81920 7
000003000266cf08 zio_buf_90112 0200 020000 90112 5
000003000266d188 zio_buf_98304 0200 020000 98304 7
000003000266d408 zio_buf_106496 0200 020000 106496 12
000003000266d688 zio_buf_114688 0200 020000 114688 6
000003000266d908 zio_buf_122880 0200 020000 122880 5
000003000266db88 zio_buf_131072 0200 020000 131072 92
0000030002670508 arc_buf_hdr_t 0000 000000 128 11970
0000030002670788 arc_buf_t 0000 000000 40 7308
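To get a rough total for the zio_buf_* caches from that output, a small
awk pipeline along these lines should do (an approximation only:
BUFTOTL counts buffers allocated to the cache, including ones sitting
free in the magazine layer, not just those in active use):

# echo ::kmem_cache | mdb -k | \
    awk '/zio_buf_/ { sum += $5 * $6 } END { printf "%.0f MB\n", sum / 1048576 }'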
- Dump the values of the arc structure:
arc::print struct arc
{
anon = ARC_anon
mru = ARC_mru
mru_ghost = ARC_mru_ghost
mfu = ARC_mfu
mfu_ghost = ARC_mfu_ghost
size = 0x6f7a400
p = 0x5d9bd5a
c = 0x5f6375a
c_min = 0x4000000
c_max = 0x2e82a000
hits = 0x40e0a15
misses = 0x1cec4a4
deleted = 0x1b0ba0d
skipped = 0x24ea64e13
hash_elements = 0x179d
hash_elements_max = 0x60bb
hash_collisions = 0x8dca3a
hash_chains = 0x391
hash_chain_max = 0x8
no_grow = 0x1
}
So, about 100MB and a memory crunch..
Interesting! So it is not the ARC which is consuming too much memory...
It is some other piece (not sure if it belongs to ZFS) which is causing
the crunch...
The other possibility is that the ARC ate up too much and caused a
near-crunch situation, and the kmem layer hit back and caused the ARC
to free up its buffers (hence the no_grow flag being set). So the ARC
could be oscillating between caching heavily and then purging its
caches.
You might want to keep track of these values (the ARC size and the
no_grow flag) and see how they change over a period of time. This
would help us understand the pattern.
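A minimal way to sample them would be a loop along these lines (the
values print in hex, as in the dump above; adjust the interval to
taste):

while true
do
    date
    echo "arc::print struct arc size no_grow" | mdb -k
    sleep 60
done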
And if we know it is the ARC which is causing the crunch, we could
manually change the value of c_max to a comfortable figure, and that
would limit the size of the ARC. However, I would suggest that you try
it out on a non-production machine first.
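For reference, here is a sketch of how that can be done live with mdb
in write mode. The address and the new value (256MB here) are purely
illustrative; use the address that ::print -a reports on your system:

# mdb -kw
> arc::print -a struct arc c_max
30002670a40 c_max = 0x2e82a000
> 30002670a40/Z 0x10000000

Note that a change made this way does not survive a reboot.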
By default, c_max is set to 75% of physmem, and that is the hard limit.
"c" is the soft limit, and the ARC will try to grow up to "c". The
value of "c" is adjusted when there is a need to cache more, but it
will never exceed "c_max".
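(That matches your dump above: c_max = 0x2e82a000 is about 744MB, which
is roughly 75% of the 978MB of physical memory in your ::memstat
output.)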
Regarding the huge number of reads, I am sure you have already tried
disabling the VDEV prefetch.
If not, it is worth a try.
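The usual knob for that (going by current OpenSolaris builds; please
verify the variable name against your build) is the vdev cache size,
set via /etc/system and effective after a reboot:

* Disable the ZFS vdev-level read-ahead cache
set zfs:zfs_vdev_cache_size = 0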
Thanks and regards,
Sanjeev.
--
Solaris Revenue Products Engineering,
India Engineering Center,
Sun Microsystems India Pvt Ltd.
Tel: x27521 +91 80 669 27521
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss