Sean Liu wrote:
David & Richard,
Thanks for continuing the discussion under a new topic.
I suppose the discussion board software had a little too much to drink last
night ;-)
First of all, David, I am not the Sean from SSHA, but I am in Toronto. I'll
send you a note shortly to have a discussion some time :-)
I can see you actually do understand my points and the history of the issue,
that's good.
And yes, I hate to be reactive and just wait for the scan rate to tell me that
I am out of memory.
Richard,
Before Solaris 8, vmstat freemem only showed the freelist, so after the UFS
cache took all of the free memory, the freelist would eventually shrink to
minfree. That's the behavior you talked about.
However, with Solaris 8, vmstat freemem shows freelist + cachelist, where
cachelist is the UFS cache, which is considered FREE because it gives memory
back to applications.
The same is true of the ARC, except that the mechanism for
returning memory to the free pool does not depend on the
page scanner.
<sidebar>
UFS has a relatively small write buffer and ZFS could consume
up to 1/8 of memory for writes (located in the ARC). There is
some question about whether 1/8 of memory is a good guess.
Anecdotal evidence suggests that there are cases where the 1/8
limit is too large when you have lots of memory and slow disks.
If a better rule can be defined, I'm sure it would be welcome in
the code.
</sidebar>
At this moment, ZFS does not have anything like cachelist to report how much
memory it can give back, so we are back to reactive mode again.
To a first order, and instantaneously, this would be, from kstat -n arcstats:
    c - c_min
This equation does not belong in the vmstat free list column.
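A minimal sketch of that calculation in Python, assuming the default
two-column layout of `kstat -n arcstats` output. The sample text and its
values are made up for illustration, and the helper names (`parse_arcstats`,
`arc_shrinkable_bytes`) are hypothetical, not part of any tool.

```python
# Sketch: estimate how far the ARC target could shrink, as c - c_min.
# SAMPLE_KSTAT is illustrative text, not output captured from a real system.
SAMPLE_KSTAT = """\
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               402653184
        c_max                           805306368
        c_min                           100663296
        crtime                          92.5
        size                            398458880
"""


def parse_arcstats(text):
    """Return {statistic: integer value} for every 'name value' line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Header lines have four fields; crtime/snaptime are not integers.
        if len(fields) == 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


def arc_shrinkable_bytes(stats):
    """First-order estimate of memory the ARC could give back: c - c_min."""
    return max(stats["c"] - stats["c_min"], 0)


if __name__ == "__main__":
    stats = parse_arcstats(SAMPLE_KSTAT)
    mib = arc_shrinkable_bytes(stats) / (1024 * 1024)
    print(f"ARC could shrink by roughly {mib:.0f} MiB")
```

On a live system the text would come from running `kstat -n arcstats` rather
than from a string, and the result is only a snapshot, since c moves as the
ARC adapts.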
Yes, there are other memory-related barriers such as the large pages you talked
about, but they don't cause drastic performance degradation the way
paging/swapping does. So we might consider them part of performance tuning
rather than capacity planning.
I think you will find a number of people who will disagree.
And yes, memory is getting cheaper and cheaper these days, but with server
consolidation and virtualization, the need for capacity planning is not really
decreasing.
...but by definition, virtualization makes capacity planning opaque :-)
Everything (or more things) is shared and resource control at the OS
layer may not be effective.
With ::memstat, yes, it may be possible to have a rough idea of free memory,
but as you said, it's quite intrusive and it requires admin privileges.
If, with DTrace or whatever tools are out there, we can have a clearly or
vaguely defined algorithm (such as the one developed by benr) to help find out
free memory again, it'll help customers a lot.
Just as a real-life example, consider two machines, one with UFS/VxFS, the
other with UFS/ZFS. The first one can, right or wrong, tell you the rough
percentage of memory in use, while the latter one cannot.
Also, here's the ::memstat output from my Ultra 5 (don't laugh, I am still
using it):
::memstat
Page Summary                Pages               MB %Tot
------------     ---------------- ---------------- ----
Kernel                      40416              315  32%
ZFS File Data               53697              419  42%
Anon                        23303              182  18%
Exec and libs                1108                8   1%
Page cache                   4333               33   3%
Free (cachelist)             3153               24   2%
Free (freelist)              2175               16   2%
Total                      128185             1001
Physical                   127490              996
Without ZFS, the two rows marked as Free will give me an idea of free memory,
but it's getting too vague with ZFS.
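To put numbers on that, here is a rough Python sketch that sums the two Free
rows from the ::memstat output above and sets them beside the ZFS File Data
row. The parsing regex and the names `parse_memstat` and `free_summary` are
hypothetical helpers for illustration, and the sketch makes no claim about
how much of the ZFS File Data could actually be reclaimed.

```python
import re

# The ::memstat rows quoted in this message.
MEMSTAT = """\
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 40416 315 32%
ZFS File Data 53697 419 42%
Anon 23303 182 18%
Exec and libs 1108 8 1%
Page cache 4333 33 3%
Free (cachelist) 3153 24 2%
Free (freelist) 2175 16 2%
Total 128185 1001
Physical 127490 996
"""

# Row label, page count, MB, and an optional percentage column.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<pages>\d+)\s+(?P<mb>\d+)(?:\s+\d+%)?$")


def parse_memstat(text):
    """Return {row label: page count}; header and rule lines are skipped."""
    pages = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            pages[match.group("name")] = int(match.group("pages"))
    return pages


def free_summary(pages):
    """Percent of total pages on the free lists and in ZFS File Data."""
    total = pages["Total"]
    free = pages.get("Free (cachelist)", 0) + pages.get("Free (freelist)", 0)
    zfs = pages.get("ZFS File Data", 0)
    return {"free_pct": 100.0 * free / total, "zfs_pct": 100.0 * zfs / total}


if __name__ == "__main__":
    summary = free_summary(parse_memstat(MEMSTAT))
    print(f"free lists: {summary['free_pct']:.1f}% of pages")
    print(f"ZFS file data: {summary['zfs_pct']:.1f}% of pages")
```

The free lists account for a few percent of the pages while ZFS File Data
holds a far larger share, which is why the Free rows alone say so little here.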
OK, but you have no idea of the efficiency of the UFS cache, so
the only thing you know is that if you completely eliminate the
UFS cache, performance will stink. And if you drop below
lotsfree, performance will stink. In other words, you don't really
know much wrt performance or capacity planning -- you can't
answer the question of how to size the UFS cache.
Since the ARC reports cache usage and you can limit its maximum
size, you can size it.
Many people take the approach of determining memory consumption
requirements rather than the what-is-free method. In the case of ZFS,
you can set the ARC usage policy on a per-file system basis, which
operates at a finer grain than UFS's directio. Once you determine
that the policy is correct, then you can begin to size the ARC needed
for your performance goals.
Bottom line: the vmstat free column is not sufficient for capacity planning
and has not been sufficient since SunOS 3.
-- richard
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org