Hi Ben,
Ben Rockwood wrote:
m...@bruningsystems.com wrote:
Hi Ben,
Ben Rockwood wrote:
I'm curious as to why memory statistics seem so difficult to get
accurate. If you use kstats, mdb ::memstat, and add up VSZ/RSS
from ps, you get numbers that are different, although close.
Can anyone shed some light on why this is? I've assumed that ::memstat
is the most accurate measure and I'm comparing my numbers against it,
but perhaps it is not the best validation?
Have you tried getting the numbers on a crash dump? If you are
doing this on a running system, I would expect the numbers to
fluctuate.
No, I'm not interested in getting so exact that the numbers don't matter
because the system isn't running. :)
Here is an example:
::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     377207              1473    4%
Anon                        37794               147    0%
Exec and libs                5055                19    0%
Page cache                  12200                47    0%
Free (cachelist)            18828                73    0%
Free (freelist)           7934179             30992   95%
Total                     8385263             32754
Physical                  8385262             32754
zfs:0:arcstats:size 16504880 <--- 16,504,880 bytes
# kstat -p | grep -i system_pages
unix:0:system_pages:availrmem 8005191
unix:0:system_pages:class pages
unix:0:system_pages:crtime 0
unix:0:system_pages:desfree 65509
unix:0:system_pages:desscan 25
unix:0:system_pages:econtig 4224880640
unix:0:system_pages:fastscan 1681006
unix:0:system_pages:freemem 7954235
unix:0:system_pages:kernelbase 0
unix:0:system_pages:lotsfree 131019
unix:0:system_pages:minfree 32754
unix:0:system_pages:nalloc 25203900
unix:0:system_pages:nalloc_calls 14944
unix:0:system_pages:nfree 23916528
unix:0:system_pages:nfree_calls 9475
unix:0:system_pages:nscan 0
unix:0:system_pages:pagesfree 7954235 <--- 31,816,940k (31071 MB)
unix:0:system_pages:pageslocked 380071
unix:0:system_pages:pagestotal 8385262
unix:0:system_pages:physmem 8385263 <--- 33,541,052k (32754 MB)
unix:0:system_pages:pp_kernel 379926 <--- 1,519,704k (1484 MB)
unix:0:system_pages:slowscan 100
unix:0:system_pages:snaptime 2181598.72132125
So if we look at pages for Kernel, kstat pp_kernel says 379926 but
::memstat says 377207. Of free, kstat pagesfree says 7954235 while
::memstat says 7934179. The numbers are very close (within about 100MB
on a system with 32GB of memory) but not exact.
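The gaps described above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original exchange: it assumes a 4096-byte page size (inferred from 8385263 pages being reported as 32754 MB), and the helper name pages_to_mb is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumption inferred from 8385263 pages ~= 32754 MB


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to megabytes (2**20 bytes)."""
    return pages * page_size / (1024 * 1024)


# Values copied from the kstat and ::memstat output quoted above.
kstat = {"pp_kernel": 379926, "pagesfree": 7954235}
memstat = {"kernel": 377207, "free_freelist": 7934179}

kernel_gap = kstat["pp_kernel"] - memstat["kernel"]
free_gap = kstat["pagesfree"] - memstat["free_freelist"]

for label, gap in (("kernel", kernel_gap), ("free", free_gap)):
    print(f"{label}: {gap} pages = {pages_to_mb(gap):.1f} MB")
```

Under that page-size assumption the two gaps come to roughly 11 MB and 78 MB, which is consistent with the "within about 100MB" estimate.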
The pagesfree value that kstats reports is the same variable used by
::memstat for free memory. Both of these are the freemem variable.
freemem is updated every clock tick.
If you try to work out "used" memory by, for instance, taking total mem,
subtracting free mem, then dividing it between kernel and user, you
similarly get inconsistent numbers depending on where you get your
numbers (::memstat vs. kstat vs. adding up ps RSS numbers).
Based on the way ::memstat works, it should be the most accurate value.
However, it makes multiple passes through all pages of pageable memory,
so things can change during the passes. In fact, this could happen even
if it only made a single pass. What I find a little more interesting is
the following.
First, ::memstat output...
::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     110120               430   21%
Anon                       192334               751   37%
Exec and libs               26664               104    5%
Page cache                   2254                 8    0%
Free (cachelist)            32993               128    6%
Free (freelist)            157473               615   30%
Total                      521838              2038
Physical                   521837              2038
memstat uses the page walker to dump everything up to Free (freelist).
Here is a check on the page list for free pages
::walk page | ::print page_t p_state ! egrep '80|90|c0|a0' | wc
32985 98955 494775 <-- 32985 free pages (almost exact with Free (cachelist))
So, how many pages are on the page list...
::walk page ! wc <-- this is the walker ::memstat uses to examine pages
367695 367695 3309255 <-- so 367695 page_t (i.e., pageable pages)
however,
physmem::print -d <-- variable used for Total in ::memstat
0t521838 <-- but 521838 pages of physical memory
So, where are the 154143 pages? (This is about 600MB of memory on my
2GB machine).
It turns out that the page walker only walks pages that are "hashed",
i.e., have vnode/offset identity. A page that does not have an identity
is not listed. (To see all pages, you can use ::memseg_list and go from
there... left as an exercise. I've done this and now get a total number
of pages in agreement with physmem.)
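The size of that gap can be verified directly from the two counts quoted above. This is only a sketch of the arithmetic, again assuming 4096-byte pages (521838 pages reported as 2038 MB suggests this); the variable names are illustrative.

```python
PAGE_SIZE = 4096  # bytes; assumption inferred from 521838 pages ~= 2038 MB

walked_pages = 367695   # lines printed by "::walk page ! wc"
physmem_pages = 521838  # value of the physmem variable

# Pages that the page walker did not visit.
missing_pages = physmem_pages - walked_pages
missing_mb = missing_pages * PAGE_SIZE / (1024 * 1024)

print(f"{missing_pages} pages not walked, about {missing_mb:.0f} MB")
```

This reproduces the 154143 pages mentioned above, which works out to a little over 600 MB.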
To see pages used by the kernel (not counting zfs), you can do:
::walk page | ::print page_t p_vnode !grep kvp | wc
116728 350184 1634192 <-- so 116728 kernel pages (note that this was
                      <-- done a while after the ::memstat above)
Pages used for zfs data may use the zvp vnode (but not on my build), so
you can use the above walker and substitute zvp for kvp.
I assume you are using something like "ps -e -o rss,comm" to dump RSS
numbers for processes. Remember that many of the pages are shared between
processes, so adding up RSS counts those pages more than once.
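To make the shared-page point concrete, here is a toy illustration. The process names and page numbers are entirely hypothetical, and this is not how ps computes RSS; it only shows why a sum of per-process resident counts can exceed the number of distinct physical pages in use.

```python
# Hypothetical processes and the physical page numbers each has resident.
# Pages 1, 2 and 3 stand in for a shared library mapped by all three.
resident = {
    "sshd": {1, 2, 3, 10, 11},
    "bash": {1, 2, 3, 20},
    "vi": {1, 2, 3, 30, 31, 32},
}

# Summing RSS counts every shared page once per process that maps it.
rss_sum = sum(len(pages) for pages in resident.values())

# Counting distinct pages counts each physical page exactly once.
unique_pages = len(set().union(*resident.values()))

print(f"sum of RSS: {rss_sum} pages, distinct pages: {unique_pages}")
```

In this made-up case the RSS total is 15 pages while only 9 distinct pages are in use, so the RSS sum overstates memory use.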
This only really causes trouble because if you try to offer a sysadmin a
breakdown of memory, it is:
1) Approximate, within maybe 5% of reality
2) Necessarily pulled from a single source (kstat?), or the numbers don't
even line up approximately
And the problem is that this is hard for an end user to swallow... if
they attempt to check your math they'll see it's not right and discard
the data as useless.
Thus, perhaps the best takeaway is that this is why there is no Solaris
"memstat" tool for the CLI short of using mdb, and that the more
accurate observations are based on how memory is changing
(vmstat/mpstat) rather than absolutely what it is at any given point
(::memstat).
This I agree with.
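As a rough sketch of the "watch how it changes" approach, the snippet below parses two snapshots in "kstat -p" format and reports the per-statistic change. The first snapshot uses values from the output earlier in this thread; the second snapshot is invented for illustration, and parse_kstat is a hypothetical helper, not an existing tool.

```python
def parse_kstat(text):
    """Parse 'kstat -p' style lines into {name: int}, skipping non-integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            stats[name] = int(value)
        except ValueError:
            pass  # e.g. "class pages" or floating-point timestamps
    return stats


# First sample: values taken from the kstat output quoted in this thread.
sample_1 = """\
unix:0:system_pages:class pages
unix:0:system_pages:freemem 7954235
unix:0:system_pages:pp_kernel 379926
"""

# Second sample: hypothetical values, made up to show a change.
sample_2 = """\
unix:0:system_pages:class pages
unix:0:system_pages:freemem 7953000
unix:0:system_pages:pp_kernel 380100
"""

before, after = parse_kstat(sample_1), parse_kstat(sample_2)
deltas = {name: after[name] - before[name] for name in before if name in after}

for name, delta in sorted(deltas.items()):
    print(f"{name} {delta:+d} pages")
```

Because both snapshots come from the same source, the deltas are comparable even if the absolute values disagree with ::memstat.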
Am I accurate here or out to lunch?
uhhh... yes(?)
max
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org