Now this discussion is getting more and more interesting. So, useful or not, I suppose that with ZFS, freelist + cachelist + (c - c_min) would roughly correspond to what freelist + cachelist tells you without ZFS.
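To make that arithmetic concrete, here is a minimal sketch, not an existing tool. It assumes the ARC values arrive as `kstat -p` style text (`module:instance:name:statistic<TAB>value`), that `c` and `c_min` are in bytes, and that the page size is 8192 bytes, which is consistent with the totals in the ::memstat output quoted further down. The function names and the sample `c` / `c_min` values are made up for illustration; only the freelist and cachelist page counts come from that ::memstat output. As Richard notes, `c - c_min` is only a first-order, instantaneous figure.

```python
# Rough "reclaimable memory" estimate: freelist + cachelist + (c - c_min).
# Assumption: 8 KB pages (sun4u); adjust PAGE_SIZE for other platforms.
PAGE_SIZE = 8192


def parse_kstat_p(text):
    """Parse `kstat -p` style lines into {statistic_name: int_value}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        # Keep only the last field of module:instance:name:statistic.
        stats[key.rsplit(":", 1)[-1]] = int(value)
    return stats


def reclaimable_estimate(freelist_pages, cachelist_pages, arc_c, arc_c_min,
                         page_size=PAGE_SIZE):
    """Return the estimate in bytes; the ARC part is clamped at zero."""
    arc_slack = max(arc_c - arc_c_min, 0)
    return (freelist_pages + cachelist_pages) * page_size + arc_slack


# Hypothetical arcstats values (400 MB target size, 64 MB minimum).
sample_kstat = "zfs:0:arcstats:c\t419430400\nzfs:0:arcstats:c_min\t67108864\n"

arc = parse_kstat_p(sample_kstat)
# Page counts taken from the ::memstat output quoted below.
estimate = reclaimable_estimate(2175, 3153, arc["c"], arc["c_min"])
print("estimated reclaimable: %.1f MB" % (estimate / (1024.0 * 1024.0)))
```

On a live system the text could come from something like `kstat -p zfs:0:arcstats:c zfs:0:arcstats:c_min`; the sketch deliberately parses a string so it can be tried without a ZFS host.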
If this is true, can we at least expose it, either through vmstat freemem or through some other observability tool? Doing so would hardly increase any risk compared with the old UFS/VxFS/etc. days, and it would give capacity planners somewhat useful information. On the other hand, unlike CPU-related topics, I actually haven't seen many memory-related capacity planning topics.

I don't disagree with the "determining memory consumption requirements rather than the what-is-free" method; I do that myself too. However, once one has determined the memory consumption requirements, the next logical question is "do you have the free memory on your system to meet the requirements?"

Without ZFS, I can be quite confident in saying that if vmstat reports 8GB freemem and the requirement is 4GB, chances are we are fine. If the requirement is 6GB, we are probably fine. If the requirement is 8GB, we should either find a new box or buy some more memory. But if all I have is 1GB free, it'll be difficult for me to tell.

Sean

--- On Tue, 6/30/09, Richard Elling <richard.ell...@gmail.com> wrote:

> From: Richard Elling <richard.ell...@gmail.com>
> Subject: Re: [perf-discuss] perf-discuss Digest, Vol 48, Issue 17
> To: "Sean Liu" <seantem...@yahoo.com>
> Cc: perf-discuss@opensolaris.org
> Date: Tuesday, June 30, 2009, 4:31 PM
> Sean Liu wrote:
> > David & Richard,
> >
> > Thanks for continuing on the discussion under a new topic.
> > I suppose the discussion board software had a little too much to drink last night ;-)
> >
> > First of all - David I am not the Sean from SSHA, but I am in Toronto. I'll send you a note shortly to have a discussion some time :-)
> >
> > I can see you actually do understand my points, and the history of issue, that's good.
> > And yes I hate to be reactive and just wait for scan rate to tell me that I am out of memory.
> >
> > Richard,
> > Before Solaris 8, vmstat freemem only shows freelist, so after UFS cache took all of the free memory, freelist will eventually shrink to minfree. And that's the behavior you talked about.
> > However with Solaris 8, vmstat freemem shows freelist + cachelist, where cachelist is the UFS cache which is considered FREE because it gives back memory to applications.
> >
>
> The same is true of the ARC, except that the mechanism for returning memory to the free pool does not depend on the page scanner.
>
> <sidebar>
> UFS has a relatively small write buffer and ZFS could consume up to 1/8 of memory for writes (located in the ARC). There is some question about whether 1/8 of memory is a good guess. Anecdotal evidence suggests that there are cases where the 1/8 limit is too large when you have lots of memory and slow disks. If a better rule can be defined, I'm sure it would be welcome in the code.
> </sidebar>
>
> > At this moment, ZFS does not have anything like cachelist to report how much memory to give back so we are back to reactive mode again.
> >
>
> this would be, to the first order, instantaneously, from kstat -n arcstats
>     c - c_min
>
> This equation does not belong in the vmstat free list column.
>
> > Yes there are other memory related barriers such as the large pages you talked about, but they don't cause drastic performance degradation as paging/swapping does. So we might consider them in performance tuning area instead of capacity planning area.
> >
>
> I think you will find a number of people who will disagree.
>
> > And yes memory are cheaper and cheaper these days but with server consolidation and virtualization, the need for capacity planning is not really decreasing.
> >
>
> ...but by definition, virtualization makes capacity planning opaque :-)
> Everything (or more things) is shared and resource control at the OS layer may not be effective.
>
> > With ::memstat, yes it may be possible to have a rough idea of free memory but as you said it's quite intrusive and it requires admin privilege.
> > If, with dTrace or whatever tools out there we can have a clearly or vaguely defined algorithm (such as the one developed by benr) to help find out free memory again it'll help customers a lot.
> > Just as a real life example, consider two machines, one with UFS/Vxfs, the other with UFS/ZFS. The first one can, right or wrong, tell you the rough percentage memory in used, while the latter one can not.
> > Also here's the ::memstat output from ultra5 (don't laugh, I am still using it)
> >
> > > ::memstat
> > Page Summary                Pages                MB  %Tot
> > ------------     ----------------  ----------------  ----
> > Kernel                      40416               315   32%
> > ZFS File Data               53697               419   42%
> > Anon                        23303               182   18%
> > Exec and libs                1108                 8    1%
> > Page cache                   4333                33    3%
> > Free (cachelist)             3153                24    2%
> > Free (freelist)              2175                16    2%
> >
> > Total                      128185              1001
> > Physical                   127490               996
> >
> > Without ZFS, the two rows marked as Free will give me an idea of free memory, but it's getting too vague with ZFS.
> >
>
> OK, but you have no idea of the efficiency of the UFS cache, so the only thing you know is that if you completely eliminate the UFS cache, performance will stink. And if you drop below lotsfree, performance will stink. In other words, you don't really know much wrt performance or capacity planning -- you can't answer the question of how to size the UFS cache.
>
> Since the ARC reports cache usage and you can limit its maximum size, you can size it.
>
> Many people take the approach of determining memory consumption requirements rather than the what-is-free method. In the case of ZFS, you can set the ARC usage policy on a per-file system basis, which operates at a finer grain than UFS's directio.
> Once you determine that the policy is correct, then you can begin to size the ARC needed for your performance goals.
>
> Bottom line, vmstat free column is not sufficient for capacity planning and has not been sufficient since SunOS 3.
> -- richard
>
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org