On Friday 29 August 2014 11:54:42 Alan Cox wrote:
snip...
> > Others have also confirmed that even with r265945 they can still
> > trigger the performance issue.
> >
> > In addition, without it we still have loads of RAM sat there unused;
> > in my particular experience we have 40GB of 192GB sitting there
> > unused, and that was with a stable build from last weekend.
>
> The Solaris code only imposed this limit on 32-bit machines where the
> available kernel virtual address space may be much less than the
> available physical memory. Previously, FreeBSD imposed this limit on
> both 32-bit and 64-bit machines. Now, it imposes it on neither. Why
> continue to do this differently from Solaris?
My understanding is that these limits were totally different on Solaris;
see the #ifdef sun block in arc_reclaim_needed() for details. I actually
started out matching the Solaris flow, but that had already been tested
and proved not to work as well as the current design.
Since the question was asked below: we don't have zfs machines in the
cluster running i386. We can barely get them to boot as it is due to
KVA pressure; we have to reduce/cap physical memory and change the
user/kernel virtual split from 3:1 to 2.5:1.5.

We do run zfs on small amd64 machines with 2G of RAM, but I can't
imagine it working on the 10G i386 PAE machines that we have.
> > With the patch we confirmed that both RAM usage and performance for
> > those seeing that issue are resolved, with no reported regressions.
> >
> >> (I should know better than to fire a reply off before full fact
> >> checking, but this commit worries me..)
> >
> > Not a problem, it's great to know people pay attention to changes and
> > raise their concerns. Always better to have a discussion about
> > potential issues than to wait for a problem to occur.
> >
> > Hopefully the above gives you some peace of mind, but if you still
> > have any concerns I'm all ears.
>
> You didn't really address Peter's initial technical issue. Peter
> correctly observed that cache pages are just another flavor of free
> pages. Whenever the VM system is checking the number of free pages
> against any of the thresholds, it always uses the sum of v_cache_count
> and v_free_count. So, to anyone familiar with the VM system, like
> Peter, what you've done, which is to derive a threshold from
> v_free_target but only compare v_free_count to that threshold, looks
> highly suspect.
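
For reference, the pattern being described can be sketched as follows.
This is a self-contained illustration, not the kernel code itself: the
struct, the helper, and the numbers are invented for the example, and
only the field names mirror the vm_cnt counters named above.

#include <stdio.h>

/*
 * Illustrative stand-in for the relevant vm_cnt fields; the real
 * counters live in the kernel's struct vmmeter.
 */
struct vm_counters {
        unsigned int v_free_count;      /* pages on the free queues */
        unsigned int v_cache_count;     /* clean pages, immediately reusable */
        unsigned int v_free_target;     /* pagedaemon's free-page target */
};

/*
 * Sketch of the convention Alan describes: a "short of free pages?"
 * test compares the threshold against the sum of free and cache pages,
 * never against v_free_count alone.
 */
static int
below_free_target(const struct vm_counters *vc)
{
        return (vc->v_cache_count + vc->v_free_count < vc->v_free_target);
}

int
main(void)
{
        struct vm_counters vc = {
                .v_free_count = 20000,
                .v_cache_count = 60000,
                .v_free_target = 40000,
        };

        /* Plenty of cache pages, so this test reports no shortage. */
        printf("below free target: %s\n",
            below_free_target(&vc) ? "yes" : "no");
        return (0);
}

The kmem_free_count() change proposed below follows the same convention.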
I think I'd like to see something like this:
Index: cddl/compat/opensolaris/kern/opensolaris_kmem.c
===================================================================
--- cddl/compat/opensolaris/kern/opensolaris_kmem.c	(revision 270824)
+++ cddl/compat/opensolaris/kern/opensolaris_kmem.c	(working copy)
@@ -152,7 +152,8 @@
 u_int
 kmem_free_count(void)
 {
-	return (vm_cnt.v_free_count);
+	/* "cache" is just a flavor of free pages in FreeBSD */
+	return (vm_cnt.v_free_count + vm_cnt.v_cache_count);
 }
 
 u_int
This has apparently already been tried and the response from Karl was:
- No, because memory in "cache" is subject to being either reallocated
- or freed.
-
- When I was developing this patch that was my first impression as well,
- and how I originally coded it, and it turned out to be wrong.
-
- The issue here is that you have two parts of the system contending for
- RAM -- the VM system generally, and the ARC cache. If the ARC cache
- frees space before the VM system activates and starts pruning then you
- wind up with the ARC pinned at the minimum after some period of time,
- because it releases "early."
I've asked him if he would retest just to be sure.
The rest of the system looks at the "big picture": it would be happy to
let the "free" pool run quite a way down so long as there are "cache"
pages available to satisfy the free space requirements. This would lead
ZFS to mistakenly sacrifice ARC for no reason. I'm not sure how big a
deal this is, but I can't imagine many scenarios where I want ARC to be
discarded in order to save some effectively free pages.
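
To make the disagreement concrete, here is a small standalone sketch
(not kernel code; the page counts are made-up values, and
arc_free_target is just a local name standing in for "a threshold
derived from v_free_target") of the two accounting choices being argued
over:

#include <stdio.h>

int
main(void)
{
        /* Made-up page counts for illustration only. */
        unsigned int v_free_count = 20000;      /* truly free pages */
        unsigned int v_cache_count = 60000;     /* clean "cache" pages */
        unsigned int arc_free_target = 40000;   /* derived from v_free_target */

        /* The change being discussed: only v_free_count is compared. */
        int trim_free_only = (v_free_count < arc_free_target);

        /* The suggested alternative: treat cache pages as free as well. */
        int trim_free_plus_cache =
            (v_free_count + v_cache_count < arc_free_target);

        printf("free-only check:  %s the ARC\n",
            trim_free_only ? "trim" : "keep");
        printf("free+cache check: %s the ARC\n",
            trim_free_plus_cache ? "trim" : "keep");
        return (0);
}

With these numbers the free-only check trims the ARC even though 60000
clean cache pages could be reclaimed almost for free, which is the
"mistakenly sacrifice ARC" case described above.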
From Karl's response in the original PR (above) it seems like this
causes unexpected behaviour due to the two systems being separate.
> That said, I can easily believe that your patch works better than the
> existing code, because it is closer in spirit to my interpretation of
> what the Solaris code does. Specifically, I believe that the Solaris
> code starts trimming the ARC before the Solaris page daemon starts
> writing dirty pages to secondary storage. Now, you've made FreeBSD do
> the same. However, you've expressed it in a way that looks broken.
>
> To wrap up, I think that you can easily write this in a way that
> simultaneously behaves like Solaris and doesn't look wrong to a VM
> expert.
>
> > Out of interest would it be possible to update machines in the
> > cluster to see how their workload reacts to the change?
> >
I'd like to see the free vs cache thing resolved first, but it's going
to be tricky to get a comparison.

Does Karl's explanation above as to why this doesn't work change your
mind?
For the first few months of the year, things were really troublesome;
it was quite easy to overtax the machines and run them into the ground.
This is not the case now: things are working pretty well under pressure
(prior to the commit). It's got to the point that we feel comfortable
thrashing the machines really hard again. Getting a comparison when it
already works well is going to be tricky.

We don't have large-memory machines that aren't already tuned with
vfs.zfs.arc_max caps for tmpfs use.
For context to the wider audience: we do not run -release or -pN in the
FreeBSD cluster. We mostly run -current, and some -stable. I am well
aware that there is significant discomfort in 10.0-R with zfs, but we
already have the fixes for that.