On Thu, Jun 25, 2009 at 12:18:04PM -0700, Sean Liu wrote:
> Correct me if I am wrong. If the applications request memory from O/S,
> the kernel will cut back ZFS cache - so in this sense the cache is
> free, right?

Nope.  It doesn't work this way.  ZFS has a thread that runs
periodically, and is invoked in situations where ZFS needs to adjust its
own memory consumption.  If vm beancounters tell ZFS that its memory
usage may be high, it will try to release space.  Generally, it's
successful, but right now there isn't any deterministic way to figure
out how much space a reap operation will return.  This is different from
the freelists that UFS uses, where its pages are available for
use/reuse once they're on the list.
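
To make the "not deterministic" point concrete, here is a toy sketch in
Python. It is not the kmem or ZFS code; the names free_slots and reap, and
the idea of describing a cache as a list of per-slab live-object counts,
are assumptions made up for illustration. It assumes a slab can only be
handed back when no live objects remain in it.

```python
def free_slots(slabs, capacity):
    """Total unused object slots across all slabs of a (toy) cache."""
    return sum(capacity - live for live in slabs)


def reap(slabs):
    """Release every slab with no live objects.

    Returns (surviving_slabs, number_of_slabs_released).
    """
    kept = [live for live in slabs if live > 0]
    return kept, len(slabs) - len(kept)


# Two hypothetical caches, 8 objects per slab, same amount of unused space.
packed = [8, 8, 0, 0]      # live objects packed into two slabs
scattered = [4, 4, 4, 4]   # same number of live objects, spread out

print(free_slots(packed, 8), reap(packed)[1])
print(free_slots(scattered, 8), reap(scattered)[1])
```

Both toy caches have the same number of unused slots, but only the packed
one gives any slabs back. Counting unused space alone does not tell you
what a reap will return.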

> Yes ZFS cache is considered kernel memory, but as long as kernel will
> cut back its size and give back to applications, it'll be considered
> free.

If you take this approach, it's possible that we could report memory as
free that applications might not be able to allocate.  I'd rather take a
conservative approach and underreport instead of overreport.

> Yes ZFS cache is in use, but so is UFS filesystem cache.

Take a look at page_create_va() in vm_page.c.  This is the vm's routine
that allocates virtual pages for the operating system.  In its loop,
this routine examines the page freelists and cachelists.  It doesn't
look at any of the pages that ZFS uses, because their vmem is actually
allocated somewhere else and in use by the kmem caches.  Until the kmem
caches reap their slabs, and free the associated vmem, the operating
system isn't able to reallocate these as other pages.
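
A rough model of that distinction, again as a Python sketch rather than
anything taken from vm_page.c: the ToyVM class, its fields, and its
methods are hypothetical. The only behaviour it borrows from the
description above is that the allocator consults the freelist and
cachelist, and that pages held by kmem caches become visible only after a
reap hands them back.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """Toy page accounting; not the Solaris VM."""
    freelist: int    # pages that are free
    cachelist: int   # pages with cached file data, reusable immediately
    kmem_held: int   # pages backing kmem cache slabs (e.g. ZFS buffers)

    def allocatable(self):
        # The allocator only ever looks at these two lists.
        return self.freelist + self.cachelist

    def page_create(self, npages):
        """Try to allocate npages; kmem-held pages are never considered."""
        if npages > self.allocatable():
            return False
        from_free = min(npages, self.freelist)
        self.freelist -= from_free
        self.cachelist -= npages - from_free
        return True

    def reap(self, pages_returned):
        """Model a reap that happened to release pages_returned pages."""
        returned = min(pages_returned, self.kmem_held)
        self.kmem_held -= returned
        self.freelist += returned
        return returned


vm = ToyVM(freelist=10, cachelist=5, kmem_held=100)
print(vm.page_create(20))   # fails: only freelist + cachelist are counted
vm.reap(8)                  # the amount is known only after the fact
print(vm.page_create(20))   # now succeeds
```

In this model the 100 kmem-held pages never count toward what can be
allocated, no matter how much of that space is idle inside the caches.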

> I am not sure what you meant by "how much memory is wasted due to
> fragmentation", but if they are really wasted, then it's something
> else to enhance.

Yes, it is and the task is non-trivial.  Tom Erickson is working on the
dnode consolidator, which will coalesce fragmented dnodes into more
tightly packed slabs so that there's less kernel heap fragmentation.
However, in order to relocate objects, each kmem cache needs to
implement its own subsystem-specific set of move operations.  Take a
look at how many kmem caches there are to get a sense of the amount of
work that's required.
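
As a sketch of why the per-cache move operation matters (Python, toy
model, nothing here is the real kmem interface): consolidate, can_move,
and empty_slabs are hypothetical names. can_move stands in for whatever
subsystem-specific check decides whether an object may be relocated right
now.

```python
def empty_slabs(layout):
    """Number of slabs holding no objects, i.e. slabs that could be freed."""
    return sum(1 for slab in layout if not slab)


def consolidate(slabs, capacity, can_move):
    """Pack objects from the sparsest slabs into the fullest ones.

    slabs    -- list of lists of object ids (toy representation)
    capacity -- objects per slab
    can_move -- hypothetical per-cache callback: can_move(obj) -> bool
    """
    # Work on copies so the caller's layout is left untouched.
    layout = sorted((list(s) for s in slabs), key=len, reverse=True)
    dst, src = 0, len(layout) - 1
    while dst < src:
        if len(layout[dst]) >= capacity:
            dst += 1        # destination is full, use the next slab
            continue
        movable = [o for o in layout[src] if can_move(o)]
        if not movable:
            src -= 1        # nothing left that can leave this slab
            continue
        obj = movable[0]
        layout[src].remove(obj)
        layout[dst].append(obj)
    return layout


slabs = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]

everything_moves = consolidate(slabs, 4, lambda obj: True)
some_pinned = consolidate(slabs, 4, lambda obj: obj not in {"e", "g"})

print(empty_slabs(everything_moves))
print(empty_slabs(some_pinned))
```

With every object movable, half the slabs in this example end up empty.
With two objects pinned, none do, even though the total free space is the
same. Each cache has to supply its own answer to "can this object move,
and how", which is where the work is.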

> One way or another, on any given system with ZFS, there must be a
> certain size of cache memory that can be given back

There is a certain size, but you won't know it until you perform the
reap operation. 

> if we can track the number with a reasonable amount of effort

That's the problem -- there's not a good way to compute this, aside from
performing the reap and figuring out how much additional memory is free.

-j
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org