> The ARC is designed to use as much memory as is available up to a limit.  If
> the kernel allocator needs memory and there is none available, then the
> allocator requests memory back from the zfs ARC. Note that some systems have
> multiple memory allocators.  For example, there may be a memory allocator
> for the network stack, and/or for a filesystem.

Yes, but again I am concerned with what the ARC chooses to cache and
for how long, not how the ARC balances memory with other parts of the
kernel. At least, none of my observations lead me to believe the
latter is the problem here.

> might be pre-allocated.  I assume that you have already read the FreeBSD ZFS
> tuning guide (http://wiki.freebsd.org/ZFSTuningGuide) and the ZFS filesystem
> section in the handbook
> (http://www.freebsd.org/doc/handbook/filesystems-zfs.html) and made sure
> that your system is tuned appropriately.

Yes, I have been tweaking and fiddling and reading off and on since
ZFS was originally added to CURRENT.

This is not about tuning in that sense. The fact that the small
amount of data needed to start an 'urxvt' instance does not stay
cached for even 1-2 seconds on an otherwise mostly idle system is
either the result of cache policy, an implementation bug (FreeBSD or
otherwise), or a matter of an *extremely* small cache size. I have
observed this behavior for a very long time, across versions of both
ZFS and FreeBSD and with different forms of ARC sizing tweaks.
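
For reference, the way I observe this is roughly the following (a
rough sketch, not a rigorous benchmark; 'urxvt -e true' is just an
example of a quick-exiting invocation, and the timings obviously
include X startup cost rather than pure disk reads):

    import subprocess, time

    # Launch the same command a few times with a short idle gap and
    # watch whether start-up latency ever drops to "warm cache" levels.
    CMD = ["urxvt", "-e", "true"]

    for i in range(5):
        t0 = time.time()
        subprocess.call(CMD)
        print("run %d: %.3f s" % (i + 1, time.time() - t0))
        time.sleep(2)  # idle gap; ideally the next run should be warm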

It's entirely possible that there are FreeBSD issues preventing the
ARC from sizing itself appropriately. What I am saying, though, is
that all indications are that data is either not being selected for
caching at all, or is evicted extremely quickly, unless sufficient
"frequency" has accumulated to, presumably, make the ARC decide to
cache the data.
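
Part of what leads me to the "evicted extremely quickly" suspicion is
the ghost-list counters the ARC exports. A minimal sketch of what I
look at, assuming the kstat.zfs.misc.arcstats sysctl names as they
appear on my FreeBSD system (they may differ elsewhere):

    import subprocess

    STATS = [
        "kstat.zfs.misc.arcstats.hits",
        "kstat.zfs.misc.arcstats.misses",
        "kstat.zfs.misc.arcstats.mru_hits",
        "kstat.zfs.misc.arcstats.mfu_hits",
        # ghost hits: blocks requested again shortly after eviction
        "kstat.zfs.misc.arcstats.mru_ghost_hits",
        "kstat.zfs.misc.arcstats.mfu_ghost_hits",
    ]

    for name in STATS:
        value = subprocess.check_output(
            ["sysctl", "-n", name]).decode().strip()
        print("%-45s %s" % (name, value))

If mru_ghost_hits keeps climbing while I repeat the same small
workload, that would, as I understand it, be consistent with data
being evicted again almost immediately after being read.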

This is entirely what I would expect from a caching policy that tries
to adapt to long-term access patterns and to avoid premature cache
eviction by looking at frequency of access. I don't see what is so
outlandish about my query. These are fundamental ways in which caches
of different types behave, and there is a legitimate reason not to
use the same cache eviction policy under all possible workloads. The
behavior I am seeing is consistent with a caching policy that tries
"too hard" (for my particular use case) to avoid eviction in the face
of short-term changes in access pattern.

> There have been a lot of eyeballs looking at how zfs does its caching, and a
> ton of benchmarks (mostly focusing on server throughput) to verify the
> design.  While there can certainly be zfs shortcomings (I have found
> several) these are few and far between.

That's a very general statement. I am talking about specifics here.
For example, you can have mountains of evidence showing that a plain
LRU is "optimal" (under some conditions). That doesn't change the
fact that if I want to prevent a sequential scan of a huge data set
from completely evicting everything in the cache, I cannot use a
plain LRU.
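
To make that concrete, here is a toy sketch (plain Python, nothing to
do with the actual ZFS code) of the failure mode I mean: a small
working set is completely pushed out of a plain LRU cache by a single
sequential scan of a larger data set:

    from collections import OrderedDict

    class LRU(object):
        """Minimal LRU cache: evicts the least recently used key."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()

        def access(self, key):
            hit = key in self.entries
            if hit:
                self.entries.move_to_end(key)         # most recently used
            else:
                self.entries[key] = True
                if len(self.entries) > self.capacity:
                    self.entries.popitem(last=False)  # evict the LRU entry
            return hit

    cache = LRU(capacity=100)
    working_set = ["app-block-%d" % i for i in range(10)]

    for key in working_set:        # warm the cache with the working set
        cache.access(key)
    for i in range(1000):          # one pass over a "huge" data set
        cache.access("scan-block-%d" % i)

    survivors = sum(cache.access(key) for key in working_set)
    print("%d of %d working-set blocks survived the scan"
          % (survivors, len(working_set)))  # prints: 0 of 10

As I understand it, the frequency tracking in the ARC exists
precisely to survive this kind of scan; my point is only that the
same mechanism appears to work against a mostly interactive,
recency-dominated workload like mine.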

In this case I'm looking for the reverse; i.e., increasing the
importance of recency, because for my workload that would be closer
to optimal than the behavior I am observing. Benchmarks are
irrelevant except insofar as they show that my problem is not with
the caching policy, since I am trying to address an empirically
observed behavior.

I *will* try to look at how the ARC sizes itself, as I'm unclear on
several things about how memory is reported by FreeBSD, but as far as
I can tell these are separate issues. Sure, a bigger ARC might hide
the behavior I happen to see; but I want the cache to behave such
that I do not need gigabytes of extra ARC to "lure" it into caching
the data needed by 'urxvt', and do not have to start it 50 times in a
row to accumulate statistics.
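
Concretely, what I plan to check first is how the ARC's actual size
compares to its adaptive target and configured limits; a sketch along
these lines, again assuming the arcstats sysctl names as reported on
my FreeBSD box:

    import subprocess

    def arcstat(name):
        out = subprocess.check_output(
            ["sysctl", "-n", "kstat.zfs.misc.arcstats." + name])
        return int(out.strip())

    MB = 1024.0 * 1024.0
    print("ARC size       : %8.1f MB" % (arcstat("size") / MB))
    print("ARC target (c) : %8.1f MB" % (arcstat("c") / MB))
    print("ARC c_min      : %8.1f MB" % (arcstat("c_min") / MB))
    print("ARC c_max      : %8.1f MB" % (arcstat("c_max") / MB))
    print("MRU target (p) : %8.1f MB" % (arcstat("p") / MB))

If size sits far below c_max even under load, that would point at
FreeBSD-side sizing issues; if it tracks the target and the behavior
persists, that would support the policy explanation.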

-- 
/ Peter Schuller