On 2012-12-05 04:11, Richard Elling wrote:
> On Nov 29, 2012, at 1:56 AM, Jim Klimov <jimkli...@cos.ru> wrote:

>> I've heard a claim that ZFS relies too much on RAM caching, but
>> implements no sort of priorities (indeed, I've seen no knobs to
>> tune those) - so that if the storage box receives many different
>> types of IO requests with different "administrative weights" in
>> the view of admins, it cannot really throttle some IOs to boost
>> others when such IOs have to hit the pool's spindles.

> Caching has nothing to do with QoS in this context. *All* modern
> filesystems cache to RAM, otherwise they are unusable.

Yes, I get that. However, many systems get away with less RAM
than recommended for ZFS rigs (like the ZFS SA with a couple
hundred GB as the starting option), and make their compromises
elsewhere. They have to anyway, and they get different results,
perhaps even better suited to certain narrow or big niches.

Whatever the aggregate result, this difference does lead to
some differing features that The Others' marketing trumpets
praise as an advantage :) - like the ability to mark some
IO traffic as higher priority than other traffic, in one
case (which is now also an Oracle product line, apparently)...

Actually, this question stems from a discussion at a seminar
I've recently attended - which praised ZFS but pointed out its
weaknesses against some other players on the market, so we are
not unaware of those.

>> For example, I might want the corporate webshop-related
>> databases and appservers to be the fastest storage citizens,
>> then some corporate CRM and email, then various lower-priority
>> zones and VMs, and at the bottom of the list - backups.

> Please read the papers on the ARC and how it deals with MFU and
> MRU cache types. You can adjust these policies using the primarycache
> and secondarycache properties at the dataset level.
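For readers unfamiliar with those knobs, the dataset-level cache policies can be set with the standard zfs(1M) property syntax; a minimal sketch (the dataset names here are hypothetical):

```shell
# High-priority database dataset: cache data + metadata in ARC and
# allow eviction into L2ARC too (these are the defaults, shown explicitly):
zfs set primarycache=all tank/db
zfs set secondarycache=all tank/db

# Low-priority backup dataset: cache only metadata in RAM, and keep
# its blocks out of the L2ARC SSDs entirely:
zfs set primarycache=metadata tank/backups
zfs set secondarycache=none tank/backups

# Verify the resulting settings:
zfs get primarycache,secondarycache tank/db tank/backups
```

Both properties accept only the values all, none, or metadata - which is exactly the coarse on-off granularity being discussed below.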

I've read up on that, and don't quite see how much these help
if there is pressure on RAM and cache entries expire...
Meaning, if I want certain datasets to remain cached as long
as possible (i.e. to serve a website or DB from RAM, not HDD), at
the expense of other datasets that might see higher usage but
have lower business priority - how do I do that? Or, perhaps,
could (L2)ARC share, reservation and/or quota concepts be added
for the specific datasets I explicitly want to throttle up or down?

At most, I can now mark the lower-priority datasets' data or
even metadata as not cached in ARC or L2ARC. It is on-off; there
seem to be no intermediate steps, like QoS tags [0-7] or something
along those lines.

BTW, a short side question: is the following statement true or
false - if I set primarycache=metadata, the ZFS ARC won't cache
any "userdata", and thus it won't appear in (expire into) L2ARC?
So the real choice is that I can cache data+meta in RAM and
only meta on SSD, but not the other way around (meta in RAM but
both data+meta on SSD)?
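As I understand the current implementation (and this is my reading, not a definitive answer), the L2ARC is populated from blocks evicted out of the ARC, so data excluded from primarycache can never reach secondarycache. In zfs(1M) terms, with a hypothetical dataset name:

```shell
# Expressible: data + metadata in ARC, only metadata allowed into L2ARC.
zfs set primarycache=all tank/ds
zfs set secondarycache=metadata tank/ds

# Not really expressible the other way around: with metadata-only ARC,
# userdata is never cached in RAM, so it is never evicted into L2ARC,
# regardless of what secondarycache is set to.
zfs set primarycache=metadata tank/ds
zfs set secondarycache=all tank/ds   # userdata still won't land on the SSD
```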


>> AFAIK, now such requests would hit the ARC, then the disks if
>> needed - in no particular order. Well, can the order be made
>> "particular" with the current ZFS architecture, i.e. by setting
>> some datasets to have a certain NICEness or another priority
>> mechanism?

> ZFS has a priority-based I/O scheduler that works at the DMU level.
> However, there is no system call interface in UNIX that transfers
> priority or QoS information (e.g. read() or write()) into the file
> system VFS interface. So the granularity of priority control is by
> zone or dataset.

I don't think I've seen mention of priority controls per dataset,
at least not in generic ZFS - actually, that was part of my question
above. And while throttling or resource shares between higher-level
software components (zones, VMs) might have a similar effect, this is
not something really controlled and enforced by the storage layer.

> -- richard

Thanks,
//Jim


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
