On Dec 5, 2012, at 5:41 AM, Jim Klimov <jimkli...@cos.ru> wrote:

> On 2012-12-05 04:11, Richard Elling wrote:
>> On Nov 29, 2012, at 1:56 AM, Jim Klimov <jimkli...@cos.ru> wrote:
>>
>>> I've heard a claim that ZFS relies too much on RAM caching, but
>>> implements no sort of priorities (indeed, I've seen no knobs to
>>> tune those) - so that if the storage box receives many different
>>> types of IO requests with different "administrative weights" in
>>> the view of admins, it can not really throttle some IOs to boost
>>> others, when such IOs have to hit the pool's spindles.
>>
>> Caching has nothing to do with QoS in this context. *All* modern
>> filesystems cache to RAM, otherwise they are unusable.
>
> Yes, I get that. However, many systems get away with less RAM
> than recommended for ZFS rigs (like the ZFS SA, with a couple
> hundred GB as the starting option), and make their compromises
> elsewhere. They have to anyway, and they get different results,
> perhaps even better suited to certain narrow or big niches.

This is nothing more than a specious argument. They have small caches,
so their performance is not as good as those with larger caches. This
is like saying you need a smaller CPU cache because larger CPU caches
get full.

> Whatever the aggregate result, this difference does lead to some
> differing features that The Others' marketing trumpets praise as
> the advantage :) - like the ability to mark some IO traffic as
> higher priority than other traffic, in one case (which is now also
> an Oracle product line, apparently)...
>
> Actually, this question stems from a discussion at a seminar I
> recently attended - which praised ZFS but pointed out its
> weaknesses against some other players on the market, so we are not
> unaware of those.
>
>>> For example, I might want to have corporate webshop-related
>>> databases and appservers be the fastest storage citizens, then
>>> some corporate CRM and email, then various lower-priority zones
>>> and VMs, and at the bottom of the list - backups.
>>
>> Please read the papers on the ARC and how it deals with MFU and
>> MRU cache types. You can adjust these policies using the
>> primarycache and secondarycache properties at the dataset level.
>
> I've read up on that, and don't exactly see how much these help if
> there is pressure on RAM so that cache entries expire... Meaning,
> if I want certain datasets to remain cached as long as possible
> (i.e. serve a website or DB from RAM, not HDD), at the expense of
> other datasets that might see higher usage but have lower business
> priority - how do I do that? Or, perhaps, add (L2)ARC shares,
> reservations and/or quota concepts to certain datasets which I
> explicitly want to throttle up or down?

MRU evictions take precedence over MFU evictions. If the data is not
in MFU, then it is, by definition, not being frequently used.

> At most, now I can mark the lower-priority datasets' data or even
> metadata as not cached in ARC or L2ARC. On-off. There seem to be no
> smaller steps, like QoS tags [0-7] or something like that.
>
> BTW, as a short side question: is it a true or false statement
> that, if I set primarycache=metadata, then the ZFS ARC won't cache
> any "userdata" and thus it won't appear in (expire into) L2ARC? So
> the real setting is that I can cache data+meta in RAM and only meta
> on SSD, but not the other way around (meta in RAM but both
> data+meta on SSD)?

That is correct, by my reading of the code.
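In command form, the asymmetry looks like this (a sketch; the pool and
dataset names are hypothetical):

    # cache data+metadata in RAM, but only metadata on the L2ARC SSD
    # -- this combination works:
    zfs set primarycache=all tank/webshop
    zfs set secondarycache=metadata tank/webshop

    # turn caching off entirely for a low-priority dataset -- the
    # only "throttle" available today, and it is on/off, not graduated:
    zfs set primarycache=none tank/backups
    zfs set secondarycache=none tank/backups

    # the reverse combination does not do what it appears to: with
    # primarycache=metadata, user data never enters the ARC, so it
    # can never be evicted into the L2ARC, whatever secondarycache says:
    zfs set primarycache=metadata tank/archive
    zfs set secondarycache=all tank/archive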
>>> AFAIK, now such requests would hit the ARC, then the disks if
>>> needed - in no particular order. Well, can the order be made
>>> "particular" with the current ZFS architecture, i.e. by setting
>>> some datasets to have a certain NICEness or another priority
>>> mechanism?
>>
>> ZFS has a priority-based I/O scheduler that works at the DMU level.
>> However, there is no system call interface in UNIX that transfers
>> priority or QoS information (eg read() or write()) into the file
>> system VFS interface. So the granularity of priority control is by
>> zone or dataset.
>
> I do not think I've seen mention of priority controls per dataset,
> at least not in generic ZFS. Actually, that was part of my question
> above. And while throttling or resource shares between higher-level
> software components (zones, VMs) might have a similar effect, this
> is not something really controlled and enforced by the storage
> layer.

The priority scheduler is by type of I/O request. For example, sync
requests have priority over async requests. Reads and writes have
priority over scrubbing, etc.
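You can observe this from userland: start a scrub, then apply a normal
application load to the pool, and the scrub rate drops while the
application I/O proceeds. A rough illustration (the pool name is
hypothetical, and the load generator is whatever you have handy):

    # start a scrub -- the lowest-priority class of I/O
    zpool scrub tank

    # now generate competing application I/O against the pool (a
    # database run, filebench, dd, ...), and watch the scrub slow
    # down while the application I/O is serviced:
    zpool iostat -v tank 5
    zpool status tank    # the reported scrub rate falls under load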
The inter-dataset scheduling is done at the zone level. There is more
work being done in this area, but it is still in the research phase.
 -- richard

--
richard.ell...@richardelling.com
+1-760-896-4422