bug fix below...

On Dec 5, 2012, at 1:10 PM, Richard Elling <richard.ell...@gmail.com> wrote:
> On Dec 5, 2012, at 7:46 AM, Matt Van Mater <matt.vanma...@gmail.com> wrote:

>> I don't have anything significant to add to this conversation, but wanted to chime in that I also find the concept of a QOS-like capability very appealing and that Jim's recent emails resonate with me. You're not alone! I believe there are many use cases where a granular prioritization that controls how ARC, L2ARC, ZIL and the underlying vdevs are used to give priority IO to a specific zvol, share, etc. would be useful. My experience is stronger on the networking side, and I envision a weighted class-based queuing methodology (or something along those lines). I recognize that ZFS's architectural preference for coalescing writes and reads into larger sequential batches might conflict with a QOS-like capability... Perhaps ARC/L2ARC tuning might be a good starting point towards that end?

> At present, I do not see async write QoS as being interesting. That leaves sync writes and reads as the managed I/O. Unfortunately, with HDDs, the variance in response time >> queue management time, so the results are less useful than the case with SSDs. Control theory works, once again. Sync writes are often latency-sensitive and thus have the highest priority. Reads have lower priority, with prefetch reads at lower priority still.

>> On a related note (maybe?) I would love to see pool-wide settings that control how aggressively data is added/removed from ARC, L2ARC, etc.

> Evictions are done on an as-needed basis. Why would you want to evict more than needed? So you could fetch it again?

> Prefetching can be more aggressive, but we actually see busy systems disabling prefetch to improve interactive performance. Queuing theory works, once again.

>> Something that would accelerate the warming of a cold pool of storage, or be more aggressive in adding/removing cached data on a volatile dataset (e.g. where virtual machines are turned on/off frequently). I have heard that some of these defaults might be changed in some future release of Illumos, but haven't seen any specifics saying that the idea is nearing fruition in release XYZ.

> It is easy to warm data (dd), even to put it into MRU (dd + dd). For best performance with VMs, MRU works extremely well, especially with clones.

Should read: It is easy to warm data (dd), even to put it into MFU (dd + dd). For best performance with VMs, MFU works extremely well, especially with clones.
 -- richard
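For concreteness, a minimal sketch of the dd warm-up trick referred to in the corrected paragraph above; the file path and block size are placeholders, not taken from the thread:

  # first sequential read pulls the file's blocks into the ARC (MRU)
  dd if=/tank/vm/disk0.img of=/dev/null bs=1024k
  # reading the same blocks a second time is what promotes them to MFU
  dd if=/tank/vm/disk0.img of=/dev/null bs=1024k

  # the prefetch-disabling mentioned earlier is normally done on illumos via
  # the zfs_prefetch_disable tunable, e.g. "set zfs:zfs_prefetch_disable = 1"
  # in /etc/system (noted here as context, not something stated in the thread)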
> There are plenty of good ideas being kicked around here, but remember that to support things like QoS at the application level, the applications must be written to an interface that passes QoS hints all the way down the stack. Lacking these interfaces means that QoS needs to be managed by hand... and that management must be worth the effort.
> -- richard

>> Matt

>> On Wed, Dec 5, 2012 at 10:26 AM, Jim Klimov <jimkli...@cos.ru> wrote:
>> On 2012-11-29 10:56, Jim Klimov wrote:

>> For example, I might want to have corporate webshop-related databases and appservers be the fastest storage citizens, then some corporate CRM and email, then various lower-priority zones and VMs, and at the bottom of the list - backups.

>> On a side note, I'm now revisiting old ZFS presentations collected over the years, and one of them suggested, as a "TBD" item, that metaslabs with varying speeds could be used for specific tasks, and not only to receive the first allocations so that a new pool performs quickly. I.e. "TBD: Workload specific freespace selection policies".

>> Say, I create a new storage box and lay out some bulk-file, backup and database datasets. Even as they are receiving their first bytes, I have some idea of the kind of performance I'd expect from them - with QoS per dataset I might destine the databases to the fast LBAs (and the smaller seeks between tracks I expect to use frequently), and the bulk data onto slower tracks right from the start, while the rest of the unspecified data would grow around the middle of the allocation range.

>> These types of data would then only "creep" onto the less fitting metaslabs (faster for bulk, slower for DB) if the target ones run out of free space. Then the next-best-fitting would be used...

>> This idea is somewhat reminiscent of hierarchical storage management, except that it is about static allocation at write time and takes place within a single disk (or set of similar disks), in order to provide different performance for different tasks.

>> ///Jim

--
richard.ell...@richardelling.com
+1-760-896-4422
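As of this thread there is no allocation-policy knob like the one Jim sketches. Purely as an illustration of where such per-dataset hints could be recorded today, ordinary ZFS user properties can tag datasets with an intended tier; the property name, values and dataset names below are invented for the example, and nothing in the allocator consults them:

  # hypothetical tags only: ZFS stores arbitrary "module:property" user
  # properties, but no current code path acts on these values
  zfs set com.example:alloc-bias=fast tank/db
  zfs set com.example:alloc-bias=slow tank/backup

  # user properties are inherited, so new datasets/zvols under tank/db pick
  # up the tag; a management script could read the hints back with:
  zfs get -r -o name,value com.example:alloc-bias tank

Acting on such tags would be up to an external tool (or a future allocator), not ZFS itself.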
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss