Blue Thunder Somogyi wrote:
I'm curious if there has been any discussion of or work done toward implementing storage classing within zpools (this would be similar to the Storage Foundation QoSS feature).
There has been some discussion. AFAIK, there is no significant work in progress. This problem is far more complex to solve than it may first appear.
I've searched the forum and inspected the documentation looking for a way to do this, and haven't found anything, so pardon the post if this is redundant/superfluous. I would imagine this would require something along the lines of: a) the ability to categorize devices in a zpool by their "class of storage", perhaps with a numeric rating or otherwise, with the idea that the fastest disks get a "1" and the slowest get a "9" (or whatever the largest number of supported tiers would be)
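As a rough sketch of what (a) implies -- nothing like this exists in ZFS today, and the structure and names below are invented purely for illustration -- a per-vdev rating might look like:

/* Hypothetical sketch only -- not actual ZFS code or an on-disk format. */
#include <stdio.h>
#include <stdint.h>

/* Invented per-vdev annotation: lower number = faster tier. */
typedef struct vdev_class {
	char	vc_name[32];	/* device name, e.g. "c0t0d0" */
	uint8_t	vc_tier;	/* 1 = fastest ... 9 = slowest */
} vdev_class_t;

int
main(void)
{
	vdev_class_t pool[] = {
		{ "c0t0d0", 1 },	/* 15k RPM FC spindle */
		{ "c1t0d0", 1 },
		{ "c2t0d0", 9 },	/* 7200 RPM SATA spindle */
	};
	size_t i;

	for (i = 0; i < sizeof (pool) / sizeof (pool[0]); i++)
		printf("%-8s tier %u\n", pool[i].vc_name,
		    (unsigned)pool[i].vc_tier);
	return (0);
}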
This gets more complicated when devices are very asymmetric in performance. For a current example, consider an NVRAM-backed RAID array. Writes tend to complete very quickly, regardless of the offset, but reads can vary widely and may be an order of magnitude slower. However, this will not be consistent, as many of these arrays also cache reads (much like JBOD track-buffer caches). Today, some devices can demonstrate two or more orders of magnitude difference between read and write latency.
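To make that concrete, any rating scheme would probably need to track read and write latency separately, because a single scalar hides the asymmetry. A toy illustration with made-up numbers (not measurements from any real array):

/* Toy illustration; the latencies below are assumed, not measured. */
#include <stdio.h>

int
main(void)
{
	/* Hypothetical NVRAM-backed array: writes land in cache,  */
	/* reads that miss the cache go to the spindles.           */
	double write_us[] = { 50.0, 60.0, 55.0, 45.0 };
	double read_us[]  = { 60.0, 8000.0, 7500.0, 70.0 };
	double wsum = 0.0, rsum = 0.0;
	int i;

	for (i = 0; i < 4; i++) {
		wsum += write_us[i];
		rsum += read_us[i];
	}
	printf("avg write %.0f us, avg read %.0f us (%.0fx apart)\n",
	    wsum / 4, rsum / 4, (rsum / 4) / (wsum / 4));
	return (0);
}

Which of those two numbers (or which percentile of them) should feed a "1" through "9" rating is already a policy question.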
b) leveraging the copy-on-write nature of ZFS: when data is modified, the new copy would be sent to whichever devices were appropriate given statistical information about that data's access/modification frequency. Not being familiar with ZFS internals, I don't know if there would be a way of taking advantage of the ARC's knowledge of access frequency.
I think the data is there. This gets further complicated when a vdev shares a resource with another vdev. A shared resource may not be visible to Solaris at all, so any policy Solaris set would rest on incorrect assumptions about the real resource constraints, and could easily do the wrong thing.
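If the ARC statistics could be exported to an allocation policy, the decision itself is simple enough to sketch -- the hard part is everything the policy cannot see. The function, threshold, and "access count" below are invented for illustration only:

/* Hypothetical placement sketch -- stand-ins for whatever the ARC     */
/* could actually export.  Note the policy knows nothing about shared  */
/* back-end resources, which is exactly the blind spot noted above.    */
#include <stdio.h>
#include <stdint.h>

#define HOT_THRESHOLD	100	/* accesses since last rewrite (made up) */

/* Pick a tier for the new (COW) copy of a block. */
static int
pick_tier(uint64_t access_count)
{
	return (access_count >= HOT_THRESHOLD ? 1 : 9);
}

int
main(void)
{
	uint64_t counts[] = { 3, 250, 7000, 12 };
	size_t i;

	for (i = 0; i < sizeof (counts) / sizeof (counts[0]); i++)
		printf("block accessed %llu times -> tier %d\n",
		    (unsigned long long)counts[i], pick_tier(counts[i]));
	return (0);
}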
c) It seems to me there would need to be some trawling of the storage tiers (probably only the fastest, since the COW migration of frequently accessed data to fast disk has no analogous inherent mechanism for moving idle data down a tier) to locate data that is gathering cobwebs and stage it down to an appropriate tier. Obviously it would be nice to have as much data as possible on the fastest disks, while leaving all the free space on the dog disks, but you would also want to avoid any "write twice" behavior (not enough space on the appropriate tier, so data is staged to a slower tier and later migrated up to faster disk) caused by the fastest tier being overfull.
When I follow this logical progression, I arrive at SAM-FS. Perhaps it is better to hook ZFS into SAM-FS?
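For what it's worth, the scrubber in (c) is easy to sketch; the thresholds are the whole problem. The high-water mark and idle age below are invented, and a real implementation would end up carrying SAM-FS-style policy knobs:

/* Hypothetical demotion-scan sketch -- thresholds and structures are */
/* invented for illustration; this is not ZFS or SAM-FS code.         */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define FAST_TIER_HIGH_WATER	80	/* % full before demoting at all */
#define COLD_AGE_SECS		(30.0 * 24 * 3600)	/* ~30 days idle */

typedef struct blk {
	uint64_t	b_id;
	time_t		b_last_access;
	int		b_tier;		/* 1 = fast, 9 = slow */
} blk_t;

int
main(void)
{
	time_t now = time(NULL);
	int fast_pct_used = 85;		/* pretend the fast tier is 85% full */
	blk_t blocks[] = {
		{ 1, now - 60, 1 },			/* hot            */
		{ 2, now - 90 * 24 * 3600, 1 },		/* idle ~90 days  */
	};
	size_t i;

	/* Below the high-water mark, leave data alone: no "write twice". */
	if (fast_pct_used < FAST_TIER_HIGH_WATER)
		return (0);

	for (i = 0; i < sizeof (blocks) / sizeof (blocks[0]); i++) {
		if (blocks[i].b_tier == 1 &&
		    difftime(now, blocks[i].b_last_access) > COLD_AGE_SECS)
			printf("block %llu: stage down to a slower tier\n",
			    (unsigned long long)blocks[i].b_id);
	}
	return (0);
}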
While zpools are great for dealing with large volumes of data with integrity and minimal management overhead, I've remained concerned about the inability to control where data lives when using different types of storage, e.g. a mix of FC and SATA disk in the extreme, mirror vs. RAID-Z2, or something as subtle as high-RPM small spindles vs. low-RPM large spindles.
There is no real difference in performance based on the interface: FC vs. SATA. So it would be a bad idea to base a policy on the interface type.
For instance, if you had a database that you know has 100GB of dynamic data and 900GB of more stable data, with the above capabilities you could allocate the appropriate ratio of FC and SATA disk and be confident that the data would naturally migrate to its appropriate underlying storage. Of course there are ways of using multiple zpools with the different storage types and tablespaces to locate the data onto the appropriate zpool, but this undermines the "minimal management" appeal of ZFS.
The people who tend to really care about performance will do what is needed to get performance, and that doesn't include intentionally using slow devices. Perhaps you are thinking of a different market demographic?
Anyhow, just curious if this concept has come up before and if there are any plans around it (or something similar).
Statistically, it is hard to beat stochastically spreading wide and far.
 -- richard