Hi all,

This is just an off-the-cuff idea at the moment, but I would like to sound it out.
Consider the situation where someone has a large amount of off-site data storage (of the order of 100s of TB or more), reached over a slow network link. My idea is that this storage could be used to build the main vdevs for a ZFS pool. On top of this, an array of disks (of the order of TBs to 10s of TB) is available locally, which could be used as L2ARC. There are also smaller, faster arrays (of the order of 100s of GB) which, in my mind, could be used as a ZIL. (A rough sketch of the zpool commands for this layout is in the P.S. below.)

In this theoretical situation, in-play read data is kept in the L2ARC and can be accessed about as fast as if that array were the main pool vdevs. Written data goes to the ZIL and is then sent down the slow link to the off-site storage. Rarely used data is still available as if it were on site (it shows up in the same file structure), but is effectively "archived" to the off-site storage.

Now, here comes the problem. According to what I have read, the maximum useful size for the ZIL is approx. 50% of the physical memory in the system, which would be too small for this particular situation. Also, you cannot mirror the L2ARC, which would have dire performance consequences in the case of a disk failure in the L2ARC. I also believe (correct me if I am wrong) that the L2ARC is invalidated on reboot, so it would have to "warm up" again. And finally, if the network link were to die, I am assuming the entire zpool would become unavailable. This is a setup I can see many use cases for, but it introduces too many failure modes.

What I would like to see is an extension to ZFS's hierarchical storage environment, such that an additional layer can be put behind the main pool vdevs as an "archive" store, i.e. the hierarchy becomes [ARC] -> [L2ARC/ZIL] -> [main] -> [archive]. Infrequently used files/blocks could be pushed down into this storage, yet still appear to be available as normal. It would, for example, allow old snapshot data, which is very rarely going to be used, or files which must be kept for legal reasons, to be pushed down. It would also utilise the available bandwidth more efficiently, as only data specifically destined for the archive would need transferring. If the archive storage became unavailable, there would be a number of possible actions (e.g. error on access, block on access, or make the affected files "disappear" temporarily). (A purely hypothetical sketch of what the administration could look like is also in the P.S.)

I know there are already solutions out there which do similar jobs. The company I work for uses one which pushes "archive" data to a tape stacker and pulls it back when accessed. But I think this is a ripe candidate for becoming part of the ZFS stack.

So, what does everyone think?

Rgds,
Karl
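P.S. For concreteness, here is roughly how today's pieces would be assembled. This is only a sketch: the device names are placeholders, and I am assuming the remote storage shows up locally as iSCSI LUNs (c3t*):

    # Main vdevs on the slow, remote LUNs:
    zpool create tank mirror c3t0d0 c3t1d0
    # Local TB-scale disks as L2ARC (cache devices cannot be mirrored):
    zpool add tank cache c1t0d0 c1t1d0
    # Small, fast local array as a mirrored slog for the ZIL:
    zpool add tank log mirror c2t0d0 c2t1d0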
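And for the archive layer I am proposing, the administration might look something like the following. To be clear, this syntax is entirely made up; nothing like it exists in ZFS today, and the property names are just illustrations of the policy knobs I have in mind. Here the local disks (c1t*) would form the main pool, and the remote LUNs (c3t*) would become the archive tier behind it:

    # Hypothetical: attach the remote LUNs as an "archive" tier
    # behind the main vdevs:
    zpool add tank archive c3t0d0 c3t1d0
    # Hypothetical per-dataset policy: push blocks untouched for 90 days
    # down a tier, and block (rather than error) on access if the
    # archive tier is unreachable:
    zfs set archive:migrate-after=90d tank/projects
    zfs set archive:on-unavailable=block tank/projects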