And therein lies the issue. The excessive load that causes the IO problems is almost always generated locally, either by a scrub or by a recursive "ls" run on the box to warm the SSD-based zpool cache with metadata. Regular network IO to the box is minimal and very read-centric; once we load the box up with archived data (which generally happens in a short amount of time), we simply serve it out as needed.
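For concreteness, the warm-up pass I'm talking about is nothing fancier than a metadata walk along these lines ("/tank" is just a stand-in for our pool name):

    # stat everything on the pool, pulling metadata into the ARC
    # (and, over time, onto the L2ARC cache devices)
    find /tank -ls > /dev/null

    # or equivalently
    ls -lR /tank > /dev/null

That by itself is enough to drive the disks hard for a while on a pool with this many files.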
As far as queueing goes, I would expect the system to queue bursts of IO in memory with appropriate timeouts, as required. Those timeouts could be adjusted either manually or auto-magically to cope with the slower storage hardware. Obviously, sustained intense IO would eventually blow out the queue, so the goal here is to avoid creating those situations in the first place. We can throttle the network IO if needed; what I really need is for the OS to know its own local IO limits and not overwork itself during scrubs and the like, perhaps via something like the tunables sketched below.
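My (possibly dated) understanding is that there are scrub-throttling knobs in this area; the names and defaults seem to vary between builds, so treat this as a rough sketch rather than gospel, and the value of 4 is just an illustration:

    # /etc/system -- persistent across reboots; delays each scrub IO
    # by the given number of ticks when the pool is otherwise busy
    set zfs:zfs_scrub_delay = 4

    # or poke it live on a running system with mdb
    echo zfs_scrub_delay/W0t4 | mdb -kw

If someone can confirm which of these tunables actually exist and behave sanely on recent bits, that would go a long way toward keeping scrubs from starving everything else.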