> But it seems that when we're talking about full block writes (such as
> sequential file writes) ZFS could do a bit better.
>
> And as long as there is bandwidth left to the disk and the controllers, it
> is difficult to argue that the work is redundant. If it's free in that
> sense, it doesn't matter whether it is redundant. But if it turns out NOT
> to have been redundant you save a lot.
I think this is why an adaptive algorithm makes sense. In situations where an application issues frequent, progressive small writes, the amount of redundant disk access can be significant, and longer consolidation times may make sense. Larger writes (>= the FS block size) would benefit less from longer consolidation times, and shorter thresholds could provide more usable bandwidth.

To get a sense of the issue here, I've done some write testing to previously written files in a ZFS file system, and the choice of write element size shows some big swings in actual vs data-driven bandwidth.

When I launch a set of threads, each of which writes 4KB buffers sequentially to its own file, I observe that for 60GB of application writes, the disks see 230+GB of IO (reads and writes):

    data-driven BW =~ 41 MB/sec   (my 60GB in ~1500 sec)
    actual BW      =~ 157 MB/sec  (the 230+GB in ~1500 sec)

If I do the same writes with 128KB buffers (the block size of my pool), the same 60GB of writes only generates 95GB of disk IO (reads and writes):

    data-driven BW =~ 85 MB/sec    (my 60GB in ~700 sec)
    actual BW      =~ 134.6 MB/sec (the 95+GB in ~700 sec)

In the first case, longer consolidation times would have led to less total IO and better data-driven BW, while in the second case shorter consolidation times would have worked better.

As far as redundant writes possibly occupying free bandwidth (and thus costing nothing): I think you also have to consider the related costs of additional block scavenging, and less available free space at any specific instant, possibly limiting the sequentiality of the next write ... of course there's also the additional device stress.

In any case, I agree with you that ZFS could do a better job in this area, but it's not as simple as just looking for large or small IOs ...
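The figures above can be checked with a quick back-of-envelope sketch (Python here purely for illustration). The IO totals and run times are the rough numbers from the tests; I'm assuming "GB" means GiB, which is why a couple of the derived numbers land a point or two off the quoted ones:

```python
# Back-of-envelope reproduction of the bandwidth arithmetic from the tests.
# Assumption: "GB" is treated as GiB (1024 MiB); the post's reported figures
# differ slightly for the 128KB case, presumably due to rounded run times.

def bandwidth_mb_s(gib_moved: float, seconds: float) -> float:
    """MiB/sec for a given amount of data moved over a given wall time."""
    return gib_moved * 1024 / seconds

# Case 1: 4KB sequential rewrites -- 60GB app data, ~230GB disk IO, ~1500 sec
data_bw_4k = bandwidth_mb_s(60, 1500)      # ~41 MB/sec, matches the post
actual_bw_4k = bandwidth_mb_s(230, 1500)   # ~157 MB/sec, matches the post
io_amplification_4k = 230 / 60             # ~3.8x as much disk IO as app data

# Case 2: 128KB (pool block size) rewrites -- 60GB app data, ~95GB disk IO, ~700 sec
data_bw_128k = bandwidth_mb_s(60, 700)     # ~88 MB/sec (post reports ~85)
actual_bw_128k = bandwidth_mb_s(95, 700)   # ~139 MB/sec (post reports ~134.6)
io_amplification_128k = 95 / 60            # ~1.6x

print(f"4KB writes:   {io_amplification_4k:.1f}x IO amplification, "
      f"{data_bw_4k:.0f} MB/s data-driven vs {actual_bw_4k:.0f} MB/s actual")
print(f"128KB writes: {io_amplification_128k:.1f}x IO amplification, "
      f"{data_bw_128k:.0f} MB/s data-driven vs {actual_bw_128k:.0f} MB/s actual")
```

The ~3.8x versus ~1.6x amplification gap is the crux of the argument: the sub-blocksize writes generate far more disk traffic per byte of application data, so they are the case where longer consolidation would pay off.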
Sequential vs random access patterns also play a big role (as you point out). I expect (hope) the adaptive algorithms will mature over time, eventually providing better behavior over a broader set of operating conditions ...

Bill

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss