I moved my main workspaces over to ZFS a while ago and noticed that my disk got really noisy (yes, one of those subjective measurements). It sounded like the head was being bounced around a lot at the end of each transaction group.
Today I grabbed the iosnoop DTrace script (from <http://www.opensolaris.org/os/community/dtrace/scripts/>) and looked a little at the output. Strangely, it looks as if the blocks are being written to disk in nearly random order.

I have a two-vdev pool, just plain disk slices, no mirroring etc. (I'm not using whole disks because I've just got the two disks in my workstation and my root is still on UFS.) If I use 'dd' to create a 1MB file out of 1KB writes and wait for it to be pushed to disk, one of the two disks sees a block stream like:

    27610929:1
    27610930:3
    27610933:9
    27610942:13
    39425458:13  <-- huh?
    27565952:16  <-- now we've gone backwards
    39400576:16
    27463484:4
    39342412:4
    27581454:2
    39382602:2
    27581456:2
    ...

So the head of this disk is happily bouncing back and forth at this point. (They're FC disks with a reasonably deep queue, so it's not as bad as it could be, but it's still not great.) The other disk is behaving a little better, but still moving back and forth between two block ranges.

Before I find some time to go dig into the intricacies of the I/O scheduler, any hints as to why this might be happening? My intuition is that we ought to be able to write the blocks out in arbitrary order, since it's only the überblock write which commits them, so we should be able to use an always-move-forward ordering (and, of course, let the disk do its own scheduling within that).

Also, why the very small adjacent writes? The first four writes in the snoop pushed out 13K of data using 4 separate write operations, which is wasteful. (There are others too, e.g. towards the end of the excerpt above we're doing two 1K writes to adjacent blocks.) Does the scheduler attempt to perform coalescing as well?

(I should mention that this is S10U2, so there have certainly been fixes since.)
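To make the always-move-forward and coalescing idea concrete, here is a rough Python sketch run over the excerpt above. It is only an illustration of what I'd naively expect, not how the ZFS I/O scheduler actually works. The helper names are made up, the block stream is assumed to be start:count pairs, and the 512-byte block size is an assumption (it is consistent with the first four writes adding up to 13K).

```python
# Hypothetical illustration only; this is NOT the ZFS I/O scheduler.
SECTOR_BYTES = 512  # assumption: iosnoop block numbers are 512-byte sectors


def parse(stream):
    """Turn whitespace-separated 'start:count' tokens into (start, count) tuples."""
    return [tuple(int(x) for x in tok.split(":")) for tok in stream.split()]


def coalesce(writes):
    """Sort writes by start block and merge runs that are exactly adjacent."""
    merged = []
    for start, count in sorted(writes):
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


def head_travel(writes):
    """Total distance (in blocks) from the end of each write to the start of the next."""
    return sum(abs(nxt[0] - (cur[0] + cur[1]))
               for cur, nxt in zip(writes, writes[1:]))


# The excerpt from the iosnoop output above, in the order it was issued.
observed = parse(
    "27610929:1 27610930:3 27610933:9 27610942:13 39425458:13 "
    "27565952:16 39400576:16 27463484:4 39342412:4 "
    "27581454:2 39382602:2 27581456:2"
)
ordered = coalesce(observed)

print(len(observed), "writes as issued, head travel:", head_travel(observed))
print(len(ordered), "writes sorted and merged, head travel:", head_travel(ordered))
```

With this ordering the four small writes at the start collapse into a single 26-block write, and the two adjacent 1K writes near the end collapse into one as well.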