I think ZFS should look for more opportunities to write to disk rather than leaving it to the last moment (5 seconds), as it appears to do.

For example, if a file has a record size worth of data outstanding, it should be queued within ZFS to be written out. If the record is updated again before the txg, it can be re-queued (if it has already left the queue) and written to the same block or a new block. The write queue would drain when there is spare I/O bandwidth and memory capacity on the disk, as determined through the outstanding I/Os. Once the data is on disk it could be freed for re-use even before the txg has occurred, but the checksum details would need to be recorded first. The txg comes along after X seconds and finds that most of the data writes have already happened and only metadata writes are left to do. One would assume this would help with the delays at txg time talked about in this thread.

The example below shows 28 x 128k writes to the same file before anything is written to disk, and the disks are idle the entire time. There is no cost to writing to disk if the disk is not doing anything or is under capacity. (Not a perfect example.)

At the other end, maybe access time updates should not be written to disk until there is some real data to write, or until 30 minutes have passed, to allow green disks to power down for a while.
(atime=on|off|delay)

Cheers

No dedup on, but compression on.

while sleep 1 ; do echo `dd if=/dev/random of=xxxx bs=128k count=1 2>&1` ; done &
iostat -zxcnT d 1

 us sy wt id
  0  5  0 94
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   53.0    0.0  301.5  0.0  0.2    0.0    3.4   0   4 c5t0d0
    0.0   53.0    0.0  301.5  0.0  0.2    0.0    3.1   0   4 c5t2d0
    0.0   58.0    0.0  127.0  0.0  0.0    0.0    0.1   0   0 c5t1d0
    0.0   58.0    0.0  127.0  0.0  0.0    0.0    0.1   0   0 c5t3d0
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:41 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    3.0    0.0    2.0  0.0  0.0    0.0    0.5   0   0 c5t0d0
    0.0    3.0    0.0    2.0  0.0  0.0    0.0    0.5   0   0 c5t2d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t1d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t3d0
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:42 PM EST
     cpu
 us sy wt id
  1  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:43 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:44 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:45 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:46 PM EST
     cpu
 us sy wt id
  1  4  0 95
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:47 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:48 PM EST
     cpu
 us sy wt id
  0 19  0 80
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:49 PM EST
     cpu
 us sy wt id
  1 27  0 72
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:50 PM EST
     cpu
 us sy wt id
  0  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:51 PM EST
     cpu
 us sy wt id
  1  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:52 PM EST
     cpu
 us sy wt id
  0  4  0 95
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:53 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
Monday, 8 March 2010 02:51:54 PM EST
     cpu
 us sy wt id
  1  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:55 PM EST
     cpu
 us sy wt id
  0  1  0 99
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:56 PM EST
     cpu
 us sy wt id
  1  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:57 PM EST
     cpu
 us sy wt id
  1  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:58 PM EST
     cpu
 us sy wt id
  0  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:51:59 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:00 PM EST
     cpu
 us sy wt id
  1  4  0 95
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:01 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:02 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:03 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:04 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:05 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:06 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:07 PM EST
     cpu
 us sy wt id
  1  4  0 95
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:08 PM EST
     cpu
 us sy wt id
  1  3  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:09 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:10 PM EST
     cpu
 us sy wt id
  1  4  0 95
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   37.0    0.0  140.5  0.0  0.1    0.0    1.9   0   2 c5t0d0
    0.0   37.0    0.0  140.5  0.0  0.1    0.0    1.9   0   2 c5t2d0
    0.0   41.0    0.0   79.5  0.0  0.0    0.0    0.1   0   0 c5t1d0
    0.0   40.0    0.0   79.5  0.0  1.6    0.0   38.8   0  39 c5t3d0
0+1 records in 0+1 records out
Monday, 8 March 2010 02:52:11 PM EST
     cpu
 us sy wt id
  0  4  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t2d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t1d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t3d0
Monday, 8 March 2010 02:52:12 PM EST
     cpu
 us sy wt id
  0  1  0 99
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0+1 records in 0+1 records out

-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
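
A rough way to reason about the early write-out idea in this post is a toy model of one txg interval. The Python sketch below is only an illustration and says nothing about how ZFS is actually implemented: `simulate`, `FlushStats` and `spare_ios_per_tick` are hypothetical names, a "tick" stands for one second, and the disk is assumed to offer a fixed number of idle I/O slots per tick. It counts how many data blocks get written before the txg and how many are left for the txg itself.

```python
from dataclasses import dataclass


@dataclass
class FlushStats:
    early_writes: int = 0  # data blocks written before the txg sync
    sync_writes: int = 0   # data blocks still dirty when the txg sync runs


def simulate(dirtied_per_tick, eager, spare_ios_per_tick=1):
    """Replay one txg interval of a toy write queue (hypothetical model).

    dirtied_per_tick: list of lists; entry t holds the record ids the
        application overwrote during tick t.
    eager: if True, drain queued records whenever the disk has spare I/Os.
    spare_ios_per_tick: idle I/O slots assumed to be available each tick.
    """
    stats = FlushStats()
    dirty = []  # ordered queue of record ids waiting to be written out
    for records in dirtied_per_tick:
        for rec in records:
            # A record updated again is queued once; if it already left
            # the queue it is simply re-queued.
            if rec not in dirty:
                dirty.append(rec)
        if eager:
            budget = spare_ios_per_tick
            while dirty and budget > 0:
                dirty.pop(0)
                stats.early_writes += 1
                budget -= 1
    # Whatever is still queued has to be written by the txg sync.
    stats.sync_writes = len(dirty)
    return stats


if __name__ == "__main__":
    same_record = [["r0"]] * 28  # one record overwritten once per second
    print(simulate(same_record, eager=False))
    print(simulate(same_record, eager=True))
```

In this model, 28 overwrites of a single record leave one data write for the txg without early write-out and none with it, at the price of 28 early writes to an otherwise idle disk. For 28 distinct records the total number of writes is the same either way and only the timing changes.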