On 6/26/06, Neil Perrin <[EMAIL PROTECTED]> wrote:
Robert Milkowski wrote On 06/25/06 04:12,:
> Hello Neil,
>
> Saturday, June 24, 2006, 3:46:34 PM, you wrote:
>
> NP> Chris,
>
> NP> The data will be written twice on ZFS using NFS. This is because NFS
> NP> on closing the file internally uses fsync to cause the writes to be
> NP> committed. This causes the ZIL to immediately write the data to the
> NP> intent log. Later the data is also committed as part of the pool's
> NP> transaction group commit, at which point the intent log blocks are freed.
>
> NP> It does seem inefficient to doubly write the data. In fact for blocks
> NP> larger than zfs_immediate_write_sz (was 64K but now 32K after 6440499
> NP> was fixed) we write the data block and also an intent log record with
> NP> the block pointer. During txg commit we link this block into the pool
> NP> tree. By experimentation we found 32K to be the (current) cutoff point.
> NP> As the nfsds write at most 32K, they do not benefit from this.
>
> Is 32KB easily tuned (mdb?)?

I'm not sure. NFS folk?
I think he is referring to the zfs_immediate_write_sz variable, but NFS will support larger block sizes as well. Unfortunately the maximum IP datagram size is 64k, so after headers are taken into account the largest useful value is about 60k.

If this is to be laid out as an indirect write, will it be written as 32k+16k+8k+4k blocks? If so, that seems like it would be quite inefficient for RAID-Z, and writes would best be left at 32k.
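For anyone who wants to experiment with the cutoff, the usual approaches for this sort of kernel tunable would be roughly the following (an untested sketch; I'm assuming the variable is still named zfs_immediate_write_sz, holds a byte count, and is a 64-bit value on a 64-bit kernel, hence the /Z write format):

    # print the current value (8-byte decimal) from the running kernel
    echo 'zfs_immediate_write_sz/E' | mdb -k

    # poke a new value into the live kernel (the 0t prefix means decimal)
    echo 'zfs_immediate_write_sz/Z 0t65536' | mdb -kw

and, to make it persist across reboots, a line like this in /etc/system:

    set zfs:zfs_immediate_write_sz = 65536

Chris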