On Mon, Feb 7, 2011 at 10:26 AM, Roch <roch.bourbonn...@oracle.com> wrote:
>
> On Feb 7, 2011, at 06:25, Richard Elling wrote:
>
>> On Feb 5, 2011, at 8:10 AM, Yi Zhang wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to achieve the same effect of UFS directio on ZFS and here
>>> is what I did:
>>
>> Solaris UFS directio has three functions:
>> 1. improved async code path
>> 2. multiple concurrent writers
>> 3. no buffering
>>
>> Of the three, #1 and #2 were designed into ZFS from day 1, so there is
>> nothing to set or change to take advantage of the feature.
>>
>>>
>>> 1. Set the primarycache of zfs to metadata and secondarycache to none,
>>> recordsize to 8K (to match the unit size of writes)
>>> 2. Run my test program (code below) with different options and measure
>>> the running time.
>>> a) open the file without O_DSYNC flag: 0.11s.
>>> This doesn't seem like directio is in effect, because I tried on UFS
>>> and time was 2s. So I went on with more experiments with the O_DSYNC
>>> flag set. I know that directio and O_DSYNC are two different things,
>>> but I thought the flag would force synchronous writes and achieve what
>>> directio does (and more).
>>
>> Directio and O_DSYNC are two different features.
>>
>>> b) open the file with O_DSYNC flag: 147.26s
>>
>> ouch
>
> How big a file?
> Does the result hold if you don't truncate?
>
> -r
>

The file is 8K*10000, about 80M. I removed the O_TRUNC flag and the results stayed the same...
>>
>>> c) same as b) but also enabled zfs_nocacheflush: 5.87s
>>
>> Is your pool created from a single HDD?
>>
>>> My questions are:
>>> 1. With my primarycache and secondarycache settings, the FS shouldn't
>>> buffer reads and writes anymore. Wouldn't that be equivalent to
>>> O_DSYNC? Why are a) and b) so different?
>>
>> No. O_DSYNC deals with when the I/O is committed to media.
>>
>>> 2. My understanding is that zfs_nocacheflush essentially removes the
>>> sync command sent to the device, which cancels the O_DSYNC flag. Why
>>> are b) and c) so different?
>>
>> No. Disabling the cache flush means that the volatile write buffer in the
>> disk is not flushed. In other words, disabling the cache flush is in direct
>> conflict with the semantics of O_DSYNC.
>>
>>> 3. Does ZIL have anything to do with these results?
>>
>> Yes. The ZIL is used for meeting the O_DSYNC requirements. This has
>> nothing to do with buffering. More details are in the ZFS Best Practices
>> Guide.
>> -- richard
>>
>>>
>>> Thanks in advance for any suggestion/insight!
>>> Yi
>>>
>>>
>>> #include <sys/types.h>
>>> #include <sys/fcntl.h>  /* directio(), DIRECTIO_ON (Solaris) */
>>> #include <fcntl.h>
>>> #include <stdio.h>
>>> #include <unistd.h>
>>> #include <sys/time.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>   if (argc < 2)
>>>     return 1;  /* usage: prog file [1 to enable directio] */
>>>   struct timeval tim;
>>>   gettimeofday(&tim, NULL);
>>>   double t1 = tim.tv_sec + tim.tv_usec/1000000.0;
>>>   char a[8192] = {0};  /* one 8K record, zero-filled */
>>>   int fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0660);
>>>   //int fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC|O_DSYNC, 0660);
>>>   if (fd < 0)
>>>     return 1;
>>>   if (argc > 2 && argv[2][0] == '1')
>>>     directio(fd, DIRECTIO_ON);
>>>   int i;
>>>   for (i=0; i<10000; ++i)  /* 10000 writes of 8K, about 80M total */
>>>     pwrite(fd, a, sizeof(a), (off_t)i*8192);
>>>   close(fd);
>>>   gettimeofday(&tim, NULL);
>>>   double t2 = tim.tv_sec + tim.tv_usec/1000000.0;
>>>   printf("%f\n", t2-t1);
>>>   return 0;
>>> }
>>> _______________________________________________
>>> zfs-discuss mailing list
>>> zfs-discuss@opensolaris.org
>>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss