[zfs-discuss] Shrink the slice used for zpool?
Hi,

I recently installed OpenSolaris 2009.06 on a 10GB primary partition on my
laptop. The installer didn't offer any option for customizing the slices
inside the Solaris partition, so after installation there was only a single
slice (0) occupying the entire partition. Now the problem is that I need to
set up a UFS slice for my development. Is there a way to shrink slice 0
(the backing storage for the zpool) and make room for a new slice to be
used for UFS?

I also tried to create UFS on another primary DOS partition, but apparently
only one Solaris partition is allowed per disk, so that failed...

Thanks!
Yi Zhang
Re: [zfs-discuss] Shrink the slice used for zpool?
On Mon, Feb 15, 2010 at 1:48 PM,  wrote:
>> Hi,
>>
>> I recently installed OpenSolaris 2009.06 on a 10GB primary partition on
>> my laptop. I noticed there wasn't any option for customizing the slices
>> inside the Solaris partition. After installation, there was only a
>> single slice (0) occupying the entire partition. Now the problem is
>> that I need to set up a UFS slice for my development. Is there a way to
>> shrink slice 0 (the backing storage for the zpool) and make room for a
>> new slice to be used for UFS?
>>
>> I also tried to create UFS on another primary DOS partition, but
>> apparently only one Solaris partition is allowed per disk, so that
>> failed...
>
> Can you create a zvol and use that for UFS? Slow, but ...
>
> Casper

Casper, thanks for the tip! Actually I'm not sure this would work for me.
I wanted to use directio to bypass the file system cache when reading and
writing files; that's why I chose UFS instead of ZFS. If I create UFS on
top of a zvol, I'm not sure a call to directio() would actually do its
work...

Yi
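For reference, a minimal sketch of Casper's zvol-plus-UFS suggestion might
look like the following; the pool name rpool, volume name ufsvol, the 2G
size, and the mount point are only placeholder assumptions, not from the
thread:

    # create a small zvol and build UFS on top of it (names/sizes are examples)
    zfs create -V 2g rpool/ufsvol
    newfs /dev/zvol/rdsk/rpool/ufsvol    # newfs asks for confirmation
    mkdir -p /ufsdev
    mount -F ufs /dev/zvol/dsk/rpool/ufsvol /ufsdev

Directio on UFS-on-zvol still ends up going through the zvol's own code
path, which is part of why Casper hedges with "Slow, but ...".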
Re: [zfs-discuss] Shrink the slice used for zpool?
Thank you, Darren and Richard. I think this gives me what I wanted.

Yi

On Mon, Feb 15, 2010 at 3:13 PM, Darren J Moffat wrote:
> On 15/02/2010 19:15, Yi Zhang wrote:
>>> Can you create a zvol and use that for UFS? Slow, but ...
>>>
>>> Casper
>>
>> Casper, thanks for the tip! Actually I'm not sure this would work for
>> me. I wanted to use directio to bypass the file system cache when
>> reading and writing files; that's why I chose UFS instead of ZFS. If I
>> create UFS on top of a zvol, I'm not sure a call to directio() would
>> actually do its work...
>
> Why not just use ZFS and set the similar option on the ZFS dataset:
>
> zfs set primarycache=metadata
>
> That is a close approximation to the UFS directio() feature of bypassing
> the filesystem cache for data.
>
> --
> Darren J Moffat
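As a concrete illustration of Darren's suggestion, the caching properties
are set per dataset, roughly like this (the dataset name tank/test is only
an example):

    zfs set primarycache=metadata tank/test    # cache only metadata in the ARC
    zfs set secondarycache=none tank/test      # skip the L2ARC for this dataset
    zfs get primarycache,secondarycache tank/test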
[zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
Hi all,

I'm trying to achieve the same effect as UFS directio on ZFS, and here is
what I did:

1. Set primarycache=metadata and secondarycache=none on the ZFS dataset,
   and recordsize=8K (to match the unit size of my writes).
2. Run my test program (code below) with different options and measure the
   running time.
   a) Open the file without the O_DSYNC flag: 0.11s. This doesn't look
      like directio is in effect, because the same test on UFS took 2s. So
      I went on with more experiments with the O_DSYNC flag set. I know
      that directio and O_DSYNC are two different things, but I thought
      the flag would force synchronous writes and achieve what directio
      does (and more).
   b) Open the file with the O_DSYNC flag: 147.26s.
   c) Same as b) but with zfs_nocacheflush enabled: 5.87s.

My questions are:

1. With my primarycache and secondarycache settings, the FS shouldn't
   buffer reads and writes anymore. Wouldn't that be equivalent to
   O_DSYNC? Why are a) and b) so different?
2. My understanding is that zfs_nocacheflush essentially removes the sync
   command sent to the device, which cancels the O_DSYNC flag. Why are b)
   and c) so different?
3. Does the ZIL have anything to do with these results?

Thanks in advance for any suggestion/insight!
Yi

#include <sys/types.h>
#include <sys/time.h>
#include <fcntl.h>      /* open(), and directio()/DIRECTIO_ON on Solaris */
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    struct timeval tim;
    gettimeofday(&tim, NULL);
    double t1 = tim.tv_sec + tim.tv_usec/1000000.0;
    char a[8192];
    int fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0660);
    //int fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC|O_DSYNC, 0660);
    if (argv[2][0] == '1')
        directio(fd, DIRECTIO_ON);   /* honored by UFS; not supported by ZFS */
    int i;
    for (i = 0; i < 10000; ++i)      /* 10000 x 8K writes = ~80MB */
        pwrite(fd, a, sizeof(a), i*8192);
    close(fd);
    gettimeofday(&tim, NULL);
    double t2 = tim.tv_sec + tim.tv_usec/1000000.0;
    printf("%f\n", t2 - t1);
}
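For anyone trying to reproduce this, the program above should build and run
with something along these lines; the compiler invocation, source file
name, and test paths are assumptions, and the second argument only enables
directio(3C) on UFS:

    cc -o iotest iotest.c
    ./iotest /tank/test/datafile 0    # ZFS dataset, directio off
    ./iotest /ufsdev/datafile 1       # UFS, directio on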
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 12:25 AM, Richard Elling wrote:
> On Feb 5, 2011, at 8:10 AM, Yi Zhang wrote:
>
>> Hi all,
>>
>> I'm trying to achieve the same effect as UFS directio on ZFS, and here
>> is what I did:
>
> Solaris UFS directio has three functions:
>   1. improved async code path
>   2. multiple concurrent writers
>   3. no buffering
>

Thanks for the comments, Richard. All I want is to achieve #3 on ZFS. But
as I said, apparently 2.a) below didn't give me that. Do you have any
suggestion?

> Of the three, #1 and #2 were designed into ZFS from day 1, so there is
> nothing to set or change to take advantage of the feature.
>
>> 1. Set primarycache=metadata and secondarycache=none on the ZFS dataset,
>>    and recordsize=8K (to match the unit size of my writes).
>> 2. Run my test program (code below) with different options and measure
>>    the running time.
>>    a) Open the file without the O_DSYNC flag: 0.11s. This doesn't look
>>       like directio is in effect, because the same test on UFS took 2s.
>>       So I went on with more experiments with the O_DSYNC flag set. I
>>       know that directio and O_DSYNC are two different things, but I
>>       thought the flag would force synchronous writes and achieve what
>>       directio does (and more).
>
> Directio and O_DSYNC are two different features.
>
>>    b) Open the file with the O_DSYNC flag: 147.26s.
>
> ouch
>
>>    c) Same as b) but with zfs_nocacheflush enabled: 5.87s.
>
> Is your pool created from a single HDD?

Yes, it is. Do you have an explanation for the b) case? I also tried
O_DSYNC plus directio on UFS; the time was on the same order as directio
without O_DSYNC on UFS (see below). This dramatic difference between UFS
and ZFS is puzzling me...

UFS: directio=on, no O_DSYNC -> 2s
     directio=on, O_DSYNC    -> 5s
ZFS: no caching, no O_DSYNC  -> 0.11s
     no caching, O_DSYNC     -> 147s

>> My questions are:
>> 1. With my primarycache and secondarycache settings, the FS shouldn't
>>    buffer reads and writes anymore. Wouldn't that be equivalent to
>>    O_DSYNC? Why are a) and b) so different?
>
> No. O_DSYNC deals with when the I/O is committed to media.
>
>> 2. My understanding is that zfs_nocacheflush essentially removes the
>>    sync command sent to the device, which cancels the O_DSYNC flag. Why
>>    are b) and c) so different?
>
> No. Disabling the cache flush means that the volatile write buffer in
> the disk is not flushed. In other words, disabling the cache flush is in
> direct conflict with the semantics of O_DSYNC.
>
>> 3. Does the ZIL have anything to do with these results?
>
> Yes. The ZIL is used for meeting the O_DSYNC requirements. This has
> nothing to do with buffering. More details are in the ZFS Best Practices
> Guide.
>  -- richard
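For context, the zfs_nocacheflush used in c) is a system-wide kernel
tunable rather than a dataset property. It is usually set via /etc/system
(reboot required) or poked at runtime with mdb; it affects every pool on
the machine and is generally only safe when the write caches are
nonvolatile:

    * in /etc/system (a leading '*' marks a comment there):
    set zfs:zfs_nocacheflush = 1

    # or at runtime, as root:
    echo zfs_nocacheflush/W0t1 | mdb -kw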
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 10:26 AM, Roch wrote:
>
> On Feb 7, 2011, at 06:25, Richard Elling wrote:
>
>> On Feb 5, 2011, at 8:10 AM, Yi Zhang wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to achieve the same effect as UFS directio on ZFS, and here
>>> is what I did:
>>> [...]
>>>    b) Open the file with the O_DSYNC flag: 147.26s.
>>
>> ouch
>
> How big a file?
> Does the result hold if you don't truncate?
>
> -r

The file is 8K * 10000, about 80MB. I removed the O_TRUNC flag and the
results stayed the same...
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 1:06 PM, Brandon High wrote:
> On Mon, Feb 7, 2011 at 6:15 AM, Yi Zhang wrote:
>> On Mon, Feb 7, 2011 at 12:25 AM, Richard Elling wrote:
>>> Solaris UFS directio has three functions:
>>>   1. improved async code path
>>>   2. multiple concurrent writers
>>>   3. no buffering
>>>
>> Thanks for the comments, Richard. All I want is to achieve #3 on ZFS.
>> But as I said, apparently 2.a) below didn't give me that. Do you have
>> any suggestion?
>
> Don't. Use a ZIL, which will meet the requirements for synchronous IO.
> Set primarycache to metadata to prevent caching reads.
>
> ZFS is a very different beast than UFS and doesn't require the same
> tuning.

I already set primarycache to metadata, and I'm not concerned about caching
reads, but about caching writes. It appears writes are indeed cached,
judging from the time of 2.a) compared to UFS+directio. More specifically,
80MB/2s = 40MB/s (UFS+directio) looks realistic, while 80MB/0.11s = roughly
800MB/s (ZFS with primarycache=metadata) doesn't.
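As an aside on Brandon's ZIL suggestion: synchronous-write latency is
normally attacked by giving the pool a dedicated log (slog) device rather
than by trying to turn buffering off. A sketch, with the pool and device
names purely placeholders:

    zpool add tank log c4t1d0    # dedicated ZIL device, e.g. an SSD
    zpool status tank            # the device appears under "logs"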
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 1:51 PM, Brandon High wrote:
> On Mon, Feb 7, 2011 at 10:29 AM, Yi Zhang wrote:
>> I already set primarycache to metadata, and I'm not concerned about
>> caching reads, but about caching writes. It appears writes are indeed
>> cached, judging from the time of 2.a) compared to UFS+directio. More
>> specifically, 80MB/2s = 40MB/s (UFS+directio) looks realistic, while
>> 80MB/0.11s = roughly 800MB/s (ZFS with primarycache=metadata) doesn't.
>
> You're trying to force a solution that isn't relevant for the situation.
> ZFS is not UFS, and solutions that are required for UFS to work
> correctly are not needed with ZFS.
>
> Yes, writes are cached, but all the POSIX requirements for synchronous
> IO are met by the ZIL. As long as your storage devices, be they SAN,
> DAS or somewhere in between, respect cache flushes, you're fine. If you
> need more performance, use a slog device that respects cache flushes.
> You don't need to worry about whether writes are being cached, because
> any data that is written synchronously will be committed to stable
> storage before the write returns.
>
> -B
>
> --
> Brandon High : bh...@freaks.com

Maybe I didn't make my intention clear. UFS with directio is reasonably
close to a raw disk from my application's perspective: when the app writes
to a file location, no buffering happens. My goal is to find a way to
duplicate this on ZFS. Setting primarycache didn't eliminate the buffering,
and using O_DSYNC (whose side effects include eliminating buffering) made
it ridiculously slow: none of the things I tried eliminated buffering, and
just buffering, on ZFS.

From the discussion so far, my feeling is that ZFS is so different from UFS
that there's simply no way to achieve this goal...
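For the UFS side of the comparison, the same no-buffering behaviour can
also be obtained without calling directio(3C) in the program, by mounting
the file system with forcedirectio; the device and mount point below are
placeholders:

    mount -F ufs -o forcedirectio /dev/dsk/c0t0d0s3 /ufs
    # or, for an already mounted UFS file system:
    mount -F ufs -o remount,forcedirectio /dev/dsk/c0t0d0s3 /ufs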
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 2:21 PM, Brandon High wrote:
> On Mon, Feb 7, 2011 at 11:17 AM, Yi Zhang wrote:
>> Maybe I didn't make my intention clear. UFS with directio is reasonably
>> close to a raw disk from my application's perspective: when the app
>> writes to a file location, no buffering happens. My goal is to find a
>> way to duplicate this on ZFS.
>
> Step back and consider *why* you need no buffering.

I'm writing a database-like application which manages its own page buffer,
so I want to disable buffering at the OS/FS level. UFS with directio suits
my needs perfectly, but I also want to try it on ZFS, because ZFS doesn't
directly overwrite a page that is being modified (it allocates a new page
instead) and thus represents a different category of file system. I want
to measure the performance difference of my app on UFS and ZFS and see how
FS-dependent my app is.

>> From the discussion so far, my feeling is that ZFS is so different from
>> UFS that there's simply no way to achieve this goal...
>
> ZFS is not UFS, and solutions that are required for UFS to work
> correctly are not needed with ZFS.
>
> -B
>
> --
> Brandon High : bh...@freaks.com
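One way to sanity-check such a UFS-versus-ZFS comparison, whatever the
tuning, is to watch what actually reaches the disks while the test runs;
if the writes are being absorbed by the ARC, very little device I/O shows
up during the timed run. The pool name below is a placeholder:

    iostat -xn 1          # per-device throughput and service times
    zpool iostat tank 1   # per-pool read/write bandwidth, 1-second samples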
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 2:42 PM, Nico Williams wrote:
> On Mon, Feb 7, 2011 at 1:17 PM, Yi Zhang wrote:
>> Maybe I didn't make my intention clear. UFS with directio is reasonably
>> close to a raw disk from my application's perspective: when the app
>> writes to a file location, no buffering happens. My goal is to find a
>> way to duplicate this on ZFS.
>
> You're still mixing directio and O_DSYNC.
>
> O_DSYNC is like calling fsync(2) after every write(2). fsync(2) is
> useful for obtaining some limited transactional semantics, as well as
> for durability semantics. In ZFS you don't need to call fsync(2) to get
> those transactional semantics, but you do need to call fsync(2) to get
> those durability semantics.
>
> Now, in ZFS fsync(2) implies a synchronous I/O operation involving
> significantly more than just the data blocks you wrote to. Which means
> that O_DSYNC on ZFS is significantly slower than on UFS. You can address
> this in one of two ways: a) you might realize that you don't need every
> write(2) to be durable, and then stop using O_DSYNC, or b) you might get
> a fast ZIL device.
>
> I'm betting that if you look carefully at your application's
> requirements you'll probably conclude that you don't need O_DSYNC at
> all. Perhaps you can tell us more about your application.
>
>> Setting primarycache didn't eliminate the buffering, and using O_DSYNC
>> (whose side effects include eliminating buffering) made it ridiculously
>> slow: none of the things I tried eliminated buffering, and just
>> buffering, on ZFS.
>>
>> From the discussion so far, my feeling is that ZFS is so different from
>> UFS that there's simply no way to achieve this goal...
>
> You've not really stated your application's requirements. You may be
> convinced that you need O_DSYNC, but chances are that you don't. And
> yes, it's possible that you'd need O_DSYNC on UFS but not on ZFS.
>
> Nico
> --

Please see my previous email for a high-level description of my
application. I know that I don't really need O_DSYNC. The reason I tried
it was to get the side effect of no buffering, which is my ultimate goal.
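A minimal sketch of the equivalence Nico describes, purely for
illustration (function names are invented here and error checking is
omitted for brevity):

    /* Roughly equivalent durability semantics: O_DSYNC commits each write
     * before pwrite() returns; the second loop commits explicitly with
     * fsync() after each write. */
    #include <fcntl.h>
    #include <unistd.h>

    static void write_dsync(const char *path, const char *buf, size_t len, int n)
    {
        int fd = open(path, O_RDWR | O_CREAT | O_DSYNC, 0660);
        for (int i = 0; i < n; ++i)
            pwrite(fd, buf, len, (off_t)i * len);   /* durable on return */
        close(fd);
    }

    static void write_then_fsync(const char *path, const char *buf, size_t len, int n)
    {
        int fd = open(path, O_RDWR | O_CREAT, 0660);
        for (int i = 0; i < n; ++i) {
            pwrite(fd, buf, len, (off_t)i * len);
            fsync(fd);                              /* force it out now */
        }
        close(fd);
    }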
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 2:54 PM, Nico Williams wrote:
> On Mon, Feb 7, 2011 at 1:49 PM, Yi Zhang wrote:
>> Please see my previous email for a high-level description of my
>> application. I know that I don't really need O_DSYNC. The reason I
>> tried it was to get the side effect of no buffering, which is my
>> ultimate goal.
>
> ZFS cannot not buffer. The reason is that ZFS likes to batch
> transactions into as large a contiguous write to disk as possible. The
> ZIL exists to support fsync(2) operations that must commit before the
> rest of a ZFS transaction. In other words: there's always some amount
> of buffering of writes in ZFS.

In that case, ZFS doesn't suit my needs.

> As to read buffering, why would you want to disable that?

My application manages its own buffer, and reads/writes go through that
buffer first. I don't want double buffering.

> You still haven't told us what your application does, or why you want to
> get close to the metal. Simply telling us that you need "no buffering"
> doesn't really help us help you -- with that approach you'll simply end
> up believing that ZFS is not appropriate for your needs, even though it
> well might be.

It's like Berkeley DB at a high level, though it doesn't require
transaction support, durability, etc. I'm measuring its performance and
don't want the FS buffer to pollute my results (hence directio).

> Nico
> --
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 3:14 PM, Bill Sommerfeld wrote:
> On 02/07/11 11:49, Yi Zhang wrote:
>>
>> The reason I tried it was to get the side effect of no buffering, which
>> is my ultimate goal.
>
> ultimate = "final". you must have a goal beyond the elimination of
> buffering in the filesystem.
>
> if the writes are made durable by zfs when you need them to be durable,
> why does it matter that it may buffer data while it is doing so?
>
>                                             - Bill

If buffering is on, the running time of my app doesn't reflect the actual
I/O cost. My goal is to accurately measure the time of the I/O. With
buffering on, ZFS batches up a bunch of writes, which changes both the
original I/O activity and the measured time.
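A common workaround for exactly this measurement problem, on any file
system, is to force the dirty data out before taking the second timestamp,
so that the cost of the buffered writes is still captured in the
measurement. A rough sketch under that assumption (not the original
benchmark; error handling omitted):

    /* Time n 8K pwrites plus the final flush, so buffered writes are
     * accounted for in the elapsed time. */
    #include <sys/time.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1000000.0;
    }

    int main(int argc, char **argv)
    {
        char buf[8192] = {0};
        int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0660);
        double t1 = now();
        for (int i = 0; i < 10000; ++i)
            pwrite(fd, buf, sizeof(buf), (off_t)i * sizeof(buf));
        fsync(fd);          /* include the cost of flushing cached writes */
        double t2 = now();
        printf("%f\n", t2 - t1);
        close(fd);
        return 0;
    }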
Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS
On Mon, Feb 7, 2011 at 3:47 PM, Nico Williams wrote:
> On Mon, Feb 7, 2011 at 2:39 PM, Yi Zhang wrote:
>> On Mon, Feb 7, 2011 at 2:54 PM, Nico Williams wrote:
>>> ZFS cannot not buffer. The reason is that ZFS likes to batch
>>> transactions into as large a contiguous write to disk as possible.
>>> The ZIL exists to support fsync(2) operations that must commit before
>>> the rest of a ZFS transaction. In other words: there's always some
>>> amount of buffering of writes in ZFS.
>>
>> In that case, ZFS doesn't suit my needs.
>
> Maybe. See below.
>
>>> As to read buffering, why would you want to disable that?
>>
>> My application manages its own buffer, and reads/writes go through
>> that buffer first. I don't want double buffering.
>
> So your concern is that you don't want to pay twice the memory cost for
> buffering?
>
> If so, set primarycache as described earlier and drop the O_DSYNC flag.
>
> ZFS will then buffer your writes, but only for a little while, and you
> should want it to, because ZFS will almost certainly do a better job of
> batching transactions than your application would. With ZFS you'll
> benefit from: advanced volume management, snapshots/clones, dedup,
> Merkle hash trees (i.e., corruption detection), encryption, and so on.
> You'll almost certainly not be implementing any of those in your
> application...
>
>>> You still haven't told us what your application does, or why you want
>>> to get close to the metal. Simply telling us that you need "no
>>> buffering" doesn't really help us help you -- with that approach
>>> you'll simply end up believing that ZFS is not appropriate for your
>>> needs, even though it well might be.
>>
>> It's like Berkeley DB at a high level, though it doesn't require
>> transaction support, durability, etc. I'm measuring its performance
>> and don't want the FS buffer to pollute my results (hence directio).
>
> You're still mixing directio and O_DSYNC.
>
> You should do three things: a) set primarycache=metadata, b) set
> recordsize to whatever your application's page size is (e.g., 8KB),
> c) stop using O_DSYNC.
>
> Tell us how that goes. I suspect the performance will be much better.
>
> Nico
> --

This is actually what I did for 2.a) in my original post. My concern there
is that ZFS's internal write buffering makes it hard to get a grip on my
application's behavior. I want to present my application's "raw" I/O
performance without too many outside factors... UFS plus directio gives me
exactly (or close to) that, but ZFS doesn't... Of course, in the final
deployment it would be great to be able to take advantage of ZFS's
advanced features such as I/O optimization.
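For completeness, Nico's three-step recipe translates into something like
the following; the dataset name tank/test and the 8K page size are
illustrative, and recordsize only affects files written after the property
is set:

    zfs set primarycache=metadata tank/test
    zfs set recordsize=8k tank/test
    zfs get primarycache,recordsize tank/test
    # then run the application with the O_DSYNC flag removed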