Re: [zfs-discuss] Re: Poor performance on NFS-exported ZFS volumes
On Fri, Jul 28, 2006 at 02:02:13PM -0700, Richard Elling wrote:
> Joseph Mocker wrote:
> > Richard Elling wrote:
> >> The problem is that there are at least 3 knobs to turn (space, RAS, and
> >> performance) and they all interact with each other.
> >
> > Good point. Then how about something more like:
> >
> >   zpool bench raidz favor space disk1 ... diskN
> >   zpool bench raidz favor performance disk1 ... diskN
> >
> > That is, tell the analyzer which knob you are most interested in.
>
> I wish it was that easy. If I optimize for space, I'll always get a big,
> fat RAID-0. If I optimize for RAS, I'll get a multi-paned (N-way) mirror.
> The tool has to be able to handle the spectrum in between those extremes.

Look closer at the format of that command:

  zpool bench *RAIDZ* blah

RAID-0 isn't an option; the tool would find the best parameters for whatever layout is specified, and in this case we are constrained to RAIDZ.

-brian
Re: [zfs-discuss] 3510 JBOD ZFS vs 3510 HW RAID
Torrey,

On 7/28/06 10:11 AM, "Torrey McMahon" <[EMAIL PROTECTED]> wrote:
> That said, a 3510 with a RAID controller is going to blow the door, drive
> brackets, and skin off a JBOD in raw performance.

I'm pretty certain this is not the case. If you need sequential bandwidth, each 3510 brings only 200 MB/s per Fibre Channel attach x two attaches = 400 MB/s total. The cheap internal disks in the X4500 reach 2,000 MB/s and 2,500 random seeks/second using ZFS. General-purpose CPUs have become fast enough that, paired with good software RAID, they blow cheap RAID controller CPUs away.

- Luke
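(A rough way to sanity-check raw sequential numbers like these is to time streaming reads straight off the spindles and add them up. A minimal sketch; the device name and transfer size are placeholders, and you would run one of these per disk in parallel to estimate aggregate bandwidth:

  # read 1 GB sequentially from one disk, bypassing the filesystem
  ptime dd if=/dev/rdsk/c1t0d0s0 of=/dev/null bs=1024k count=1024
  # MB/s for that disk = 1024 / (the "real" seconds ptime reports)

Summing the per-disk figures gives a ceiling for what the HBA or FC attach must carry.)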
[zfs-discuss] Re: Flushing synchronous writes to mirrors
Jeff Bonwick wrote:
>> For a synchronous write to a pool with mirrored disks, does the write
>> unblock after just one of the disks' write caches is flushed,
>> or only after all of the disks' caches are flushed?
>
> The latter. We don't consider a write to be committed until
> the data is on stable storage at full replication.

[snip]

That makes sense, but there's a point at which ZFS must abandon this strategy; otherwise, the malfunction of one disk in a 3-way mirror could halt the entire system, when what's probably desired is for the system to keep running in degraded mode with only the 2 remaining functional disks in the mirror.

But then of course there would be the problem of divergent disks in a mirror. Suppose there's a system with one pool on a pair of mirrored disks, and the system root is on that pool. The disks are external, with interface cables running across the room. The system is running fine until my dog trips over the cable for disk #2. Down goes disk #2, and the system continues running fine, with a degraded pool, and during operation continues modifying various files. Later, the dog chews through the cable for disk #1. Down goes the system. I don't have a spare cable, so I just plug in disk #2 and restart the system. The system continues running fine, with a degraded pool, and during operation continues modifying various files.

I go to the store to buy a new cable for disk #1, and when I come back, I trip over the cable for disk #2. Down goes the system. I plug #2 back in, replace the cable for #1, and restart the system. At this point, the system comes up with its root on a pool with divergent mirrors, and... ?
Re: [zfs-discuss] Re: Flushing synchronous writes to mirrors
Andrew wrote:
> Jeff Bonwick wrote:
> >> For a synchronous write to a pool with mirrored disks, does the write
> >> unblock after just one of the disks' write caches is flushed,
> >> or only after all of the disks' caches are flushed?
> >
> > The latter. We don't consider a write to be committed until
> > the data is on stable storage at full replication.
>
> [snip]
>
> That makes sense, but there's a point at which ZFS must abandon this strategy; otherwise, the malfunction of one disk in a 3-way mirror could halt the entire system, when what's probably desired is for the system to keep running in degraded mode with only the 2 remaining functional disks in the mirror.
>
> But then of course there would be the problem of divergent disks in a mirror. Suppose there's a system with one pool on a pair of mirrored disks, and the system root is on that pool. The disks are external, with interface cables running across the room. The system is running fine until my dog trips over the cable for disk #2. Down goes disk #2, and the system continues running fine, with a degraded pool, and during operation continues modifying various files. Later, the dog chews through the cable for disk #1. Down goes the system. I don't have a spare cable, so I just plug in disk #2 and restart the system. The system continues running fine, with a degraded pool, and during operation continues modifying various files.
>
> I go to the store to buy a new cable for disk #1, and when I come back, I trip over the cable for disk #2. Down goes the system. I plug #2 back in, replace the cable for #1, and restart the system. At this point, the system comes up with its root on a pool with divergent mirrors, and... ?

Wow, you must be the unluckiest person ever! And such a strong dog.

So when the system comes back up, your uberblock and your ditto blocks will be examined, and those which have incorrect checksums will be detected and fixed.

James C. McPherson
--
Solaris Datapath Engineering
Storage Division
Sun Microsystems
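(For what it's worth, once the long-missing half of the mirror is reattached, ZFS should treat the side with the newest valid uberblock as authoritative and resilver the other. A hedged sketch of the manual steps; the pool name 'tank' and the device name are made-up examples:

  # zpool status -x            (shows which pool is DEGRADED and which vdev is faulted)
  # zpool online tank c2t0d0   (bring the returned disk back; a resilver starts automatically)
  # zpool status tank          (watch the resilver complete)
  # zpool scrub tank           (afterwards, re-verify every block's checksum)

The scrub at the end is the belt-and-braces step that confirms nothing stale survived the resilver.)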
Re: [zfs-discuss] Re: Poor performance on NFS-exported ZFS volumes
Brian Hechinger wrote:
> On Fri, Jul 28, 2006 at 02:02:13PM -0700, Richard Elling wrote:
> > Joseph Mocker wrote:
> > > Richard Elling wrote:
> > > > The problem is that there are at least 3 knobs to turn (space, RAS, and
> > > > performance) and they all interact with each other.
> > >
> > > Good point. Then how about something more like:
> > >
> > >   zpool bench raidz favor space disk1 ... diskN
> > >   zpool bench raidz favor performance disk1 ... diskN
> > >
> > > That is, tell the analyzer which knob you are most interested in.
> >
> > I wish it was that easy. If I optimize for space, I'll always get a big,
> > fat RAID-0. If I optimize for RAS, I'll get a multi-paned (N-way) mirror.
> > The tool has to be able to handle the spectrum in between those extremes.
>
> Look closer at the format of that command:
>
>   zpool bench *RAIDZ* blah
>
> RAID-0 isn't an option; the tool would find the best parameters for whatever layout is specified, and in this case we are constrained to RAIDZ.

Given N disks, "what is the best raidz configuration" is also not a trivial question to answer. The spectrum is merely narrowed to something between a single N-way raidz and (int)N/3 three-way groups plus at least one spare.

-- richard
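(To make the size of that search space concrete, here is a toy sketch, not any real zpool subcommand, that simply enumerates single-parity raidz layouts for N equal disks and their usable capacity; a real analyzer would also have to weigh resilver time, random-read IOPS, and spare policy:

  #!/bin/bash
  # Toy enumeration of single-parity raidz layouts for N equal-sized disks.
  # Usage: raidz-layouts.sh N   (defaults to 8 disks)
  N=${1:-8}
  w=3                                  # narrowest sensible raidz group
  while [ "$w" -le "$N" ]; do
      groups=$((N / w))                # how many w-wide groups fit
      spares=$((N - groups * w))       # leftover disks become hot spares
      data=$((groups * (w - 1)))       # one disk per group goes to parity
      echo "$groups x ${w}-wide raidz, $spares spare(s): $data/$N disks usable"
      w=$((w + 1))
  done

For 8 disks this prints everything from two 3-wide groups with two spares, 4/8 usable, up to one 8-wide group with no spares, 7/8 usable, which is exactly the space-versus-RAS spread being argued about above.)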
[zfs-discuss] zfs mount stuck in zil_replay
Hello ZFS,

The system was rebooted, and after the reboot the server is again stuck mounting its ZFS file systems. The system is snv_39, SPARC, T2000.

bash-3.00# ptree
7      /lib/svc/bin/svc.startd -s
  163    /sbin/sh /lib/svc/method/fs-local
    254    /usr/sbin/zfs mount -a
[...]

bash-3.00# zfs list | wc -l
      46

Using df I can see most file systems are already mounted.

> ::ps ! grep zfs
R    254    163      7      7      0  0x4a004000 0600219e1800 zfs
> 0600219e1800::walk thread | ::findstack -v
stack pointer for thread 300013026a0: 2a10069ebb1
[ 02a10069ebb1 cv_wait+0x40() ]
  02a10069ec61 txg_wait_synced+0x54(352f0d0, 2f2216d, 3000107fb90, 352f110, 352f112, 352f0c8)
  02a10069ed11 zil_replay+0xbc(60022609c08, 600226a1840, 600226a1870, 700d13d8, 7ba2d000, 60006821200)
  02a10069ee01 zfs_domount+0x1f0(7ba35000, 700d1000, 60021a8bae0, 1, 0, 0)
  02a10069eed1 zfs_mount+0x160(600225f6080, 1868800, 2a10069f9d8, 60021a8bae0, 2f, 0)
  02a10069efb1 domount+0x9b0(2a10069f8b0, 1, 600225f6080, 0, 0, 0)
  02a10069f121 mount+0x110(600013fbcf8, 2a10069fad8, 0, 0, ff38e40c, 100)
  02a10069f221 syscall_ap+0x44(2a0, ffbfeca0, 1118bf4, 600013fbc40, 15, 0)
  02a10069f2e1 syscall_trap32+0xcc(52b3c8, ffbfeca0, 100, ff38e40c, 0, 0)

# iostat -xnz 1
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    3.0  513.0  192.0 2822.1  0.0  2.1    0.0    4.0   0  95 c4t600C0FF009258F3E4C4C5601d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    7.0  598.1  388.6 1832.9  0.0  2.0    0.0    3.4   0  93 c4t600C0FF009258F3E4C4C5601d0
^C
bash-3.00#

However, for many seconds at a time no I/Os at all are issued.

# mpstat 1
[...]
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
[per-CPU rows omitted: all 32 virtual CPUs show 98-100% idle]
^C
bash-3.00#

# dtrace -n fbt:::entry'{self->t=timestamp;}' -n fbt:::return'/self->t/{@[probefunc]=sum(timestamp-self->t);self->t=0;}' -n tick-5s'{printa(@);exit(0);}'
[...]
  syscall_mstate                                             12636328
  callout_schedule_1                                         14428108
  hwblkclr                                                   16656308
  avl_rotation                                               17196252
  page_pptonum                                               19457456
  sfmmu_mlist_enter                                          20078508
  sfmmu_mlist_exit                                           20804176
  page_
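(A narrower follow-up to the whole-kernel fbt trace above would be to time just the function the stack is blocked in; a sketch in the same style, with txg_wait_synced taken straight from the ::findstack output:

  # dtrace -n 'fbt::txg_wait_synced:entry { self->ts = timestamp; }' \
           -n 'fbt::txg_wait_synced:return /self->ts/ { @["ns in txg_wait_synced"] = sum(timestamp - self->ts); self->ts = 0; }' \
           -n 'tick-5s { printa(@); exit(0); }'

If nearly all of each 5-second window is spent there, the mount really is sitting in ZIL replay waiting for transaction groups to sync rather than burning CPU.)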
[zfs-discuss] can't import zpool now that drive is in an external USB enclosure
Hi all,

I recently replaced the drive in my Ferrari 4000 with a 7200rpm drive and put the original drive in a SilverStone USB enclosure. When I plug it in, vold puts the icon on the desktop and I can see the root UFS filesystem, but I can't import the zpool that held all my user data. ;(

I found these websites by searching:

http://blogs.sun.com/roller/page/artem?entry=zfs_on_the_go
http://www.opensolaris.org/jive/thread.jspa?messageID=48510

and I tried disabling vold and adding the line to scsa2usb.conf, but format still won't display the disk unless I do a 'format -e', and I can't import the pool.

Any ideas as to what I should try next?
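(One sequence worth trying, roughly; 'mypool' is a placeholder for whatever 'zpool import' actually reports, and -f is only needed if the pool was never exported from the old install:

  # svcadm disable volfs           (keep vold from grabbing the USB device)
  # zpool import                   (scan the devices under /dev/dsk for importable pools)
  # zpool import -d /dev/dsk       (point the search at /dev/dsk explicitly)
  # zpool import -f mypool         (import by name, forcing if it was not cleanly exported)

If the bare 'zpool import' never even lists the pool, the problem is still at the device layer rather than with ZFS.)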
[zfs-discuss] Canary is now running latest code and has a 3 disk raidz ZFS volume
Hi George; life is better for us now. We upgraded to s10s_u3wos_01 last Friday on itsm-mpk-2.sfbay, the production Canary server (http://canary.sfbay). What do we look like now?

# zpool upgrade
This system is currently running ZFS version 2.
All pools are formatted using this version.

We added two more lower-performance disk drives last Friday, going from two mirrored drives to four drives. Now we look like this on our T2000:

(1) 68 GB drive running unmirrored for the system
(3) 68 GB drives set up as raidz

# zpool status
  pool: canary
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        canary      ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0

errors: No known data errors

Our 100%-busy disk drive from previous weeks is now three drives. iostat now shows that no single drive is reaching 100%. Here is "iostat -xn 1 99":

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    4.0    0.0  136.0    0.0  0.0  0.0    0.0    5.3   0   2 c1t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0  288.9    0.0  939.3  0.0  7.0    0.0   24.1   1  74 c1t1d0
    0.0  300.9    0.0  940.8  0.0  6.2    0.0   20.7   1  72 c1t2d0
    0.0  323.9    0.0  927.8  0.0  5.3    0.0   16.5   1  63 c1t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 itsm-mpk-2:vold(pid334)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0   70.9    0.0  118.8  0.0  0.5    0.0    7.6   0  28 c1t1d0
    0.0   74.9    0.0  124.3  0.0  0.5    0.0    6.1   0  26 c1t2d0
    0.0   75.8    0.0  120.3  0.0  0.5    0.0    7.2   0  27 c1t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 itsm-mpk-2:vold(pid

Here is our old box:

# more /etc/release
                        Solaris 10 6/06 s10s_u2wos_06 SPARC
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                             Assembled 30 March 2006
# pkginfo -l SUNWzfsr
   PKGINST:  SUNWzfsr
      NAME:  ZFS (Root)
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  11.10.0,REV=2006.03.22.02.15
   BASEDIR:  /
    VENDOR:  Sun Microsystems, Inc.
      DESC:  ZFS root components
    PSTAMP:  on10-patch20060322021857
  INSTDATE:  Apr 04 2006 13:52
   HOTLINE:  Please contact your local service provider
    STATUS:  completely installed
     FILES:       18 installed pathnames
                   5 shared pathnames
                   7 directories
                   4 executables
                1811 blocks used (approx)

Here is the current version:

# more /etc/release
                       Solaris 10 11/06 s10s_u3wos_01 SPARC
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                             Assembled 27 June 2006
# pkginfo -l SUNWzfsr
   PKGINST:  SUNWzfsr
      NAME:  ZFS (Root)
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  11.10.0,REV=2006.05.18.02.15
   BASEDIR:  /
    VENDOR:  Sun Microsystems, Inc.
      DESC:  ZFS root components
    PSTAMP:  on10-patch20060315140831
  INSTDATE:  Jul 27 2006 12:10
   HOTLINE:  Please contact your local service provider
    STATUS:  completely installed
     FILES:       18 installed pathnames
                   5 shared pathnames
                   7 directories
                   4 executables
                1831 blocks used (approx)

In my opinion the 2.5" disk drives in the Niagara box were not designed to receive one million files per day. These two extra drives (thanks Denis!) have given us acceptable performance. I still want a Thumper *smile*.

It is pretty amazing that we have 800 servers, 30,000 users, and 140 million lines of ASCII per day all fitting in a 2U T2000 box!

thanks
sean

George Wilson wrote:
> Sean,
> Sorry for the delay getting back to you. You can do a 'zpool upgrade' to see what version of the on-disk format your pool is currently running. The latest version is 3. You can then issue a 'zpool upgrade ' to upgrade.
> Keep in mind that the upgrade is a one-way ticket and can't be rolled backwards. ZFS can be upgraded by just applying patches. So if you were running Solaris 10 06/06 (a.k.a. u2) you could apply the patches that will come out when u3 ships. Then issue the 'zpool upgrade' command to get the functionality you need.
> Does this help? Can you send me the output of 'zpool upgrade' on your system?
> Thanks,
> George
>
> Sean Meighan wrote:
> > Hi George; we are trying to build our se
[zfs-discuss] Re: Canary is now running latest code and has a 3 disk raidz ZFS volume
Sean,

This is looking better! Once you get to the latest ZFS changes that we just putback into s10, you will be able to upgrade to ZFS version 3, which will provide such key features as hot spares, RAID-6, clone promotion, and fast snapshots. Additionally, there are more performance gains that will probably help you out.

Thanks,
George

Sean Meighan wrote:
> Hi George; life is better for us now. We upgraded to s10s_u3wos_01 last Friday on itsm-mpk-2.sfbay, the production Canary server (http://canary.sfbay). What do we look like now?
>
> # zpool upgrade
> This system is currently running ZFS version 2.
> All pools are formatted using this version.
>
> We added two more lower-performance disk drives last Friday, going from two mirrored drives to four drives. Now we look like this on our T2000:
>
> (1) 68 GB drive running unmirrored for the system
> (3) 68 GB drives set up as raidz
>
> # zpool status
>   pool: canary
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         canary      ONLINE       0     0     0
>           raidz     ONLINE       0     0     0
>             c1t1d0  ONLINE       0     0     0
>             c1t2d0  ONLINE       0     0     0
>             c1t3d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> [...]
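(Once the bits with version 3 support land on that box, the upgrade path George describes should amount to roughly the following, shown against the 'canary' pool from the status output above; note again that it is one-way:

  # zpool upgrade -v        (list the on-disk format versions this kernel understands)
  # zpool upgrade           (report which pools are still at an older version)
  # zpool upgrade canary    (move the canary pool to the newest supported version)

Until the new packages are installed, the third command simply is not available, which is why the pool still reports version 2.)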
[zfs-discuss] zfs vs. vxfs
Hi folks,

Does anyone have a comparison between ZFS and VxFS? I'm working on a presentation for my management on this.

Thanks in advance,

Malahat Qureshi, Ph.D. (MIS)
Email: [EMAIL PROTECTED]
[zfs-discuss] LSI RAID Configuration Steps on T2000
Gurus,

Can anyone share with me the steps to configure hardware RAID on a T2000 server (LSI controller) and use a hardware-mirrored root disk?

Thanks a lot!

Malahat Qureshi, Ph.D. (MIS)
Email: [EMAIL PROTECTED]
Re: [zfs-discuss] LSI RAID Configuration Steps on T2000
Malahat Qureshi wrote:
> Can anyone share with me the steps to configure hardware RAID on a T2000
> server (LSI controller) and use a hardware-mirrored root disk?

Hi Malahat,
please view and follow the documentation:

http://docs.sun.com/source/819-3249-11/erie-volume-man.html
http://docs.sun.com/app/docs/doc/819-2240/6n4htdnh7?q=raidctl&a=view

James C. McPherson
--
Solaris Datapath Engineering
Storage Division
Sun Microsystems
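(In outline, the procedure those documents walk through looks something like the following; the disk names are the usual T2000 internal targets and should be checked against your own raidctl output first, and creating the volume destroys whatever is on the secondary disk:

  # raidctl                     (show the LSI controller's disks and any existing volumes)
  # raidctl -c c0t0d0 c0t1d0    (create a hardware mirror: c0t1d0 becomes the mirror of c0t0d0)
  # raidctl                     (the new volume appears and resynchronizes in the background)
  # format                      (re-label the new volume before installing or restoring to it)

The re-label step is easy to forget; the new RAID volume presents as a single disk that Solaris has not seen before.)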
Re: [zfs-discuss] 6424554
Robert,

We are looking to try to get patches out by late September, which will include this and many other fixes. I'll post all the changes in another thread.

Thanks,
George

Robert Milkowski wrote:
> Hello Fred,
>
> Friday, July 28, 2006, 12:37:22 AM, you wrote:
>
> FZ> Hi Robert,
> FZ> The fix for 6424554 is being backported to S10 and will be available in
> FZ> S10U3, later this year.
>
> I know that already - I was rather asking whether a patch containing the fix will be available BEFORE U3, and if so, when?
>
> IMHO many more people are evaluating ZFS now that it's in a stable Solaris release. Any performance fixes to ZFS should be available as soon as possible, 'coz that's one of the things people are looking at, and once they are disappointed it will take a long time for them to try again.
>
> Anyway, I will try to get an IDR via support channels.