Re: [zfs-discuss] panic with zfs
On 24.1.2007 15:49, Michael Schuster wrote:
>> I am going to create the same conditions here but with snv_55b and then
>> yank a disk from my zpool. If I get a similar response then I will *hope*
>> for a crash dump.
>>
>> You must be kidding about the "open a case" however. This is OpenSolaris.
>
> no, I'm not. That's why I said "If you have a supported version of
> Solaris". Also, Ihsan seems to disagree about OpenSolaris:

I opened a case this morning. Let's see what the support guys say.

Ihsan

--
[EMAIL PROTECTED] http://ihsan.dogan.ch/ http://gallery.dogan.ch/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?
Anantha N. Srirama writes:
> Agreed, I guess I didn't articulate my point/thought very well. The
> best config is to present JBoDs and let ZFS provide the data
> protection. This has been a very stimulating conversation thread; it
> is shedding new light into how to best use ZFS.

I would say: to enable the unique ZFS self-healing feature, ZFS must be
allowed to manage a level of redundancy: mirroring or RAID-Z. The type of
LUNs used (JBOD/RAID-*/iSCSI) is not relevant to this statement. Now, if one
also relies on ZFS to reconstruct data in the face of disk failures (as
opposed to storage-based reconstruction), better make sure that single or
double disk failures do not bring down multiple LUNs at once. So better
protection is achieved by configuring LUNs that map to segregated sets of
physical things (disks & controllers).

-r

> This message posted from opensolaris.org
> ___ zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
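A rough sketch of what "LUNs that map to segregated sets of physical things"
can look like in practice; the device names are placeholders standing in for
LUNs presented by two separate arrays/controller paths:

    # Keep ZFS-level redundancy, and split each mirror across LUNs from
    # two different arrays/controllers, so the loss of one array or
    # controller cannot take out both sides of a mirror.
    zpool create tank \
        mirror c2t0d0 c3t0d0 \
        mirror c2t1d0 c3t1d0
    zpool status tank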
[zfs-discuss] ftruncate is failing on ZFS
Hi All,

In my test setup, I have one zpool of size 1000 MB with only 30 MB of free
space (970 MB is used for some other purpose). On this zpool I created a
file (using the open() call) and attempted to write 2 MB of data to it
(with the write() call), but the write failed: it wrote only 1.3 MB (the
return value of write()) because of "No space left on device". After that I
tried to truncate the file down to 1.3 MB with ftruncate(), but that is
failing too. Any clues on this?

-Masthan

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: can I use zfs on just a partition?
On Sun, Jan 28, 2007 at 01:53:04PM +0100, [EMAIL PROTECTED] wrote:
>
> >is this tuneable somehow/somewhere? can i enable writecache if only using
> >a dedicated partition ?
>
> It does put the additional data at somewhat of a risk; not really
> for swap but perhaps not nice for UFS.

How about two partitions used in two different zpools? Once ZFS boot comes
along, I'm sure RAIDZ(2) won't be supported as a boot device. If that's the
case, I wouldn't mind splitting the disks into a mirrored OS portion and a
RAIDZ data portion (think of a system with 3 or 4 disks).

-brian

--
"The reason I don't use Gnome: every single other window manager I know of
is very powerfully extensible, where you can switch actions to different
mouse buttons. Guess which one is not, because it might confuse the poor
users? Here's a hint: it's not the small and fast one." --Linus

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
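A sketch of the split Brian describes, with placeholder disk and slice names;
note that when ZFS is handed slices rather than whole disks it leaves the
drive write cache setting alone:

    # OS on slice 0 of the first two disks (an SVM/UFS mirror today, a
    # ZFS mirror once ZFS boot arrives); data pool on slice 7 of all
    # three disks.  All device names are hypothetical.
    zpool create data raidz c0t1d0s7 c0t2d0s7 c0t3d0s7
    zpool status data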
Re: [zfs-discuss] panic with zfs
Ihsan,

If you are running Solaris 10 then you are probably hitting:

  6456939 sd_send_scsi_SYNCHRONIZE_CACHE_biodone() can issue TUR which
          calls biowait() and deadlock/hangs host

This was fixed in OpenSolaris (build 48) but a patch is not yet available
for Solaris 10.

Thanks, George

Ihsan Dogan wrote:
> On 24.1.2007 15:49, Michael Schuster wrote:
>>> I am going to create the same conditions here but with snv_55b and then
>>> yank a disk from my zpool. If I get a similar response then I will
>>> *hope* for a crash dump.
>>>
>>> You must be kidding about the "open a case" however. This is OpenSolaris.
>>
>> no, I'm not. That's why I said "If you have a supported version of
>> Solaris". Also, Ihsan seems to disagree about OpenSolaris:
>
> I opened a case this morning. Let's see what the support guys say.
>
> Ihsan

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
> > Our Netapp does double-parity RAID. In fact, the filesystem design is
> > remarkably similar to that of ZFS. Wouldn't that also detect the
> > error? I suppose it depends if the `wrong sector without notice'
> > error is repeated each time. Or is it random?
>
> On most (all?) other systems the parity only comes into effect when a
> drive fails. When all the drives are reporting "OK" most (all?) RAID
> systems don't use the parity data at all. ZFS is the first (only?)
> system that actively checks the data returned from disk, regardless
> of whether the drives are reporting they're okay or not.
>
> I'm sure I'll be corrected if I'm wrong. :)

Netapp/OnTAP does do read verification, but it does it outside the
raid-4/raid-dp protection (just like ZFS does it outside the raidz
protection). So it's correct that the parity data is not read at all in
either OnTAP or ZFS, but both attempt to do verification of the data on all
reads.

See also: http://blogs.sun.com/bonwick/entry/zfs_end_to_end_data for a few
more specifics on it and the differences from the ZFS data check.

--
Darren Dunham   [EMAIL PROTECTED]
Senior Technical Consultant   TAOS   http://www.taos.com/
Got some Dr Pepper?   San Francisco, CA bay area
< This line left intentionally blank to confuse you. >

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Adding my own compression to zfs
Have a look at: http://blogs.sun.com/ahl/entry/a_little_zfs_hack On 27/01/07, roland <[EMAIL PROTECTED]> wrote: is it planned to add some other compression algorithm to zfs ? lzjb is quite good and especially performing very well, but i`d like to have better compression (bzip2?) - no matter how worse performance drops with this. regards roland This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Rasputin :: Jack of All Trades - Master of Nuns http://number9.hellooperator.net/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] data wanted: disk kstats
Robert Milkowski wrote: Hello Richard, Friday, January 26, 2007, 11:36:07 PM, you wrote: RE> We've been talking a lot recently about failure rates and types of RE> failures. As you may know, I do look at field data and generally don't RE> ask the group for more data. But this time, for various reasons (I RE> might have found a bug or deficiency) I'm soliciting for more data at RE> large. RE> What I'd like to gather is the error rates per bytes transferred. This RE> data is collected in kstats, but is reset when you reboot. One of the RE> features of my vast collection of field data is that it is often collected RE> rather soon after a reboot. Thus, there aren't very many bytes transferred RE> yet, and the corresponding error rates tend to be small (often 0). A perfect RE> collection would be from a machine connected to lots of busy disks which RE> has been up for a very long time. RE> Can you help? It is real simple. Just email me the output of: I've sent you off list. Thanks. Will those results (total statistics, not site specific) be publicly provided by you (here?)? Sure. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Adding my own compression to zfs
See the following bug: http://bugs.opensolaris.org/view_bug.do?bug_id=6280662 Cindy roland wrote: is it planned to add some other compression algorithm to zfs ? lzjb is quite good and especially performing very well, but i`d like to have better compression (bzip2?) - no matter how worse performance drops with this. regards roland This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS or UFS - what to do?
On Jan 26, 2007, at 09:16, Jeffery Malloch wrote: Hi Folks, I am currently in the midst of setting up a completely new file server using a pretty well loaded Sun T2000 (8x1GHz, 16GB RAM) connected to an Engenio 6994 product (I work for LSI Logic so Engenio is a no brainer). I have configured a couple of zpools from Volume groups on the Engenio box - 1x2.5TB and 1x3.75TB. I then created sub zfs systems below that and set quotas and sharenfs'd them so that it appears that these "file systems" are dynamically shrinkable and growable. ah - the 6994 is the controller we use in the 6140/6540 if i'm not mistaken .. i guess this thread will go down in a flaming JBOD vs RAID controller religious war again .. oops, too late :P yes - the dynamic LUN expansion bits in ZFS is quite nice and handy for managing dynamic growth of a pool or file system. so going back to Jeffery's original questions: 1. How stable is ZFS? The Engenio box is completely configured for RAID5 with hot spares and write cache (8GB) has battery backup so I'm not too concerned from a hardware side. I'm looking for an idea of how stable ZFS itself is in terms of corruptability, uptime and OS stability. I think the stability issue has already been answered pretty well .. 8GB battery backed cache is nice .. performance wise you might find some odd interactions with the ZFS adaptive cache integration and the way in which the intent log operates (O_DSYNC writes can potentially impose a lot of in flight commands for relatively little work) - there's a max blocksize of 128KB (also maxphys), so you might want to experiment with tuning back the stripe width .. i seem to recall the the 6994 controller seemed to perform best with 256KB or 512KB stripe width .. so there may be additional tuning on the read-ahead or write- behind algorithms. 2. Recommended config. Above, I have a fairly simple setup. In many of the examples the granularity is home directory level and when you have many many users that could get to be a bit of a nightmare administratively. I am really only looking for high level dynamic size adjustability and am not interested in its built in RAID features. But given that, any real world recommendations? Not being interested in the RAID functionality as Roch points out eliminates the self-healing functionality and reconstruction bits in ZFS .. but you still get other nice benefits like dynamic LUN expansion As i see it, since we seem to have excess CPU and bus capacity on newer systems (most applications haven't quite caught up to impose enough of a load yet) .. we're back to the mid '90s where host based volume management and caching makes sense and is being proposed again. Being proactive, we might want to consider putting an embedded Solaris/ZFS on a RAID controller to see if we've really got something novel in the caching and RAID algorithms for when the application load really does catch up and impose more of a load on the host. Additionally - we're seeing that there's a big benefit in moving the filesystem closer to the storage array since most users care more about their consistency of their data (upper level) than the reliability of the disk subsystem or RAID controller. Implementing a RAID controller that's more intimately aware of the upper data levels seems like the next logical evolutionary step. 3. Caveats? Anything I'm missing that isn't in the docs that could turn into a BIG gotchya? I would say be careful of the ease at which you can destroy file systems and pools .. 
while convenient - there's typically no warning if you or an administrator does a zfs or zpool destroy .. so i could see that turning into an issue. Also if a LUN goes offline, you may not see this right away and you would have the potential to corrupt your pool or panic your system. Hence the self-healing and scrub options to detect and repair failure a little bit faster. People on this forum have been finding RAID controller inconsistencies .. hence the religious JBOD vs RAID ctlr "disruptive paradigm shift" 4. Since all data access is via NFS we are concerned that 32 bit systems (Mainly Linux and Windows via Samba) will not be able to access all the data areas of a 2TB+ zpool even if the zfs quota on a particular share is less then that. Can anyone comment? Doing 2TB+ shouldn't be a problem for the NFS or Samba mounted filesystem regardless if the host is 32bit or not. The only place where you can run into a problem is if the size of an individual file crosses 2 or 4TB on a 32bit system. I know we've implemented file systems (QFS in this case) that were samba shared to 32bit windows hosts in excess of 40-100TB without any major issues. I'm sure there's similar cases with ZFS and thumper .. i just don't have that data. a little late to the discussion, but hth --- .je ___
[zfs-discuss] Re: ZFS or UFS - what to do?
Hi Guys,

SO...

From what I can tell from this thread, ZFS is VERY fussy about managing
writes, reads and failures. It wants to be bit perfect. So if you use the
hardware that comes with a given solution (in my case an Engenio 6994) to
manage failures, you risk a) bad writes that don't get picked up due to
corruption from write cache to disk, and b) failures due to data changes
that ZFS is unaware of, which the hardware imposes when it tries to fix
itself.

So now I have a $70K+ lump that's useless for what it was designed for. I
should have spent $20K on a JBOD. But since I didn't do that, it sounds
like a traditional model works best (ie. UFS et al) for the type of
hardware I have. No sense paying for something and not using it. And by
using ZFS just as a method for ease of file system growth and management I
risk much more corruption.

The other thing I haven't heard is why NOT to use ZFS. Or people who don't
like it for some reason or another. Comments?

Thanks,

Jeff

PS - the responses so far have been great and are much appreciated! Keep
'em coming...

This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
Hi Jeff,

Maybe I mis-read this thread, but I don't think anyone was saying that
using ZFS on top of an intelligent array risks more corruption. Given my
experience, I wouldn't run ZFS without some level of redundancy, since it
will panic your kernel in a RAID-0 scenario where it detects a LUN is
missing and can't fix it.

That being said, I wouldn't run anything but ZFS anymore. When we had some
database corruption issues awhile back, ZFS made it very simple to prove it
was the DB. Just did a scrub and boom, verification that the data was laid
down correctly.

RAID-5 will have better random read performance than RAID-Z for reasons
Robert had to beat into my head. ;-) But if you really need that
performance, perhaps RAID-10 is what you should be looking at? Someone
smarter than I can probably give a better idea.

Regarding the failure detection, has anyone on the list had the ZFS/FMA
traps fed into a network management app yet? I'm curious what the
experience with it is.

Best Regards,
Jason

On 1/29/07, Jeffery Malloch <[EMAIL PROTECTED]> wrote:
> Hi Guys,
>
> SO...
>
> From what I can tell from this thread ZFS is VERY fussy about managing
> writes, reads and failures. It wants to be bit perfect. So if you use the
> hardware that comes with a given solution (in my case an Engenio 6994) to
> manage failures you risk a) bad writes that don't get picked up due to
> corruption from write cache to disk b) failures due to data changes that
> ZFS is unaware of that the hardware imposes when it tries to fix itself.
>
> So now I have a $70K+ lump that's useless for what it was designed for. I
> should have spent $20K on a JBOD. But since I didn't do that, it sounds
> like a traditional model works best (ie. UFS et al) for the type of
> hardware I have. No sense paying for something and not using it. And by
> using ZFS just as a method for ease of file system growth and management
> I risk much more corruption.
>
> The other thing I haven't heard is why NOT to use ZFS. Or people who
> don't like it for some reason or another. Comments?
>
> Thanks,
> Jeff
>
> PS - the responses so far have been great and are much appreciated! Keep
> 'em coming...
>
> This message posted from opensolaris.org
> ___ zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
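A quick sketch of the verification Jason describes (the pool name is a
placeholder); any checksum problems found during the scrub show up in the
CKSUM column and, with -v, as a list of affected files:

    # Scrub the pool, then inspect the result.
    zpool scrub tank
    zpool status -v tank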
Re: [zfs-discuss] Project Proposal: Availability Suite
Thank you for the detailed explanation. It is very helpful to understand the issue. Is anyone successfully using SNDR with ZFS yet? Best Regards, Jason On 1/26/07, Jim Dunham <[EMAIL PROTECTED]> wrote: Jason J. W. Williams wrote: > Could the replication engine eventually be integrated more tightly > with ZFS? Not it in the present form. The architecture and implementation of Availability Suite is driven off block-based replication at the device level (/dev/rdsk/...), something that allows the product to replicate any Solaris file system, database, etc., without any knowledge of what it is actually replicating. To pursue ZFS replication in the manner of Availability Suite, one needs to see what replication looks like from an abstract point of view. So simplistically, remote replication is like the letter 'h', where the left side of the letter is the complete I/O path on the primary node, the horizontal part of the letter is the remote replication network link, and the right side of the letter is only the bottom half of the complete I/O path on the secondary node. Next ZFS would have to have its functional I/O path split into two halves, a top and bottom piece. Next we configure replication, the letter 'h', between two given nodes, running both a top and bottom piece of ZFS on the source node, and just the bottom half of ZFS on the secondary node. Today, the SNDR component of Availability Suite works like the letter 'h' today, where we split the Solaris I/O stack into a top and bottom half. The top half is that software (file system, database or application I/O) that directs its I/Os to the bottom half (raw device, volume manager or block device). So all that needs to be done is to design and build a new variant of the letter 'h', and find the place to separate ZFS into two pieces. - Jim Dunham > > That would be slick alternative to send/recv. > > Best Regards, > Jason > > On 1/26/07, Jim Dunham <[EMAIL PROTECTED]> wrote: >> Project Overview: >> >> I propose the creation of a project on opensolaris.org, to bring to >> the community two Solaris host-based data services; namely volume >> snapshot and volume replication. These two data services exist today >> as the Sun StorageTek Availability Suite, a Solaris 8, 9 & 10, >> unbundled product set, consisting of Instant Image (II) and Network >> Data Replicator (SNDR). >> >> Project Description: >> >> Although Availability Suite is typically known as just two data >> services (II & SNDR), there is an underlying Solaris I/O filter >> driver framework which supports these two data services. This >> framework provides the means to stack one or more block-based, pseudo >> device drivers on to any pre-provisioned cb_ops structure, [ >> http://www.opensolaris.org/os/article/2005-03-31_inside_opensolaris__solaris_driver_programming/#datastructs >> ], thereby shunting all cb_ops I/O into the top of a developed filter >> driver, (for driver specific processing), then out the bottom of this >> filter driver, back into the original cb_ops entry points. >> >> Availability Suite was developed to interpose itself on the I/O stack >> of a block device, providing a filter driver framework with the means >> to intercept any I/O originating from an upstream file system, >> database or application layer I/O. 
This framework provided the means >> for Availability Suite to support snapshot and remote replication >> data services for UFS, QFS, VxFS, and more recently the ZFS file >> system, plus various databases like Oracle, Sybase and PostgreSQL, >> and also application I/Os. By providing a filter driver at this point >> in the Solaris I/O stack, it allows for any number of data services >> to be implemented, without regard to the underlying block storage >> that they will be configured on. Today, as a snapshot and/or >> replication solution, the framework allows both the source and >> destination block storage device to not only differ in physical >> characteristics (DAS, Fibre Channel, iSCSI, etc.), but also logical >> characteristics such as in RAID type, volume managed storage (i.e., >> SVM, VxVM), lofi, zvols, even ram disks. >> >> Community Involvement: >> >> By providing this filter-driver framework, two working filter drivers >> (II & SNDR), and an extensive collection of supporting software and >> utilities, it is envisioned that those individuals and companies that >> adopt OpenSolaris as a viable storage platform, will also utilize and >> enhance the existing II & SNDR data services, plus have offered to >> them the means in which to develop their own block-based filter >> driver(s), further enhancing the use and adoption on OpenSolaris. >> >> A very timely example that is very applicable to Availability Suite >> and the OpenSolaris community, is the recent announcement of the >> Project Proposal: lofi [ compression & encryption ] - >> http://www.opensolaris.org/jive/click.jspa&messageID=26841. By >> leveraging both the Availability Suite and the lofi OpenSolar
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
On Jan 29, 2007, at 14:17, Jeffery Malloch wrote:
> Hi Guys,
>
> SO...
>
> From what I can tell from this thread, ZFS is VERY fussy about managing
> writes, reads and failures. It wants to be bit perfect. So if you use the
> hardware that comes with a given solution (in my case an Engenio 6994) to
> manage failures you risk a) bad writes that don't get picked up due to
> corruption from write cache to disk b) failures due to data changes that
> ZFS is unaware of that the hardware imposes when it tries to fix itself.
>
> So now I have a $70K+ lump that's useless for what it was designed for. I
> should have spent $20K on a JBOD. But since I didn't do that, it sounds
> like a traditional model works best (ie. UFS et al) for the type of
> hardware I have. No sense paying for something and not using it. And by
> using ZFS just as a method for ease of file system growth and management
> I risk much more corruption.
>
> The other thing I haven't heard is why NOT to use ZFS. Or people who
> don't like it for some reason or another. Comments?

I put together this chart a while back .. i should probably update it for
RAID6 and RAIDZ2

 #   ZFS   ARRAY HW      CAPACITY   COMMENTS
--   ---   --------      --------   --------
 1   R0    R1            N/2        hw mirror - no zfs healing
 2   R0    R5            N-1        hw R5 - no zfs healing
 3   R1    2 x R0        N/2        flexible, redundant, good perf
 4   R1    2 x R5        (N/2)-1    flexible, more redundant, decent perf
 5   R1    1 x R5        (N-1)/2    parity and mirror on same drives (XXX)
 6   RZ    R0            N-1        standard RAID-Z no mirroring
 7   RZ    R1 (tray)     (N/2)-1    RAIDZ+1
 8   RZ    R1 (drives)   (N/2)-1    RAID1+Z (highest redundancy)
 9   RZ    3 x R5        N-4        triple parity calculations (XXX)
10   RZ    1 x R5        N-2        double parity calculations (XXX)

(note: I included the cases where you have multiple arrays with a single
lun per vdisk (say) and where you only have a single array split into
multiple LUNs.)

The way I see it, you're better off picking either controller parity or
zfs parity .. there's no sense in computing parity multiple times unless
you have cycles to spare and don't mind the performance hit .. so the
questions you should really answer before you choose the hardware are:
what level of redundancy-to-capacity balance do you want? and whether or
not you want to compute RAID in ZFS host memory or out on a dedicated
blackbox controller?

I would say something about double caching too, but I think that's moot
since you'll always cache in the ARC if you use ZFS the way it's currently
written.

Other feasible filesystem options for Solaris - UFS, QFS, or vxfs with SVM
or VxVM for volume mgmt if you're so inclined .. all depends on your budget
and application. There's currently tradeoffs in each one, and contrary to
some opinions, the death of any of these has been grossly exaggerated.

--- .je

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS panics system during boot, after 11/06 upgrade
Hi,

I'm looking for assistance troubleshooting an x86 laptop that I upgraded
from Solaris 10 6/06 to 11/06 using standard upgrade. The upgrade went
smoothly, but all attempts to boot it since then have failed. Every time,
it panics, leaving a partial stack trace on the screen for a few seconds.
The stack trace includes:

  zfs:vdev_mirror_open+44
  zpool:vdev_open+b2
  zpool:vdev_vertex_load+b4
  zpool:vdev_graph_traverse+25
  zpool:vdev_graph_load+2d
  zpool:spa_load+57
  zpool:spa_open+10f
  zpool:spa_directory_next_pool+ca
  zpool:spa_open_all_pools+4f
  zpool:spa_name_lock+19
  zpool:spa_open+7c
  zpool:spa_directory_next_pool+ca
  zpool:dmu_objset_find+1f4
  zvol:zvol_attach+7b
  genunix:devi_attach+8f
  genunix:attach_node+71
  genunix:i_ndi_config_node+ab
  genunix:i_ddi_attachchild+41
  genunix:devi_attach_node+71
  genunix:config_immediate_children+d7
  genunix:devi_config_common+66
  genunix:mt_config_thread+11a
  unix:thread_start+8

I can boot the system into Solaris failsafe mode and mount the root file
system. There are ZFS file systems. There are no zones.

Any help would be greatly appreciated, this is my everyday computer.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: ZFS panics system during boot, after 11/06 upgrade
> There are ZFS file systems. There are no zones. > > Any help would be greatly appreciated, this is my > everyday computer. Take a look at page 167 of the admin guide: http://opensolaris.org/os/community/zfs/docs/zfsadmin.pdf You need to delete /etc/zfs/zpool.cache. And, use zpool import to recover. Cheers, Jim This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
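A sketch of the recovery sequence Jim describes, assuming the root file
system of the installed OS is mounted on /a from failsafe mode (the mount
point and pool name are placeholders):

    # From failsafe mode, remove the stale pool cache and reboot.
    rm /a/etc/zfs/zpool.cache
    reboot
    # After the normal boot, list importable pools and re-import them.
    zpool import
    zpool import <poolname>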
Re: [zfs-discuss] Re: ZFS panics system during boot, after 11/06 upgrade
Jim Walker wrote: There are ZFS file systems. There are no zones. Any help would be greatly appreciated, this is my everyday computer. Take a look at page 167 of the admin guide: http://opensolaris.org/os/community/zfs/docs/zfsadmin.pdf You need to delete /etc/zfs/zpool.cache. And, use zpool import to recover. Cheers, Jim Thanks. In Solaris Failsafe, I have mounted the root of the real OS instance onto /a. There is a /a/etc/zfs directory, but it is empty. No zpool.cache. The doc doesn't describe a solution for that situation. Any other ideas? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
On Mon, Jan 29, 2007 at 11:17:05AM -0800, Jeffery Malloch wrote: > From what I can tell from this thread ZFS if VERY fussy about > managing writes,reads and failures. It wants to be bit perfect. So > if you use the hardware that comes with a given solution (in my case > an Engenio 6994) to manage failures you risk a) bad writes that > don't get picked up due to corruption from write cache to disk b) > failures due to data changes that ZFS is unaware of that the > hardware imposes when it tries to fix itself. > > So now I have a $70K+ lump that's useless for what it was designed > for. I should have spent $20K on a JBOD. But since I didn't do > that, it sounds like a traditional model works best (ie. UFS et al) > for the type of hardware I have. No sense paying for something and > not using it. And by using ZFS just as a method for ease of file > system growth and management I risk much more corruption. Well, ZFS with HW RAID makes sense in some cases. However, it seems that if you are unwilling to lose 50% disk space to RAID 10 or two mirrored HW RAID arrays, you either use RAID 0 on the array with ZFS RAIDZ/RAIDZ2 on top of that or a JBOD with ZFS RAIDZ/RAIDZ2 on top of that. -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
On January 29, 2007 11:17:05 AM -0800 Jeffery Malloch <[EMAIL PROTECTED]> wrote: Hi Guys, SO... From what I can tell from this thread ZFS if VERY fussy about managing writes,reads and failures. It wants to be bit perfect. It's funny to call that "fussy". All filesystems WANT to be bit perfect, zfs actually does something to ensure it. So if you use the hardware that comes with a given solution (in my case an Engenio 6994) to manage failures you risk a) bad writes that don't get picked up due to corruption from write cache to disk You would always have that problem, JBOD or RAID. There are many places data can get corrupted, not just in the RAID write cache. zfs will correct it, or at least detect it depending on your configuration. b) failures due to data changes that ZFS is unaware of that the hardware imposes when it tries to fix itself. If that happens, you will be lucky to have ZFS to fix it. If the array changes data, it is broken. This is not the same thing as correcting data. The other thing I haven't heard is why NOT to use ZFS. Or people who don't like it for some reason or another. If you need per-user quotas, zfs might not be a good fit. (In many cases per-filesystem quotas can be used effectively though.) If you need NFS clients to traverse mount points on the server (eg /home/foo), then this won't work yet. Then again, does this work with UFS either? Seems to me it wouldn't. The difference is that zfs encourages you to create more filesystems. But you don't have to. If you have an application that is very highly tuned for a specific filesystem (e.g. UFS with directio), you might not want to replace it with zfs. If you need incremental restore, you might need to stick with UFS. (snapshots might be enough for you though) -frank ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
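For the per-filesystem quota approach Frank mentions, a minimal sketch
(the pool and user names are hypothetical):

    # One file system per user, each with its own quota, instead of
    # per-user quotas inside a single file system.
    zfs create tank/home/foo
    zfs set quota=10g tank/home/foo
    zfs get quota tank/home/foo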
Re: [zfs-discuss] Re: ZFS panics system during boot, after 11/06 upgrade
More diagnostic information: Before the afore-listed stack dump, the console displays many lines of text similar to the following, that scroll by very quickly. I was only able to capture them with the help of a digital camera. WARNING: kstat_create('unix', 0, zio_buf_#'): namespace_collision Jeff Victor wrote: Jim Walker wrote: There are ZFS file systems. There are no zones. Any help would be greatly appreciated, this is my everyday computer. Take a look at page 167 of the admin guide: http://opensolaris.org/os/community/zfs/docs/zfsadmin.pdf You need to delete /etc/zfs/zpool.cache. And, use zpool import to recover. Cheers, Jim Thanks. In Solaris Failsafe, I have mounted the root of the real OS instance onto /a. There is a /a/etc/zfs directory, but it is empty. No zpool.cache. The doc doesn't describe a solution for that situation. Any other ideas? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] dumpadm and using dumpfile on zfs?
Hi All,

I'd like to set up dumping to a file. This file is on a mirrored pool using
zfs. It seems that the dump setup doesn't work with zfs. This worked for
both a standard UFS slice and a SVM mirror using zfs. Is there something
that I'm doing wrong, or is this not yet supported on ZFS? Note this is
Solaris 10 Update 3, but I don't think that should matter..

thanks,
peter

Using ZFS:

  HON hcb116 ~ $ mkfile -n 1g /var/adm/crash/dump-file
  HON hcb116 ~ $ dumpadm -d /var/adm/crash/dump-file
  dumpadm: dumps not supported on /var/adm/crash/dump-file

Using UFS:

  HON hcb115 ~ $ mkfile -n 1g /data/0/test
  HON hcb115 ~ $ dumpadm -d /data/0/test
        Dump content: kernel pages
         Dump device: /data/0/test (dedicated)
  Savecore directory: /var/crash/stuff
    Savecore enabled: yes
  HON hcb115 ~ $

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dumpadm and using dumpfile on zfs?
Dumping to a file in a zfs file system is not supported yet. The zfs file system does not support the VOP_DUMP and VOP_DUMPCTL operations. This is bug 5008936 (ZFS and/or zvol should support dumps). Lori Peter Buckingham wrote: Hi All, I'd like to set up dumping to a file. This file is on a mirrored pool using zfs. It seems that the dump setup doesn't work with zfs. This worked for both a standard UFS slice and a SVM mirror using zfs. Is there something that I'm doing wrong, or is this not yet supported on ZFS? Note this is Solaris 10 Update 3, but I don't think that should matter.. thanks, peter Using ZFS HON hcb116 ~ $ mkfile -n 1g /var/adm/crash/dump-file HON hcb116 ~ $ dumpadm -d /var/adm/crash/dump-file dumpadm: dumps not supported on /var/adm/crash/dump-file Using UFS HON hcb115 ~ $ mkfile -n 1g /data/0/test HON hcb115 ~ $ dumpadm -d /data/0/test Dump content: kernel pages Dump device: /data/0/test (dedicated) Savecore directory: /var/crash/stuff Savecore enabled: yes HON hcb115 ~ $ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dumpadm and using dumpfile on zfs?
Hi Peter, This operation isn't supported yet. See this bug: http://bugs.opensolaris.org/view_bug.do?bug_id=5008936 Both the zfs man page and the ZFS Admin Guide identify swap and dump limitations, here: http://docs.sun.com/app/docs/doc/817-2271/6mhupg6gl?q=dump&a=view Cindy Peter Buckingham wrote: Hi All, I'd like to set up dumping to a file. This file is on a mirrored pool using zfs. It seems that the dump setup doesn't work with zfs. This worked for both a standard UFS slice and a SVM mirror using zfs. Is there something that I'm doing wrong, or is this not yet supported on ZFS? Note this is Solaris 10 Update 3, but I don't think that should matter.. thanks, peter Using ZFS HON hcb116 ~ $ mkfile -n 1g /var/adm/crash/dump-file HON hcb116 ~ $ dumpadm -d /var/adm/crash/dump-file dumpadm: dumps not supported on /var/adm/crash/dump-file Using UFS HON hcb115 ~ $ mkfile -n 1g /data/0/test HON hcb115 ~ $ dumpadm -d /data/0/test Dump content: kernel pages Dump device: /data/0/test (dedicated) Savecore directory: /var/crash/stuff Savecore enabled: yes HON hcb115 ~ $ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
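Until that bug is addressed, one workaround sketch is to keep the dump
device on a dedicated raw slice (or an SVM volume) rather than a file on
ZFS; the slice name below is a placeholder:

    # Point the dump device at a dedicated raw slice, then verify.
    dumpadm -d /dev/dsk/c0t0d0s1
    dumpadm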
Re: [zfs-discuss] dumpadm and using dumpfile on zfs?
Lori Alt wrote: Dumping to a file in a zfs file system is not supported yet. The zfs file system does not support the VOP_DUMP and VOP_DUMPCTL operations. This is bug 5008936 (ZFS and/or zvol should support dumps). Ok, that's sort of what I expected thanks for the info. peter ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Problems adding drive
I attempted to increase my zraid from 2 disks to 3, but it looks like I
added the drive outside of the raid:

  # zpool list
  NAME     SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
  amber   1.36T    879G    516G    63%  ONLINE  -
  home    65.5G   1.30M   65.5G     0%  ONLINE  -
  [EMAIL PROTECTED]:/export/home/michael#
  [EMAIL PROTECTED]:/export/home/michael# zpool status
    pool: amber
   state: ONLINE
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          amber       ONLINE       0     0     0
            raidz1    ONLINE       0     0     0
              c1d0    ONLINE       0     0     0
              c0d0    ONLINE       0     0     0
            c4d0      ONLINE       0     0     0

  errors: No known data errors

I can't even seem to get rid of c4d0, I have not written anything to
"amber" since adding c4d0. Any suggestions on how to remove it and re add
it correctly?

Sincerely,
Michael

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: Adding my own compression to zfs
> Have a look at: > > http://blogs.sun.com/ahl/entry/a_little_zfs_hack thanks for the link, dick ! this sounds fantastic ! is the source for that (yet) available somewhere ? >Adam Leventhal's Weblog >inside the sausage factory btw - just wondering - is this some english phrase or some running gag ? i have seen it once ago on another blog and so i`m wondering greetings from the beer and sausage nation ;) roland This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Problems adding drive
[EMAIL PROTECTED] wrote on 01/29/2007 03:45:58 PM:

> I attempted to increase my zraid from 2 disks to 3, but it looks
> like I added the drive outside of the raid:
>
> # zpool list
> NAME     SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
> amber   1.36T    879G    516G    63%  ONLINE  -
> home    65.5G   1.30M   65.5G     0%  ONLINE  -
> [EMAIL PROTECTED]:/export/home/michael#
> [EMAIL PROTECTED]:/export/home/michael# zpool status
>   pool: amber
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         amber       ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             c1d0    ONLINE       0     0     0
>             c0d0    ONLINE       0     0     0
>           c4d0      ONLINE       0     0     0
>
> errors: No known data errors
>
> I can't even seem to get rid of c4d0, I have not written anything to
> "amber" since adding c4d0. Any suggestions on how to remove it and
> re add it correctly?

Sure, just run: zpool evacuate amber c4t0. =)

Sorry. This was just in a few threads here, you will need to dump your data
to tape (or another disk), destroy your pool and then recreate it.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
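A sketch of the recreate step, once the data has been copied somewhere safe
(the device names come from the original post):

    # Destroy the mis-built pool and recreate it with all three disks
    # inside the raidz vdev.  Only do this after the data is backed up.
    zpool destroy amber
    zpool create amber raidz c1d0 c0d0 c4d0
    zpool status amber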
Re: [zfs-discuss] Re: Re: Adding my own compression to zfs
roland wrote: Adam Leventhal's Weblog inside the sausage factory btw - just wondering - is this some english phrase or some running gag ? i have seen it once ago on another blog and so i`m wondering greetings from the beer and sausage nation ;) It's a response to a common English colloquialism which says 'nearly everybody likes eating sausage, but many people would probably rather not see how it's made'. Adam is a Sausage maker in the Solaris world. Open Solaris is the newly expanded, room for everyone, Solaris sausage factory. His blog covers topics relating to what goes on in his sausage making duties. - Matt p.s.: The web says a German word for colloquialism is umgangssprachlich. -- Matt Ingenthron - Web Infrastructure Solutions Architect Sun Microsystems, Inc. - Global Systems Practice http://blogs.sun.com/mingenthron/ email: [EMAIL PROTECTED] Phone: 310-242-6439 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] hot spares - in standby?
Hi, This is not exactly ZFS specific, but this still seems like a fruitful place to ask. It occurred to me today that hot spares could sit in standby (spun down) until needed (I know ATA can do this, I'm supposing SCSI does too, but I haven't looked at a spec recently). Does anybody do this? Or does everybody do this already? Does the tub curve (chance of early life failure) imply that hot spares should be burned in, instead of sitting there doing nothing from new? Just like a data disk, seems to me you'd want to know if a hot spare fails while waiting to be swapped in. Do they get tested periodically? --Toby ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
You could easily do this in Solaris today by just using power.conf(4). Just have it spin down any drives that have been idle for a day or more. The periodic testing part would be an interesting project to kick off. --Bill On Mon, Jan 29, 2007 at 08:21:16PM -0200, Toby Thain wrote: > Hi, > > This is not exactly ZFS specific, but this still seems like a > fruitful place to ask. > > It occurred to me today that hot spares could sit in standby (spun > down) until needed (I know ATA can do this, I'm supposing SCSI does > too, but I haven't looked at a spec recently). Does anybody do this? > Or does everybody do this already? > > Does the tub curve (chance of early life failure) imply that hot > spares should be burned in, instead of sitting there doing nothing > from new? Just like a data disk, seems to me you'd want to know if a > hot spare fails while waiting to be swapped in. Do they get tested > periodically? > > --Toby > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
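A hedged sketch of the power.conf(4) approach Bill mentions; the device path
below is a placeholder (power.conf wants the disk's physical path), and the
threshold syntax should be checked against the man page for your release:

    # /etc/power.conf entry: spin the idle spare down after 24 hours.
    #   device-thresholds   /pci@0,0/pci-ide@1f,1/ide@0/cmdk@1,0   24h
    # Then activate the new configuration:
    pmconfig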
[zfs-discuss] Re: Re: Adding my own compression to zfs
The lzjb compression implementation (IMO) is the fastest one on SPARC
Solaris systems. I've seen it beat lzo in speed while not necessarily in
compressibility. I've measured both implementations inside Solaris SPARC
kernels, and would love to hear from others about their experiences. As
someone else alluded, multithreading the compression implementation will
certainly improve performance.

Sri

This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: Adding my own compression to zfs
hey, thanks for your overwhelming private lesson in english colloquialisms :D

now back to the technical :)

> # zfs create pool/gzip
> # zfs set compression=gzip pool/gzip
> # cp -r /pool/lzjb/* /pool/gzip
> # zfs list
> NAME        USED  AVAIL  REFER  MOUNTPOINT
> pool/gzip  64.9M  33.2G  64.9M  /pool/gzip
> pool/lzjb   128M  33.2G   128M  /pool/lzjb
>
> That's with a 1.2G crash dump (pretty much the most compressible file
> imaginable). Here are the compression ratios with a pile of ELF binaries
> (/usr/bin and /usr/lib):
> # zfs get compressratio
> NAME       PROPERTY       VALUE  SOURCE
> pool/gzip  compressratio  3.27x  -
> pool/lzjb  compressratio  1.89x  -

this looks MUCH better than i would have ever expected for smaller files.

any real-world data on how well compressratio does with lots of very small
but highly compressible files, for example an (evil for those solaris
evangelists) untarred linux-source tree? i'm rather curious how effectively
gzip will compress here.

for comparison:

  sun1:/comptest # bzcat /tmp/linux-2.6.19.2.tar.bz2 | tar xvf -
  --snipp--
  sun1:/comptest # du -s -k *
  143895  linux-2.6.19.2
  1       pax_global_header
  sun1:/comptest # du -s -k --apparent-size *
  224282  linux-2.6.19.2
  1       pax_global_header
  sun1:/comptest # zfs get compressratio comptest
  NAME      PROPERTY       VALUE  SOURCE
  comptest  compressratio  1.79x  -

This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, 2007-01-29 at 14:15 -0800, Matt Ingenthron wrote:
> > > inside the sausage factory
> >
> > btw - just wondering - is this some english phrase or some running gag ?
> > i have seen it a while ago on another blog and so i`m wondering
> >
> > greetings from the beer and sausage nation ;)
>
> It's a response to a common English colloquialism which says 'nearly
> everybody likes eating sausage, but many people would probably rather
> not see how it's made'.

I've actually seen the quote attributed to a German, Otto von Bismarck,
rendered in English as:

  "Laws are like sausages -- it is better not to see them being made."

or

  "If you like laws and sausages, you should never watch either one being
  made."

Of course, the same can, and has, been said about software...

- Bill

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS or UFS - what to do?
Albert Chin said:
> Well, ZFS with HW RAID makes sense in some cases. However, it seems that if
> you are unwilling to lose 50% disk space to RAID 10 or two mirrored HW RAID
> arrays, you either use RAID 0 on the array with ZFS RAIDZ/RAIDZ2 on top of
> that or a JBOD with ZFS RAIDZ/RAIDZ2 on top of that.

I've been re-evaluating our local decision on this question (how to layout
ZFS on pre-existing RAID hardware). In our case, the array does not allow
RAID-0 of any type, and we're unwilling to give up the expensive disk space
to a mirrored configuration. In fact, in our last decision, we came to the
conclusion that we didn't want to layer RAID-Z on top of HW RAID-5, thinking
that the added loss of space is too high, given any of the "XXX" layouts in
Jonathan Edwards' chart:

>  #   ZFS   ARRAY HW   CAPACITY   COMMENTS
> --   ---   --------   --------   --------
>  . . .
>  5   R1    1 x R5     (N-1)/2    parity and mirror on same drives (XXX)
>  9   RZ    3 x R5     N-4        triple parity calculations (XXX)
>  . . .
> 10   RZ    1 x R5     N-2        double parity calculations (XXX)

So, we ended up (some months ago) deciding to go with only HW RAID-5, using
ZFS to stripe together large-ish LUN's made up of independent HW RAID-5
groups. We'd have no ZFS redundancy, but at least ZFS would catch any
corruption that may come along. We can restore individual corrupted files
from tape backups (which we're already doing anyway), if necessary.

However, given the default behavior of ZFS (as of Solaris-10U3) is to
panic/halt when it encounters a corrupted block that it can't repair, I'm
re-thinking our options, weighing against the possibility of a significant
downtime caused by a single-block corruption.

Today I've been pondering a variant of #10 above, the variation being to
slice a RAID-5 volume into more than N LUN's, i.e. LUN's smaller than the
size of the individual disks that make up the HW R5 volume. A larger number
of small LUN's results in less space given up to ZFS parity, which is nice
when overall disk space is important to us. We're not expecting RAID-Z
across these LUN's to make it possible to survive failure of a whole disk;
rather we only "need" RAID-Z to repair the occasional block corruption, in
the hopes that this might head off the need to restore a whole multi-TB
pool. We'll rely on the HW RAID-5 to protect against whole-disk failure.

Just thinking out loud here. Now I'm off to see what kind of performance
cost there is, comparing (with 400GB disks):

  - Simple ZFS stripe on one 2198GB LUN from a 6+1 HW RAID5 volume
  - 8+1 RAID-Z on 9 244.2GB LUN's from a 6+1 HW RAID5 volume

Regards,
Marion

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
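A sketch of the two configurations being compared (the LUN device names are
placeholders):

    # Layout A: simple ZFS stripe on one large LUN from the 6+1 HW
    # RAID-5 group, no ZFS-level redundancy.
    zpool create tankA c6t0d0

    # Layout B: 8+1 raidz across nine smaller LUNs carved from the same
    # 6+1 HW RAID-5 group.
    zpool create tankB raidz c6t1d0 c6t2d0 c6t3d0 c6t4d0 c6t5d0 \
        c6t6d0 c6t7d0 c6t8d0 c6t9d0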
Re: [zfs-discuss] hot spares - in standby?
Toby Thain wrote: Hi, This is not exactly ZFS specific, but this still seems like a fruitful place to ask. It occurred to me today that hot spares could sit in standby (spun down) until needed (I know ATA can do this, I'm supposing SCSI does too, but I haven't looked at a spec recently). Does anybody do this? Or does everybody do this already? "luxadm stop" will work for many SCSI and FC JBODs. If your drive doesn't support it, it won't hurt anything, it will just claim "Unsupported" -- not very user friendly, IMHO. I think it is a good idea, with one potential gotcha. The gotcha is that it can take 30 seconds or more to spin up. By default, the sd and ssd timeouts are such that a pending iop will not notice that it took a while to spin up. However, if you have changed those defaults, as sometimes occurs in high availability requirements, then you probably shouldn't do this. Does the tub curve (chance of early life failure) imply that hot spares should be burned in, instead of sitting there doing nothing from new? Good question. If you consider that mechanical wear out is what ultimately causes many failure modes, then the argument can be made that a spun down disk should last longer. The problem is that there are failure modes which are triggered by a spin up. I've never seen field data showing the difference between the two. I spin mine down because they are too loud and consume more electricity, and electricity is expensive in Southern California. Just like a data disk, seems to me you'd want to know if a hot spare fails while waiting to be swapped in. Do they get tested periodically? Another good question. AFAIK, they are not accessed until needed. Note: they will be queried on boot which will cause a spin up. I use a cron job to spin mine down in the late evening. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
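A hedged sketch of the cron-driven spin-down Richard describes; the device
path is a placeholder, and luxadm stop will simply report "Unsupported" on
drives that cannot do it:

    # crontab entry: spin the hot spare down every night at 23:30.
    # The drive spins back up on the next I/O directed at it.
    30 23 * * * /usr/sbin/luxadm stop /dev/rdsk/c1t5d0s2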
Re: [zfs-discuss] hot spares - in standby?
On Mon, 29 Jan 2007, Toby Thain wrote: > Hi, > > This is not exactly ZFS specific, but this still seems like a > fruitful place to ask. > > It occurred to me today that hot spares could sit in standby (spun > down) until needed (I know ATA can do this, I'm supposing SCSI does > too, but I haven't looked at a spec recently). Does anybody do this? > Or does everybody do this already? I don't work with enough disk storage systems to know what is the industry norm. But there are 3 broad categories of disk drive spares: a) Cold Spare. A spare where the power is not connected until it is required. [1] b) Warm Spare. A spare that is active but placed into a low power mode. Or into a "low mechanical ware & tare" mode. In the case of a disk drive, the controller board is active but the HDA (Head Disk Assembly) is inactive (platters are stationary, heads unloaded [if the heads are physically unloaded]); it has power applied and can be made "hot" by a command over its data/command (bus) connection. The supervisorary hardware/software/firmware "knows" how long it *should* take the drive to go from warm to hot. c) Hot Spare. A spare that is spun up and ready to accept read/write/position (etc) requests. > Does the tub curve (chance of early life failure) imply that hot > spares should be burned in, instead of sitting there doing nothing > from new? Just like a data disk, seems to me you'd want to know if a > hot spare fails while waiting to be swapped in. Do they get tested > periodically? The ideal scenario, as you already allude to, would be for the disk subsystem to initially configure the drive as a hot spare and send it periodic "test" events for, say, the first 48 hours. This would get it past the first segment of the "bathtub" reliability curve - often referred to as the "infant mortality" phase. After that, (ideally) it would be placed into "warm standby" mode and it would be periodically tested (once a month??). If saving power was the highest priority, then the ideal situation would be where the disk subsystem could apply/remove power to the spare and move it from warm to cold upon command. One "trick" with disk subsystems, like ZFS that have yet to have the FMA type functionality added and which (today) provide for hot spares only, is to initially configure a pool with one (hot) spare, and then add a 2nd hot spare, based on installing a brand new device, say, 12 months later. And another spare 12 months later. What you are trying to achieve, with this strategy, is to avoid the scenario whereby mechanical systems, like disk drives, tend to "wear out" within the same general, relatively short, timeframe. One (obvious) issue with this strategy, is that it may be impossible to purchase the same disk drive 12 and 24 months later. However, it's always possible to purchase a larger disk drive and simply commit to the fact that the extra space provided by the newer drive will be wasted. [1] The most common example is a disk drive mounted on a carrier but not seated within the disk drive enclosure. Simple "push in" when required. Off Topic: To go off on a tangent - the same strategy applies to a UPS (Uninterruptable Power Supply). As per the following time line: year 0: purchase the UPS and one battery cabinet year 1: purchase and attach an additional battery cabinet year 2: purchase and attach an additional battery cabinet year 3: purchase and attach an additional battery cabinet year 4: purchase and attach an additional battery cabinet and remove the oldest battery cabinet year 5 ... 
N: repeat year 4s scenario until its time to replace the UPS. The advantage of this scenario is that you can budget a *fixed* cost for the UPS and your management understands that there is a recurring cost so that, when the power fails, your UPS will have working batteries!! Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 OpenSolaris Governing Board (OGB) Member - Feb 2006 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
On 29-Jan-07, at 9:04 PM, Al Hopper wrote: On Mon, 29 Jan 2007, Toby Thain wrote: Hi, This is not exactly ZFS specific, but this still seems like a fruitful place to ask. It occurred to me today that hot spares could sit in standby (spun down) until needed (I know ATA can do this, I'm supposing SCSI does too, but I haven't looked at a spec recently). Does anybody do this? Or does everybody do this already? I don't work with enough disk storage systems to know what is the industry norm. But there are 3 broad categories of disk drive spares: a) Cold Spare. A spare where the power is not connected until it is required. [1] b) Warm Spare. A spare that is active but placed into a low power mode. ... c) Hot Spare. A spare that is spun up and ready to accept read/write/position (etc) requests. Hi Al, Thanks for reminding me of the distinction. It seems very few installations would actually require (c)? Does the tub curve (chance of early life failure) imply that hot spares should be burned in, instead of sitting there doing nothing from new? Just like a data disk, seems to me you'd want to know if a hot spare fails while waiting to be swapped in. Do they get tested periodically? The ideal scenario, as you already allude to, would be for the disk subsystem to initially configure the drive as a hot spare and send it periodic "test" events for, say, the first 48 hours. For some reason that's a little shorter than I had in mind, but I take your word that that's enough burn-in for semiconductors, motors, servos, etc. This would get it past the first segment of the "bathtub" reliability curve ... If saving power was the highest priority, then the ideal situation would be where the disk subsystem could apply/remove power to the spare and move it from warm to cold upon command. I am surmising that it would also considerably increase the spare's useful lifespan versus "hot" and spinning. One "trick" with disk subsystems, like ZFS that have yet to have the FMA type functionality added and which (today) provide for hot spares only, is to initially configure a pool with one (hot) spare, and then add a 2nd hot spare, based on installing a brand new device, say, 12 months later. And another spare 12 months later. What you are trying to achieve, with this strategy, is to avoid the scenario whereby mechanical systems, like disk drives, tend to "wear out" within the same general, relatively short, timeframe. One (obvious) issue with this strategy, is that it may be impossible to purchase the same disk drive 12 and 24 months later. However, it's always possible to purchase a larger disk drive ...which is not guaranteed to be compatible with your storage subsystem...! --Toby and simply commit to the fact that the extra space provided by the newer drive will be wasted. [1] The most common example is a disk drive mounted on a carrier but not seated within the disk drive enclosure. Simple "push in" when required. ... Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 OpenSolaris Governing Board (OGB) Member - Feb 2006 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
Hi Guys, I seem to remember the Massive Array of Independent Disk guys ran into a problem I think they called static friction, where idle drives would fail on spin up after being idle for a long time: http://www.eweek.com/article2/0,1895,1941205,00.asp Would that apply here? Best Regards, Jason On 1/29/07, Toby Thain <[EMAIL PROTECTED]> wrote: On 29-Jan-07, at 9:04 PM, Al Hopper wrote: > On Mon, 29 Jan 2007, Toby Thain wrote: > >> Hi, >> >> This is not exactly ZFS specific, but this still seems like a >> fruitful place to ask. >> >> It occurred to me today that hot spares could sit in standby (spun >> down) until needed (I know ATA can do this, I'm supposing SCSI does >> too, but I haven't looked at a spec recently). Does anybody do this? >> Or does everybody do this already? > > I don't work with enough disk storage systems to know what is the > industry > norm. But there are 3 broad categories of disk drive spares: > > a) Cold Spare. A spare where the power is not connected until it is > required. [1] > > b) Warm Spare. A spare that is active but placed into a low power > mode. ... > > c) Hot Spare. A spare that is spun up and ready to accept > read/write/position (etc) requests. Hi Al, Thanks for reminding me of the distinction. It seems very few installations would actually require (c)? > >> Does the tub curve (chance of early life failure) imply that hot >> spares should be burned in, instead of sitting there doing nothing >> from new? Just like a data disk, seems to me you'd want to know if a >> hot spare fails while waiting to be swapped in. Do they get tested >> periodically? > > The ideal scenario, as you already allude to, would be for the disk > subsystem to initially configure the drive as a hot spare and send it > periodic "test" events for, say, the first 48 hours. For some reason that's a little shorter than I had in mind, but I take your word that that's enough burn-in for semiconductors, motors, servos, etc. > This would get it > past the first segment of the "bathtub" reliability curve ... > > If saving power was the highest priority, then the ideal situation > would > be where the disk subsystem could apply/remove power to the spare > and move > it from warm to cold upon command. I am surmising that it would also considerably increase the spare's useful lifespan versus "hot" and spinning. > > One "trick" with disk subsystems, like ZFS that have yet to have > the FMA > type functionality added and which (today) provide for hot spares > only, is > to initially configure a pool with one (hot) spare, and then add a > 2nd hot > spare, based on installing a brand new device, say, 12 months > later. And > another spare 12 months later. What you are trying to achieve, > with this > strategy, is to avoid the scenario whereby mechanical systems, like > disk > drives, tend to "wear out" within the same general, relatively short, > timeframe. > > One (obvious) issue with this strategy, is that it may be > impossible to > purchase the same disk drive 12 and 24 months later. However, it's > always > possible to purchase a larger disk drive ...which is not guaranteed to be compatible with your storage subsystem...! --Toby > and simply commit to the fact > that the extra space provided by the newer drive will be wasted. > > [1] The most common example is a disk drive mounted on a carrier > but not > seated within the disk drive enclosure. Simple "push in" when > required. > ... > Al Hopper Logical Approach Inc, Plano, TX. 
[EMAIL PROTECTED] >Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT > OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 > OpenSolaris Governing Board (OGB) Member - Feb 2006 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
On 29-Jan-07, at 11:02 PM, Jason J. W. Williams wrote: Hi Guys, I seem to remember the Massive Array of Independent Disk guys ran into a problem I think they called static friction, where idle drives would fail on spin up after being idle for a long time: You'd think that probably wouldn't happen to a spare drive that was spun up from time to time. In fact this problem would be (mitigated and/or) caught by the periodic health check I suggested. --T http://www.eweek.com/article2/0,1895,1941205,00.asp Would that apply here? Best Regards, Jason On 1/29/07, Toby Thain <[EMAIL PROTECTED]> wrote: On 29-Jan-07, at 9:04 PM, Al Hopper wrote: > On Mon, 29 Jan 2007, Toby Thain wrote: > >> Hi, >> >> This is not exactly ZFS specific, but this still seems like a >> fruitful place to ask. >> >> It occurred to me today that hot spares could sit in standby (spun >> down) until needed (I know ATA can do this, I'm supposing SCSI does >> too, but I haven't looked at a spec recently). Does anybody do this? >> Or does everybody do this already? > > I don't work with enough disk storage systems to know what is the > industry > norm. But there are 3 broad categories of disk drive spares: > > a) Cold Spare. A spare where the power is not connected until it is > required. [1] > > b) Warm Spare. A spare that is active but placed into a low power > mode. ... > > c) Hot Spare. A spare that is spun up and ready to accept > read/write/position (etc) requests. Hi Al, Thanks for reminding me of the distinction. It seems very few installations would actually require (c)? > >> Does the tub curve (chance of early life failure) imply that hot >> spares should be burned in, instead of sitting there doing nothing >> from new? Just like a data disk, seems to me you'd want to know if a >> hot spare fails while waiting to be swapped in. Do they get tested >> periodically? > > The ideal scenario, as you already allude to, would be for the disk > subsystem to initially configure the drive as a hot spare and send it > periodic "test" events for, say, the first 48 hours. For some reason that's a little shorter than I had in mind, but I take your word that that's enough burn-in for semiconductors, motors, servos, etc. > This would get it > past the first segment of the "bathtub" reliability curve ... > > If saving power was the highest priority, then the ideal situation > would > be where the disk subsystem could apply/remove power to the spare > and move > it from warm to cold upon command. I am surmising that it would also considerably increase the spare's useful lifespan versus "hot" and spinning. > > One "trick" with disk subsystems, like ZFS that have yet to have > the FMA > type functionality added and which (today) provide for hot spares > only, is > to initially configure a pool with one (hot) spare, and then add a > 2nd hot > spare, based on installing a brand new device, say, 12 months > later. And > another spare 12 months later. What you are trying to achieve, > with this > strategy, is to avoid the scenario whereby mechanical systems, like > disk > drives, tend to "wear out" within the same general, relatively short, > timeframe. > > One (obvious) issue with this strategy, is that it may be > impossible to > purchase the same disk drive 12 and 24 months later. However, it's > always > possible to purchase a larger disk drive ...which is not guaranteed to be compatible with your storage subsystem...! --Toby > and simply commit to the fact > that the extra space provided by the newer drive will be wasted. 
> > [1] The most common example is a disk drive mounted on a carrier > but not > seated within the disk drive enclosure. Simple "push in" when > required. > ... > Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] approach.com >Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT > OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 > OpenSolaris Governing Board (OGB) Member - Feb 2006 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
On Jan 29, 2007, at 20:27, Toby Thain wrote: On 29-Jan-07, at 11:02 PM, Jason J. W. Williams wrote: I seem to remember the Massive Array of Independent Disk guys ran into a problem I think they called static friction, where idle drives would fail on spin up after being idle for a long time: You'd think that probably wouldn't happen to a spare drive that was spun up from time to time. In fact this problem would be (mitigated and/or) caught by the periodic health check I suggested. What about a rotating spare? When setting up a pool a lot of people would (say) balance things around buses and controllers to minimize single points of failure, and a rotating spare could disrupt this organization, but would it be useful at all? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
On 1/30/07, David Magda <[EMAIL PROTECTED]> wrote: What about a rotating spare? When setting up a pool a lot of people would (say) balance things around buses and controllers to minimize single points of failure, and a rotating spare could disrupt this organization, but would it be useful at all? The costs involved in "rotating" spares in terms of IOPS reduction may not be worth it. -- Just me, Wire ... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
Random thoughts: If we were to use some intelligence in the design, we could perhaps have a monitor that profiles the workload on the system (a pool, for example) over a [week|month|whatever] and selects a point in time, based on history, at which it would expect the disks to be quiet, and can 'pre-build' the spare with the contents of the disk it's about to swap out. At the point of switch-over, it could be pretty much instantaneous... It could also bail if it happened that the system actually started to get genuinely busy... That might actually be quite cool, though, if all disks are rotated, we end up with a whole bunch of disks that are evenly worn out again, which is just what we are really trying to avoid! ;) Nathan. Wee Yeh Tan wrote: On 1/30/07, David Magda <[EMAIL PROTECTED]> wrote: What about a rotating spare? When setting up a pool a lot of people would (say) balance things around buses and controllers to minimize single points of failure, and a rotating spare could disrupt this organization, but would it be useful at all? The costs involved in "rotating" spares in terms of IOPS reduction may not be worth it. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
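For anyone who wants to try the rotation idea by hand with the commands that exist today, a rough outline follows (pool name "tank" and device names are invented; the exact spare-promotion behaviour, and whether -f is needed when re-adding the old disk, varies by build, so treat this as a sketch rather than a recipe):

zpool status tank                    # confirm the pool is healthy and c2t0d0 is the idle spare
zpool replace tank c1t3d0 c2t0d0     # resilver the spare in place of the data disk c1t3d0
zpool status tank                    # wait here until the resilver completes
zpool detach tank c1t3d0             # make the replacement permanent
zpool add tank spare c1t3d0          # register the rotated-out disk as the new spare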
Re: [zfs-discuss] Project Proposal: Availability Suite
Jason, Thank you for the detailed explanation. It is very helpful to understand the issue. Is anyone successfully using SNDR with ZFS yet? Of the opportunities I've been involved with the answer is yes, but so far I've not seen SNDR with ZFS in a production environment, but that does not mean they don't exist. It was not until late June '06 that AVS 4.0, Solaris 10 and ZFS were generally available, and to date AVS has not been made available for the Solaris Express Community Release, but it will be real soon. While I have your attention, there are two issues between ZFS and AVS that need mentioning. 1). When ZFS is given an entire LUN to place in a ZFS storage pool, ZFS detects this, enabling SCSI write-caching on the LUN, and also opens the LUN with exclusive access, preventing other data services (like AVS) from accessing this device. The work-around is to manually format the LUN, typically placing all the blocks into a single partition, then just place this partition into the ZFS storage pool. ZFS detects that it does not own the entire LUN, so it doesn't enable write-caching, which means it also doesn't open the LUN with exclusive access, and therefore AVS and ZFS can share the same LUN. I thought about submitting an RFE to have ZFS provide a means to override this restriction, but I am not 100% certain that a ZFS filesystem directly accessing a write-cache-enabled LUN is the same thing as a replicated ZFS filesystem accessing a write-cache-enabled LUN. Even though AVS is write-order consistent, there are disaster recovery scenarios, when enacted, where block-order, versus write-order, I/Os are issued. 2). One has to be very cautious in using "zpool import -f " (forced import), especially on a LUN or LUNs into which SNDR is actively replicating. If ZFS complains that the storage pool was not cleanly exported when issuing a "zpool import ...", and one attempts a "zpool import -f " without checking the active replication state, they are sure to panic Solaris. Of course this failure scenario is no different than accessing a LUN or LUNs on dual-ported or SAN-based storage when another Solaris host is still accessing the ZFS filesystem, or controller-based replication, as they are all just different operational scenarios of the same issue: data blocks changing out from underneath the ZFS filesystem and its checksum-verification mechanisms. Jim Best Regards, Jason ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
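To make both of Jim's points concrete, a rough sketch (pool and device names are invented, and sndradm output formats differ between releases): give ZFS a slice rather than the whole LUN so it neither enables the write cache nor takes exclusive access, and check the SNDR state before even considering a forced import on the secondary:

# 1) slice-based pool so AVS and ZFS can share the LUN
format                               # label c1t1d0 with a single slice (s0) spanning the LUN
zpool create tank c1t1d0s0           # ZFS sees only a slice, so no write cache and no exclusive open

# 2) on the secondary, before importing the replicated copy
sndradm -P                           # confirm the volume set is not still actively replicating
zpool import                         # list importable pools first
zpool import tank                    # resort to 'zpool import -f' only once replication is quiesced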
Re: [zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, Jan 29, 2007 at 02:39:13PM -0800, roland wrote: > > # zfs get compressratio > > NAME PROPERTY VALUE SOURCE > > pool/gzip compressratio 3.27x - > > pool/lzjb compressratio 1.89x - > > this looks MUCH better than i would have ever expected for smaller files. > > any real-world data how good or bad compressratio goes with lots of very > small but good compressible files , for example some (evil for those solaris > evangelists) untarred linux-source tree ? > > i'm rather excited how effective gzip will compress here. > > for comparison: > > sun1:/comptest # bzcat /tmp/linux-2.6.19.2.tar.bz2 |tar xvf - > --snipp-- > > sun1:/comptest # du -s -k * > 143895 linux-2.6.19.2 > 1 pax_global_header > > sun1:/comptest # du -s -k --apparent-size * > 224282 linux-2.6.19.2 > 1 pax_global_header > > sun1:/comptest # zfs get compressratio comptest > NAME PROPERTY VALUE SOURCE > comptest tank compressratio 1.79x - Don't start sending me your favorite files to compress (it really should work about the same as gzip), but here's the result for the above (I found a tar file that's about 235M uncompressed): # du -ks linux-2.6.19.2/ 80087 linux-2.6.19.2 # zfs get compressratio pool/gzip NAME PROPERTY VALUE SOURCE pool/gzip compressratio 3.40x - Doing a gzip with the default compression level (6 -- the same setting I'm using in ZFS) yields a file that's about 52M. The small files are hurting a bit here, but it's still pretty good -- and considerably better than LZJB. Adam -- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
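For anyone wanting to reproduce a comparison like this once the gzip compression property is available in a build, a sketch (pool/dataset names are just examples; gzip-1 through gzip-9 levels may also be accepted, depending on how the feature is integrated):

zfs create tank/lzjb
zfs set compression=lzjb tank/lzjb
zfs create tank/gzip
zfs set compression=gzip tank/gzip
(cd /tank/lzjb && tar xf /tmp/linux-2.6.19.2.tar)    # unpack the same data set...
(cd /tank/gzip && tar xf /tmp/linux-2.6.19.2.tar)    # ...into both filesystems
zfs get compressratio tank/lzjb tank/gzip            # compare the resulting ratios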
Re: [zfs-discuss] Project Proposal: Availability Suite
Hi Jim, Thank you very much for the heads up. Unfortunately, we need the write-cache enabled for the application I was thinking of combining this with. Sounds like SNDR and ZFS need some more soak time together before you can use both to their full potential together? Best Regards, Jason On 1/29/07, Jim Dunham <[EMAIL PROTECTED]> wrote: Jason, > Thank you for the detailed explanation. It is very helpful to > understand the issue. Is anyone successfully using SNDR with ZFS yet? Of the opportunities I've been involved with the answer is yes, but so far I've not seen SNDR with ZFS in a production environment, but that does not mean they don't exists. It was not until late June '06, that AVS 4.0, Solaris 10 and ZFS were generally available, and to date AVS has not been made available for the Solaris Express, Community Release, but it will be real soon. While I have your attention, there are two issues between ZFS and AVS that needs mentioning. 1). When ZFS is given an entire LUN to place in a ZFS storage pool, ZFS detect this, enabling SCSI write-caching on the LUN, and also opens the LUN with exclusive access, preventing other data services (like AVS) from accessing this device. The work-around is to manually format the LUN, typically placing all the blocks into a single partition, then just place this partition into the ZFS storage pool. ZFS detect this, not owning the entire LUN, and doesn't enable write-caching, which means it also doesn't open the LUN with exclusive access, and therefore AVS and ZFS can share the same LUN. I thought about submitting an RFE to have ZFS provide a means to override this restriction, but I am not 100% certain that a ZFS filesystem directly accessing a write-cached enabled LUN is the same thing as a replicated ZFS filesystem accessing a write-cached enabled LUN. Even though AVS is write-order consistent, there are disaster recovery scenarios, when enacted, where block-order, verses write-order I/Os are issued. 2). One has to be very cautious in using "zpool import -f " (forced import), especially on a LUN or LUNs in which SNDR is actively replicating into. If ZFS complains that the storage pool was not cleanly exported when issuing a "zpool import ...", and one attempts a "zpool import -f ", without checking the active replication state, they are sure to panic Solaris. Of course this failure scenario is no different then accessing a LUN or LUNs on dual-ported, or SAN based storage when another Solaris host is still accessing the ZFS filesystem, or controller based replication, as they are all just different operational scenarios of the same issue, data blocks changing out from underneath the ZFS filesystem, and its CRC checking mechanisms. Jim > > Best Regards, > Jason ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hot spares - in standby?
Hi Toby, You're right. The healthcheck would definitely find any issues. I misinterpreted your comment to that effect as a question and didn't quite latch on. A zpool MAID-mode with that healthcheck might also be interesting on something like a Thumper for pure-archival, D2D backup work. Would dramatically cut down on the power. What do y'all think? Best Regards, Jason On 1/29/07, Toby Thain <[EMAIL PROTECTED]> wrote: On 29-Jan-07, at 11:02 PM, Jason J. W. Williams wrote: > Hi Guys, > > I seem to remember the Massive Array of Independent Disk guys ran into > a problem I think they called static friction, where idle drives would > fail on spin up after being idle for a long time: You'd think that probably wouldn't happen to a spare drive that was spun up from time to time. In fact this problem would be (mitigated and/or) caught by the periodic health check I suggested. --T > http://www.eweek.com/article2/0,1895,1941205,00.asp > > Would that apply here? > > Best Regards, > Jason > > On 1/29/07, Toby Thain <[EMAIL PROTECTED]> wrote: >> >> On 29-Jan-07, at 9:04 PM, Al Hopper wrote: >> >> > On Mon, 29 Jan 2007, Toby Thain wrote: >> > >> >> Hi, >> >> >> >> This is not exactly ZFS specific, but this still seems like a >> >> fruitful place to ask. >> >> >> >> It occurred to me today that hot spares could sit in standby (spun >> >> down) until needed (I know ATA can do this, I'm supposing SCSI >> does >> >> too, but I haven't looked at a spec recently). Does anybody do >> this? >> >> Or does everybody do this already? >> > >> > I don't work with enough disk storage systems to know what is the >> > industry >> > norm. But there are 3 broad categories of disk drive spares: >> > >> > a) Cold Spare. A spare where the power is not connected until >> it is >> > required. [1] >> > >> > b) Warm Spare. A spare that is active but placed into a low power >> > mode. ... >> > >> > c) Hot Spare. A spare that is spun up and ready to accept >> > read/write/position (etc) requests. >> >> Hi Al, >> >> Thanks for reminding me of the distinction. It seems very few >> installations would actually require (c)? >> >> > >> >> Does the tub curve (chance of early life failure) imply that hot >> >> spares should be burned in, instead of sitting there doing nothing >> >> from new? Just like a data disk, seems to me you'd want to know >> if a >> >> hot spare fails while waiting to be swapped in. Do they get tested >> >> periodically? >> > >> > The ideal scenario, as you already allude to, would be for the disk >> > subsystem to initially configure the drive as a hot spare and >> send it >> > periodic "test" events for, say, the first 48 hours. >> >> For some reason that's a little shorter than I had in mind, but I >> take your word that that's enough burn-in for semiconductors, motors, >> servos, etc. >> >> > This would get it >> > past the first segment of the "bathtub" reliability curve ... >> > >> > If saving power was the highest priority, then the ideal situation >> > would >> > be where the disk subsystem could apply/remove power to the spare >> > and move >> > it from warm to cold upon command. >> >> I am surmising that it would also considerably increase the spare's >> useful lifespan versus "hot" and spinning. 
>> >> > >> > One "trick" with disk subsystems, like ZFS that have yet to have >> > the FMA >> > type functionality added and which (today) provide for hot spares >> > only, is >> > to initially configure a pool with one (hot) spare, and then add a >> > 2nd hot >> > spare, based on installing a brand new device, say, 12 months >> > later. And >> > another spare 12 months later. What you are trying to achieve, >> > with this >> > strategy, is to avoid the scenario whereby mechanical systems, like >> > disk >> > drives, tend to "wear out" within the same general, relatively >> short, >> > timeframe. >> > >> > One (obvious) issue with this strategy, is that it may be >> > impossible to >> > purchase the same disk drive 12 and 24 months later. However, it's >> > always >> > possible to purchase a larger disk drive >> >> ...which is not guaranteed to be compatible with your storage >> subsystem...! >> >> --Toby >> >> > and simply commit to the fact >> > that the extra space provided by the newer drive will be wasted. >> > >> > [1] The most common example is a disk drive mounted on a carrier >> > but not >> > seated within the disk drive enclosure. Simple "push in" when >> > required. >> > ... >> > Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] >> approach.com >> >Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT >> > OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 >> > OpenSolaris Governing Board (OGB) Member - Feb 2006 >> >> ___ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> ___ zfs-discus
Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?
On 29/01/2007, at 12:50 AM, [EMAIL PROTECTED] wrote: On 28-Jan-07, at 7:59 AM, [EMAIL PROTECTED] wrote: On 27-Jan-07, at 10:15 PM, Anantha N. Srirama wrote: ... ZFS will not stop alpha-particle-induced memory corruption after data has been received by the server and verified to be correct. Sadly I've been hit with that as well. My brother points out that you can use a rad-hardened CPU. ECC should take care of the RAM. :-) I wonder when the former will become data centre best practice? Alpha particles which "hit" CPUs must have their origin inside said CPU. (Alpha particles do not penetrate skin or paper, let alone system cases or CPU packaging.) Thanks. But what about cosmic rays? I was just in pedantic mode; "cosmic rays" is the term covering all different particles, including alpha, beta and gamma rays. Alpha rays don't reach us from the "cosmos"; they are caught long before they can do any harm. Ditto beta rays. Both have an electrical charge that makes passing through magnetic fields or through materials difficult. Both do exist in the open, but are commonly caused by slow radioactive decay of our natural environment. Gamma rays are photons with high energy; they are not captured by magnetic fields (such as those existing in atoms: electrons, protons). They need to take a direct hit before they're stopped; they can only be stopped by dense materials, such as lead. Unfortunately, naturally occurring lead is polluted by polonium and uranium and is an alpha/beta source in its own right. That's why 100-year-old lead from roofs is worth more money than new lead: its radioisotopes have been depleted. Ok, I'll bite. It's been a long day, so that may be why I can't see why the radioisotopes in lead that was dug up 100 years ago would be any more depleted than the lead that sat in the ground for the intervening 100 years. Half-life is half-life, no? Now if it were something about the modern extraction process that added contaminants, then I can see it. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss