[zfs-discuss] SPAM *** Re: [osol-help] Adding a new partition to the system

2009-02-14 Thread Jan Hlodan
Antonio wrote:
> Hi all,
>
> First of all let me say that, after a few days using it (and after several 
> *years* of using Linux daily), I'm delighted with OpenSolaris 2008.11. It's 
> gonna be the OS of my choice.
>
> The fact is that I installed it in a partition of 16Gb in my hard disk and 
> that I'd like to add another partition to the system (I have different 
> partitions with Linux and Windows and some others).
>
> So the questions are:
>
> 1.- How do I add an existing partition to OpenSolaris? (Should I change the 
> partition type or something? Shall I "grow" ZFS or shall I mount the extra 
> partition somewhere else?)
>
>   
Yes, you can create a new zpool from your free/spare partition.
I had the same problem: I wanted to use a Linux partition as a mirror.
Here is how to do it, following this blog -
http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in
* install FSWpart and FSWfsmisc
* run prtpart to find out your disk ID
* list the partition IDs: prtpart "disk ID" -ldevs
* create a zpool from the Linux partition, e.g. zpool create trunk /dev/dsk/c9d0p3
* check it: zpool list or zpool status (see the example session below)
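
For illustration, a minimal session might look like this (the disk c9d0 and
the pool name "trunk" are just the example values from the steps above;
substitute whatever prtpart reports on your system):

# prtpart                                  (lists the disks and their device IDs)
# prtpart /dev/rdsk/c9d0p0 -ldevs          (shows the partition devices and their types)
# zpool create trunk /dev/dsk/c9d0p3       (build the pool on the Linux partition)
# zpool list
# zpool status trunk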


> 2.- Would you please recommend a good introduction to Solaris/OpenSolaris? 
> I'm used to Linux and I'd like to get up to speed with OpenSolaris.
>   
sure, OpenSolaris Bible :)
http://blogs.sun.com/observatory/entry/two_more_chapters_from_the

Hope this helps,

Regards,

Jan Hlodan

> Thanks in advance,
> Antonio
>   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on SAN?

2009-02-14 Thread Toby Thain


On 14-Feb-09, at 2:40 AM, Andras Spitzer wrote:


Damon,

Yes, we can provide simple concat inside the array (even though today we
provide RAID5 or RAID1 as our standard, used with Veritas on concat); the
question is more whether it is worth switching the redundancy from the
array to the ZFS layer.

The RAID5/1 features of the high-end EMC arrays also provide performance
improvements, which is why I wonder about the pros and cons of such a
switch (moving the redundancy from the array to the ZFS layer).


So, you're telling me that even if the SAN provides redundancy (HW
RAID5 or RAID1), people still configure ZFS with either raidz or
mirror?


Without doing so, you don't get the benefit of checksummed self-healing.

--Toby
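
For example, a minimal sketch (the pool and LUN device names below are
placeholders): a ZFS mirror of two array-backed LUNs gives ZFS a redundant
copy to heal from, and a scrub exercises that repair path:

# zpool create tank mirror c4t0d0 c5t0d0   (two LUNs, ideally from different arrays)
# zpool scrub tank                         (read-verifies all blocks; bad copies are rewritten from the good side)
# zpool status tank                        (the CKSUM column shows any errors ZFS detected and repaired)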



Regards,
sendai

On Sat, Feb 14, 2009 at 6:06 AM, Damon Atkins wrote:

Andras,
If you can get concat disks or RAID 0 disks inside the array, then use
raidz (if the I/O volume is not large, or is mostly sequential); for very
high I/O, use a ZFS mirror. You cannot spread a zpool over multiple EMC
arrays using SRDF unless you are using EMC PowerPath.

HDS, for example, does not support anything other than mirror or RAID5
configurations, so raidz or a ZFS mirror results in a lot of wasted disk
space. However, people still use raidz on HDS RAID5, as the top-of-the-line
HDS arrays are very fast and they want the features offered by ZFS.
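
A rough sketch of those two options, again with placeholder LUN names:

# zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0   (raidz over concat/RAID-0 LUNs)
# zpool create tank mirror c2t0d0 c3t0d0                (ZFS mirror for high-I/O workloads)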

Cheers



--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SPAM *** Re: [osol-help] Adding a new partition to the system

2009-02-14 Thread Jan Hlodan

Hi Antonio,

Did you try to recreate this partition, e.g. with GParted?
Maybe something is wrong with it.
Can you also post what prtpart "disk ID" -ldevs says?

Regards,

Jan Hlodan

Antonio wrote:

Hi Jan,

I tried what you suggest a while ago, but zfs fails on pool creation.

That is, when I issue zpool create trunk /dev/dsk/c9d0p3 the command
fails, saying that there's no such file or directory. And the disk is
correct!!

What I think is that /dev/dsk/c9d0p3 is a symbolic name used by 
FSWpart, and not a valid device name for zpool.


Thanks anyway,
Antonio

Jan Hlodan escribió:

Antonio wrote:

Hi all,

First of all let me say that, after a few days using it (and after 
several *years* of using Linux daily), I'm delighted with 
OpenSolaris 2008.11. It's gonna be the OS of my choice.


The fact is that I installed it in a partition of 16Gb in my hard 
disk and that I'd like to add another partition to the system (I 
have different partitions with Linux and Windows and some others).


So the questions are:

1.- How do I add an existing partition to OpenSolaris? (Should I 
change the partition type or something? Shall I "grow" ZFS or shall 
I mount the extra partition somewhere else?)


  

Yes, you can create a new zpool from your free/spare partition.
I had the same problem: I wanted to use a Linux partition as a mirror.
Here is how to do it, following this blog -
http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in
* install FSWpart and FSWfsmisc
* run prtpart to find out your disk ID
* list the partition IDs: prtpart "disk ID" -ldevs
* create a zpool from the Linux partition, e.g. zpool create trunk 
/dev/dsk/c9d0p3

* check it: zpool list or zpool status


2.- Would you please recommend a good introduction to 
Solaris/OpenSolaris? I'm used to Linux and I'd like to get up to 
speed with OpenSolaris.
  

sure, OpenSolaris Bible :)
http://blogs.sun.com/observatory/entry/two_more_chapters_from_the

Hope this helps,

Regards,

Jan Hlodan


Thanks in advance,
Antonio
  




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on SAN?

2009-02-14 Thread Mika Borner

Andras Spitzer wrote:

Is it worth moving the redundancy from the SAN array layer to the ZFS layer? 
(Configuring redundancy on both layers sounds like a waste to me.) There are 
certain advantages to having redundancy configured on the array (beyond 
protection against simple disk failure). Can we compare the advantages of 
having (for example) RAID5 configured on a high-end SAN with no redundancy at 
the ZFS layer, versus no redundant RAID configuration on the high-end SAN but 
raidz or raidz2 at the ZFS layer?

Any tests, experience or best practices regarding this topic?


  

Would also like to hear about experiences with ZFS on EMC's Symmetrix.

Currently we are using VxFS with Powerpath for multipathing, and 
synchronous SRDF for replication to our other datacenter.


At some point we will move to ZFS, but there are so many options for how 
to implement it.


From a sysadmin point of view (simplicity), I would like to use mpxio 
and host-based mirroring. ZFS self-healing would be available in this 
configuration.
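
A sketch of that layout, with hypothetical device names (one LUN presented
from each datacenter's array, multipathed via mpxio):

# stmsboot -e                                   (enable mpxio multipathing; a reboot is required)
# zpool create dcpool mirror c0t60060480000A0001d0 c0t60060480000B0001d0
    (each half of the mirror is a LUN from a different array/site)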


Asking EMC guys for their opinion is not an option. They will push you 
to buy SRDF and Powerpath licenses... :-)


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS: unreliable for professional usage?

2009-02-14 Thread Bob Friesenhahn

On Fri, 13 Feb 2009, Frank Cusack wrote:


i'm sorry to berate you, as you do make very valuable contributions to
the discussion here, but i take offense at your attempts to limit
discussion simply because you know everything there is to know about
the subject.


The point is that those of us in the chattering class (i.e. people 
like you and me) clearly know very little about the subject, and 
continuing to chatter among ourselves is soon no longer rewarding.


Bob
==
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on SAN?

2009-02-14 Thread Bob Friesenhahn

On Fri, 13 Feb 2009, Andras Spitzer wrote:


So, you're telling me that even if the SAN provides redundancy (HW 
RAID5 or RAID1), people still configure ZFS with either raidz or 
mirror?


When ZFS's redundancy features are used, there is decreased risk of 
total pool failure.  With redundancy at the ZFS level, errors may be 
corrected.  With care in the pool design, more overall performance may 
be obtained since a number of independent arrays may be pooled 
together to obtain more bandwidth and storage space.


With this in mind, if the SAN hardware is known to work very well, 
placing the pool on a single SAN device is still an option.


If you do use ZFS's redundancy features, it is important to consider 
resilver time.  Try to keep volume size small enough that it may be 
resilvered in a reasonable amount of time.
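
For instance, a sketch with placeholder LUN names: building the pool from
several small mirrored pairs, each pair split across two arrays, spreads the
load and keeps the amount of data any single resilver has to copy small:

# zpool create tank mirror c4t0d0 c5t0d0 mirror c4t1d0 c5t1d0 mirror c4t2d0 c5t2d0
# zpool status tank     (after a disk/LUN replacement, shows resilver progress and estimated completion)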


Bob
==
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs destroy hanging

2009-02-14 Thread Blake
I think you can kill the destroy command process using traditional methods.

Perhaps your slowness issue is because the pool is in an older format.
I haven't had these problems since upgrading to the ZFS version that
ships by default with 2008.11.


On Fri, Feb 13, 2009 at 4:14 PM, David Dyer-Bennet  wrote:
> This shouldn't be taking anywhere *near* half an hour.  The snapshots
> differ trivially, by one or two files and less than 10k of data (they're
> test results from working on my backup script).  But so far, it's still
> sitting there after more than half an hour.
>
> local...@fsfs:~/src/bup2# zfs destroy ruin/export
> cannot destroy 'ruin/export': filesystem has children
> use '-r' to destroy the following datasets:
> ruin/export/h...@bup-20090210-202557utc
> ruin/export/h...@20090210-213902utc
> ruin/export/home/local...@first
> ruin/export/home/local...@second
> ruin/export/home/local...@bup-20090210-202557utc
> ruin/export/home/local...@20090210-213902utc
> ruin/export/home/localddb
> ruin/export/home
> local...@fsfs:~/src/bup2# zfs destroy -r ruin/export
>
> It's still hung.
>
> Ah, here's zfs list output from shortly before I started the destroy:
>
> ruin                        474G   440G   431G   /backups/ruin
> ruin/export                35.0M   440G    18K   /backups/ruin/export
> ruin/export/home           35.0M   440G    19K   /export/home
> ruin/export/home/localddb    35M   440G   27.8M  /export/home/localddb
>
> As you can see, the ruin/export/home filesystem (and subs) is NOT large.
>
> iostat shows no activity on pool ruin over a minute.
>
> local...@fsfs:~$ pfexec zpool iostat ruin 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> ruin         474G   454G     10      0  1.13M    840
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
>
> The pool still thinks it is healthy.
>
> local...@fsfs:~$ zpool status -v ruin
>  pool: ruin
>  state: ONLINE
> status: The pool is formatted using an older on-disk format.  The pool can
>still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
>pool will no longer be accessible on older software versions.
>  scrub: scrub completed after 4h42m with 0 errors on Mon Feb  9 19:10:49 2009
> config:
>
>NAMESTATE READ WRITE CKSUM
>ruinONLINE   0 0 0
>  c7t0d0ONLINE   0 0 0
>
> errors: No known data errors
>
> There is still a process out there trying to run that destroy.  It doesn't
> appear to be using much cpu time.
>
> local...@fsfs:~$ ps -ef | grep zfs
> localddb  7291  7228   0 15:10:56 pts/4   0:00 grep zfs
>root  7223  7101   0 14:18:27 pts/3   0:00 zfs destroy -r ruin/export
>
> Running 2008.11.
>
> local...@fsfs:~$ uname -a
> SunOS fsfs 5.11 snv_101b i86pc i386 i86pc Solaris
>
> Any suggestions?  Eventually I'll kill the process by the gentlest way
> that works, I suppose (if it doesn't complete).
> --
> David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
> Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
> Photos: http://dd-b.net/photography/gallery/
> Dragaera: http://dragaera.info
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SSD - slow down with age

2009-02-14 Thread Nicholas Lee
A useful article about long-term use of the Intel SSD X25-M:
http://www.pcper.com/article.php?aid=669 - Long-term performance analysis
of Intel Mainstream SSDs.

Would a ZFS cache (ZIL or L2ARC) based on an SSD device see this kind of
issue? Maybe a periodic "scrub" via a full-disk erase would be a useful
process.
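
For reference, a sketch of how an SSD is usually attached to an existing
pool for those roles (the pool and device names are placeholders):

# zpool add tank log c5t0d0      (dedicated ZIL / slog device)
# zpool add tank cache c5t1d0    (L2ARC read-cache device)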

Nicholas
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SPAM *** Re: [osol-help] Adding a new partition to the system

2009-02-14 Thread Jan Hlodan

Antonio wrote:
I can mount those partitions well using ext2fs, so I assume I won't 
need to run gparted at all.


This is what prtpart says about my stuff.

Kind regards,
Antonio

r...@antonio:~# prtpart /dev/rdsk/c3d0p0 -ldevs

Fdisk information for device /dev/rdsk/c3d0p0

** NOTE **
/dev/dsk/c3d0p0  - Physical device referring to entire physical disk
/dev/dsk/c3d0p1 - p4 - Physical devices referring to the 4 primary 
partitions

/dev/dsk/c3d0p5 ...  - Virtual devices referring to logical partitions

Virtual device names can be used to access EXT2 and NTFS on logical 
partitions


/dev/dsk/c3d0p1    Solaris x86
/dev/dsk/c3d0p2    Solaris x86
/dev/dsk/c3d0p3    Solaris x86
/dev/dsk/c3d0p4    DOS Extended
/dev/dsk/c3d0p5    Linux native
/dev/dsk/c3d0p6    Linux native
/dev/dsk/c3d0p7    Linux native
/dev/dsk/c3d0p8    Linux native
/dev/dsk/c3d0p9    Linux swap
/dev/dsk/c3d0p10   Solaris x86


Hi Antonio,

And what does the 'zpool create' command say?
$ pfexec zpool create test /dev/dsk/c3d0p5
or
$ pfexec zpool create -f test /dev/dsk/c3d0p5

Regards,

jh




Jan Hlodan escribió:

Hi Antonio,

Did you try to recreate this partition, e.g. with GParted?
Maybe something is wrong with it.
Can you also post what prtpart "disk ID" -ldevs says?

Regards,

Jan Hlodan

Antonio wrote:

Hi Jan,

I tried what you suggest a while ago, but zfs fails on pool creation.

That is, when I issue zpool create trunk /dev/dsk/c9d0p3 the command
fails, saying that there's no such file or directory. And the disk is
correct!!

What I think is that /dev/dsk/c9d0p3 is a symbolic name used by 
FSWpart, and not a valid device name for zpool.


Thanks anyway,
Antonio

Jan Hlodan escribió:

Antonio wrote:

Hi all,

First of all let me say that, after a few days using it (and after 
several *years* of using Linux daily), I'm delighted with 
OpenSolaris 2008.11. It's gonna be the OS of my choice.


The fact is that I installed it in a partition of 16Gb in my hard 
disk and that I'd like to add another partition to the system (I 
have different partitions with Linux and Windows and some others).


So the questions are:

1.- How do I add an existing partition to OpenSolaris? (Should I 
change the partition type or something? Shall I "grow" ZFS or 
shall I mount the extra partition somewhere else?)


  

Yes, you can create a new zpool from your free/spare partition.
I had the same problem: I wanted to use a Linux partition as a mirror.
Here is how to do it, following this blog -
http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in
* install FSWpart and FSWfsmisc
* run prtpart to find out your disk ID
* list the partition IDs: prtpart "disk ID" -ldevs
* create a zpool from the Linux partition, e.g. zpool create trunk 
/dev/dsk/c9d0p3

* check it: zpool list or zpool status


2.- Would you please recommend a good introduction to 
Solaris/OpenSolaris? I'm used to Linux and I'd like to get up to 
speed with OpenSolaris.
  

sure, OpenSolaris Bible :)
http://blogs.sun.com/observatory/entry/two_more_chapters_from_the

Hope this helps,

Regards,

Jan Hlodan


Thanks in advance,
Antonio
  






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs destroy hanging

2009-02-14 Thread David Dyer-Bennet

On Sat, February 14, 2009 13:04, Blake wrote:
> I think you can kill the destroy command process using traditional
> methods.

kill and kill -9 failed.  In fact, rebooting failed; I had to use a hard
reset (it shut down most of the way, but then got stuck).

> Perhaps your slowness issue is because the pool is an older format.
> I've not had these problems since upgrading to the zfs version that
> comes default with 2008.11

We can hope.  In case that's the cause, I upgraded the pool format (after
considering whether I'd be needing to access it with older software; hope
I was right :-)).
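
For the record, a minimal sketch of that step (using the pool name from the
earlier post):

$ zpool upgrade                 (lists pools still on an older on-disk version)
$ pfexec zpool upgrade ruin     (upgrades the pool to the current version; this cannot be undone)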

The pool did import and scrub cleanly, anyway.  That's hopeful.  Also this
particular pool is a scratch pool at the moment, so I'm not risking losing
data, only risking losing confidence in ZFS.  It's also a USB external
disk.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs destroy hanging

2009-02-14 Thread James C. McPherson
On Sat, 14 Feb 2009 15:40:04 -0600 (CST)
David Dyer-Bennet  wrote:

> 
> On Sat, February 14, 2009 13:04, Blake wrote:
> > I think you can kill the destroy command process using traditional
> > methods.
> 
> kill and kill -9 failed.  In fact, rebooting failed; I had to use a
> hard reset (it shut down most of the way, but then got stuck).
> 
> > Perhaps your slowness issue is because the pool is an older format.
> > I've not had these problems since upgrading to the zfs version that
> > comes default with 2008.11
> 
> We can hope.  In case that's the cause, I upgraded the pool format
> (after considering whether I'd be needing to access it with older
> software; hope I was right :-)).
> 
> The pool did import and scrub cleanly, anyway.  That's hopeful.  Also
> this particular pool is a scratch pool at the moment, so I'm not
> risking losing data, only risking losing confidence in ZFS.  It's
> also a USB external disk.

Hi David,
if this happens to you again, you could help get more
data on the problem by getting a crash dump: via savecore
(if you have a dedicated dump device), via reboot, or by
forcing a panic from mdb:

(dedicated dump device)
# savecore -L /var/crash/`uname -n`

or

# reboot -dq


(forced, 64bit mode)
# echo "0>rip"|mdb -kw

(forced, 32bit mode)
# echo "0>eip"|mdb -kw


Try the command-line options first; only use the mdb
kick in the guts if the other two fail.

Once you've got the core, you could post the output of

::status
$C

when run over the core with mdb -k.
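
For example, assuming savecore wrote unix.0 and vmcore.0 into the usual
crash directory:

# cd /var/crash/`uname -n`
# mdb -k unix.0 vmcore.0
  ::status
  $C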



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS: unreliable for professional usage?

2009-02-14 Thread Ross Smith
Hey guys,

I'll let this die in a sec, but I just wanted to say that I've gone
and read the on disk document again this morning, and to be honest
Richard, without the description you just wrote, I really wouldn't
have known that uberblocks are in a 128 entry circular queue that's 4x
redundant.

Please understand that I'm not asking for answers to these notes, this
post is purely to illustrate to you ZFS guys that much as I appreciate
having the ZFS docs available, they are very tough going for anybody
who isn't a ZFS developer.  I consider myself well above average in IT
ability, and I've really spent quite a lot of time in the past year
reading around ZFS, but even so I would definitely have come to the
wrong conclusion regarding uberblocks.

Richard's post I can understand really easily, but in the on-disk
format docs, that information is spread over 7 pages of really quite
technical detail, and to be honest, for a user like myself it raises
as many questions as it answers:

On page 6 I learn that labels are stored on each vdev, as well as each
disk.  So there will be a label on the pool, mirror (or raid group),
and disk.  I know the disk ones are at the start and end of the disk,
and it sounds like the mirror vdev is in the same place, but where is
the root vdev label?  The example given doesn't mention its location
at all.

Then, on page 7 it sounds like the entire label is overwriten whenever
on-disk data is updated - "any time on-disk data is overwritten, there
is potential for error".  To me, it sounds like it's not a 128 entry
queue, but just a group of 4 labels, all of which are overwritten as
data goes to disk.

Then finally, on page 12 the uberblock is mentioned (although as an
aside, the first time I read these docs I had no idea what the
uberblock actually was).  It does say that only one uberblock is
active at a time, but with it being part of the label I'd just assume
these were overwritten as a group..

And that's why I'll often throw ideas out - I can either rely on my
own limited knowledge of ZFS to say if it will work, or I can take
advantage of the excellent community we have here, and post the idea
for all to see.  It's a quick way for good ideas to be improved upon,
and bad ideas consigned to the bin.  I've done it before in my rather
lengthy 'zfs availability' thread.  My thoughts there were thrashed
out nicely, with some quite superb additions (namely the concept of
lop-sided mirrors, which I think are a great idea).

Ross

PS.  I've also found why I thought you had to search for these blocks:
it was after reading this thread, where somebody used mdb to search a
corrupt pool to try to recover data:
http://opensolaris.org/jive/message.jspa?messageID=318009
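
As an aside, the label and uberblock structures can also be inspected with
zdb (a sketch; the device and pool names are placeholders, and exact output
varies by build):

# zdb -l /dev/dsk/c7t0d0s0     (prints the vdev label copies stored on that device)
# zdb -uuu tank                (prints the active uberblock for the pool)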







On Fri, Feb 13, 2009 at 11:09 PM, Richard Elling
 wrote:
> Tim wrote:
>>
>>
>> On Fri, Feb 13, 2009 at 4:21 PM, Bob Friesenhahn wrote:
>>
>>On Fri, 13 Feb 2009, Ross Smith wrote:
>>
>>However, I've just had another idea.  Since the uberblocks are
>>pretty
>>vital in recovering a pool, and I believe it's a fair bit of
>>work to
>>search the disk to find them.  Might it be a good idea to
>>allow ZFS to
>>store uberblock locations elsewhere for recovery purposes?
>>
>>
>>Perhaps it is best to leave decisions on these issues to the ZFS
>>designers who know how things work.
>>
>>Previous descriptions from people who do know how things work
>>didn't make it sound very difficult to find the last 20
>>uberblocks.  It sounded like they were at known points for any
>>given pool.
>>
>>Those folks have surely tired of this discussion by now and are
>>working on actual code rather than reading idle discussion between
>>several people who don't know the details of how things work.
>>
>>
>>
>> People who "don't know how things work" often aren't tied down by the
>> baggage of knowing how things work.  Which leads to creative solutions those
>> who are weighed down didn't think of.  I don't think it hurts in the least
>> to throw out some ideas.  If they aren't valid, it's not hard to ignore them
>> and move on.  It surely isn't a waste of anyone's time to spend 5 minutes
>> reading a response and weighing if the idea is valid or not.
>
> OTOH, anyone who followed this discussion the last few times, has looked
> at the on-disk format documents, or reviewed the source code would know
> that the uberblocks are kept in a 128-entry circular queue which is 4x
> redundant, with 2 copies each at the beginning and end of the vdev.
> Other metadata, by default, is 2x redundant and spatially diverse.
>
> Clearly, the failure mode being hashed out here has resulted in the defeat
> of those protections. The only real question is how fast Jeff can roll out
> the feature to allow reverting to previous uberblocks.  The procedure for
> doing this by hand has long been known, and was posted on this forum --
> though it is te

Re: [zfs-discuss] ZFS on SAN?

2009-02-14 Thread Miles Nordin
> "as" == Andras Spitzer  writes:

as> So, you're telling me that even if the SAN provides redundancy
as> (HW RAID5 or RAID1), people still configure ZFS with either
as> raidz or mirror?

There's some experience that, in the case where the storage device or
the FC mesh glitches or reboots while the ZFS host stays up across the
reboot, you are less likely to lose the whole pool to ``ZFS-8000-72
The pool metadata is corrupted and cannot be opened. Destroy the pool
and restore from backup.'' if you have ZFS-level redundancy than if
you don't.

Note that this ``corrupt and cannot be opened'' is a different problem
from ``not being able to self-heal.''  When you need self-healing and
don't have it, you usually shouldn't lose the whole pool.  You should
get a message in 'zpool status' telling you the name of a file that
has unrecoverable errors.  Any attempt to read the file returns an I/O
error (not the marginal data).  Then you have to go delete that file
to clear the error, but otherwise the pool keeps working.  In this
self-heal case, if you'd had the ZFS-layer redundancy you'd get a
count in the checksum column of one device and wouldn't have to delete
the file, in fact you wouldn't even know the name of the file that got
healed.
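
A sketch of the non-redundant case described above (the pool and file names
are placeholders):

# zpool status -v tank        (lists files with unrecoverable errors when ZFS has no redundant copy)
# rm /tank/data/damaged.file  (delete the affected file, as described above)
# zpool clear tank            (reset the pool's error counters afterwards)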

some people have been trying to blame the ``corrupt and cannot be
opened'' on bit-flips supposedly happening inside the storage or the
FC cloud, the same kind of bit flip that causes the other
self-healable problem, but I don't buy it.  I think it's probably
cache sync / write barrier problems that are killing the unredundant
pools on SANs.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss