Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-31 Thread Marcelo Leal
Thanks a lot Sanjeev!
 If you look at my first message, you will see that discrepancy in zdb...

 Leal.
[http://www.eall.com.br/blog]


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-31 Thread Sanjeev
Marcelo,

On Wed, Dec 31, 2008 at 02:17:37AM -0800, Marcelo Leal wrote:
> Thanks a lot Sanjeev!
>  If you look at my first message, you will see that discrepancy in zdb...

Apologies. In hindsight, I understand why you gave the zdb details :-(
I should have read the mail more carefully.

Thanks and regards,
Sanjeev.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Marc Bevand
Mattias Pantzare writes:
> 
> He was talking about errors that the disk can't detect (errors
> introduced by other parts of the system, writes to the wrong sector, or
> very bad luck). You can simulate that by writing different data to the
> sector,

Well, yes, you can. Carsten and I are both talking about silent data corruption 
errors, and the way to simulate them is to do what Carsten did. However, I 
pointed out that he may have tested only the easy corruption cases (affecting the 
P or Q parity only) -- it is tricky to simulate hard-to-recover corruption 
errors...

-marc



Re: [zfs-discuss] read/write errors on storage pool (poss. ahci/hw related?)

2008-12-31 Thread Jay
hi richard,

the bugs database ... figures ... now that you've said it, it's really 
quite obvious :)

thanks, and thanks for the hint towards the drivers-discuss forum.

bye,
jay


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Orvar Korvar
I've studied all the links here, but I want information about the HW RAID 
controllers, not about ZFS, because I have plenty of ZFS information now. The 
closest thing I found was
www.baarf.org
where one article states that "RAID5 never does a parity check on reads". 
I've written that to the Linux guys. It also says that "RAID6 guesses when it 
tries to repair some errors, with a chance of corrupting more". Those are hard facts. 

Any more?


Re: [zfs-discuss] Redundancy for /opt by raidz?

2008-12-31 Thread Vincent Fox
>Dear Admin,
>You said:
>choose a 3-disk RAID-1 or a 4-disk RAID10
>set for tank over RAIDZ. With a 3-disk mirror you'd have a disk left over
>to be a hot spare for a failure in either rpool or tank.
>Why do you prefer RAID-1 with one spare rather than RAIDZ?
>I heard RAIDZ has better redundancy than RAID 1 or 5?!
>Doesn't RAIDZ keep working when one of my disks crashes?
>Regards

A RAIDZ set would allow one disk failure before it
becomes non-redundant; if a second disk then fails, you
would lose all data.  RAIDZ is similar to RAID5 in that
it only takes a double-disk failure to lose everything.
Of course, you could use RAIDZ2, which is akin to RAID6.

With a 3-disk mirror set, you would still have your data
even if two disks failed.

You need to read more on RAID terminology.  Twice in
my career I have had two disks fail in a RAID5 set before
a hot spare could be added, and all data was lost.

Perfect paranoia is perfect awareness.


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Richard Elling
Orvar Korvar wrote:
> I've studied all the links here, but I want information about the HW RAID 
> controllers, not about ZFS, because I have plenty of ZFS information now. The 
> closest thing I found was
> www.baarf.org
>   

[one of my favorite sites ;-)]
The problem is that there is no such thing as "hardware RAID"; there is
only "software RAID."  The "HW RAID" controllers are processors
running software, and the features of the product are therefore limited by
the software developer and the processor's capabilities.  It goes without saying
that the processors are very limited, compared to the main system CPU
found on modern machines.  It also goes without saying that the software
(or firmware, if you prefer) is closed.  Good luck cracking that nut.

> where one article states that "RAID5 never does a parity check on reads". 
> I've written that to the Linux guys. It also says that "RAID6 guesses when it 
> tries to repair some errors, with a chance of corrupting more". Those are hard facts. 
>   

The high-end RAID arrays have better, more expensive processors and
a larger feature set. Some even add block-level checksumming, which has
led to some fascinating studies on field failures.  But I think it is 
safe to
assume that those features will not exist on the low-end systems for some
time.
 -- richard



Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Dave Brown
There is a company (DataCore Software) that has been making and shipping 
products for many years that I believe would help in this area.  I've 
used them before; they're very solid, and they have been leveraging 
commodity server and disk hardware to build massive storage arrays (FC & 
iSCSI), one of the same things ZFS is working to do.  I looked at some 
of their documentation on this topic, and this is what I found:

CRC/Checksum Error Detection
In SANmelody and SANsymphony, enhanced error detection can be provided 
by enabling Cyclic Redundancy Check (CRC), a form of sophisticated 
redundancy check. When CRC/Checksum is enabled, the iSCSI driver adds a 
bit scheme to the iSCSI packet when it is transmitted. The iSCSI driver 
then verifies the bits in the packet when it is received to ensure data 
integrity. This error detection method provides a low probability of 
undetected errors compared to standard error checking performed by 
TCP/IP. The CRC bits may be added to either Data Digest, Header Digest, 
or both.
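
The digest mechanism is easy to sketch. Below is a rough, self-contained
illustration, not DataCore's code: iSCSI header and data digests are CRC32C, so
the example computes a bit-by-bit CRC32C over a payload, appends it on the
sending side, and re-checks it on the receiving side. The helper names and the
4-byte trailer layout are made up for this sketch, and the real protocol's word
padding and digest negotiation are ignored.

import struct

def crc32c(data: bytes, crc: int = 0) -> int:
    # Bitwise CRC-32C (Castagnoli), the polynomial iSCSI digests use.
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def add_digest(payload: bytes) -> bytes:
    # Sender side: append the 4-byte digest to the data segment
    # (illustrative layout only, not the real iSCSI PDU format).
    return payload + struct.pack("<I", crc32c(payload))

def check_digest(packet: bytes) -> bool:
    # Receiver side: recompute and compare before trusting the data.
    payload, digest = packet[:-4], struct.unpack("<I", packet[-4:])[0]
    return crc32c(payload) == digest

assert crc32c(b"123456789") == 0xE3069283    # standard CRC-32C check value
pkt = add_digest(b"some SCSI data segment")
assert check_digest(pkt)                     # clean transfer passes
bad = bytes([pkt[0] ^ 0x01]) + pkt[1:]       # flip one bit in flight...
assert not check_digest(bad)                 # ...and the digest catches it

As the rest of the thread points out, this protects the data while it moves
between initiator and target; it says nothing about what the disk hands back
later.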

DataCore has been really good at implementing all the features of the 
'high end' arrays for the 'low end' price point.

Dave


Richard Elling wrote:
> The problem is that there is no such thing as "hardware RAID"; there is
> only "software RAID."  [...]


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Miles Nordin
> "ca" == Carsten Aulbert  writes:
> "ok" == Orvar Korvar  writes:

ca> (using hdparm's utility makebadsector)

I haven't used that before, but it sounds like what you did may give
the RAID layer some extra information.  If one of the disks reports
``read error---I have no idea what's stored in that sector,'' then
RAID5/6 knows which disk is wrong because the disk confessed.  If all
the disks successfully return data, but one returns the wrong data,
RAID5/6 has to determine the wrong disk by math, not by device driver
error returns.
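
To make "by math" concrete, here is a toy sketch of the usual RAID-6
construction: P is byte-wise XOR and Q is a weighted sum over GF(2^8) with
generator 2.  This is nobody's firmware, and the function names, chunk sizes,
and stripe layout are invented for the illustration, but it shows how the two
parities together can locate a single silently-wrong chunk, while P alone only
tells you that something is inconsistent.

# Toy RAID-6 parity sketch (illustration only, not any controller's code).
# Build log/antilog tables for GF(2^8) with polynomial 0x11d, generator 2.
GF_EXP = [0] * 512
GF_LOG = [0] * 256
x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def parity(chunks):
    # Return (P, Q) for a stripe of equal-length data chunks.
    p, q = bytearray(len(chunks[0])), bytearray(len(chunks[0]))
    for idx, chunk in enumerate(chunks):
        coeff = GF_EXP[idx]                      # g**idx
        for j, byte in enumerate(chunk):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)

def locate_and_repair(chunks, p, q):
    # Assumes at most one data chunk is silently wrong (P and Q are good).
    rp, rq = parity(chunks)
    sp = bytes(a ^ b for a, b in zip(p, rp))     # P syndrome
    sq = bytes(a ^ b for a, b in zip(q, rq))     # Q syndrome
    if not any(sp):
        return None                              # nothing to fix here
    # For a single bad chunk z, sq[j] = g**z * sp[j] at every changed byte.
    j = next(i for i, b in enumerate(sp) if b)
    z = (GF_LOG[sq[j]] - GF_LOG[sp[j]]) % 255
    return z, bytes(a ^ b for a, b in zip(chunks[z], sp))

stripe = [b"disk0 data..", b"disk1 data..", b"disk2 data..", b"disk3 data.."]
p, q = parity(stripe)
stripe[2] = b"silently bad"                      # no read error is reported
print(locate_and_repair(stripe, p, q))           # -> (2, b'disk2 data..')

A real array would do this per 512-byte (or larger) chunk across the whole
stripe, and probably only during a scrub-style pass rather than on every read.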

I don't think RAID6 reads whole stripes, so even if the dual parity
has some theoretical/implemented ability to heal single-disk silent
corruption, it'd do this healing only during some scrub-like
procedure, not during normal read.  The benefit is better seek
bandwidth than raidz.  If the corruption is not silent (the disk
returns an error) then it could use the hypothetical magical
single-disk healing ability during normal read too.

ca> powered it up and ran a volume check and the controller did
ca> indeed find the corrupted sector

sooo... (1) make your corrupt sector with dd rather than hdparm, like
dd if=/dev/zero of=/dev/disk bs=512 count=1 seek=12345 conv=notrunc,
and (2) check for the corrupt sector by reading the disk
normally---either make sure the corrupt sector is inside a checksummed
file like a tar or gz and use tar t or gzip -t, or use dd
if=/dev/raidvol | md5sum before and after corrupting, something like
that, NOT a ``volume check''.  Make both changes 1 and 2, and I think the
corruption will get through.  Make only the first change but not the
second, and you can look for this hypothetical math-based healing
ability you're saying RAID6 has from having more parity than it needs
for the situation.
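
If it helps, here is a tiny stand-in for the dd | md5sum step; it just streams
the raw volume and prints a SHA-256.  It is only a sketch, the script name is
made up, and the device path you pass it is whatever your RAID volume appears
as on your system.

import hashlib
import sys

def device_digest(path, chunk_size=1 << 20):
    # Stream the whole raw volume and hash it, like dd if=... | sha256sum.
    digest = hashlib.sha256()
    with open(path, "rb") as dev:
        while True:
            block = dev.read(chunk_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()

if __name__ == "__main__":
    # Run once before corrupting a member disk and once after the array has
    # had its chance to "recover"; differing output means bad data got through.
    print(device_digest(sys.argv[1]))

Something like "python voldigest.py /dev/dsk/c2t0d0s2" (script name and device
are just examples) before and after the overwrite: if the two digests match,
the RAID layer healed or hid the corruption, and if they differ, it got through.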

ok> "upon writing data to disc, ZFS reads it back and compares to
ok> the data in RAM and corrects it otherwise".

I don't think it does read-after-write.  That'd be really slow.

The thing I don't like about the checksums is that they trigger for
things other than bad disks, like if your machine loses power during a
resilver, or other corner cases and bugs.  I think the Netapp
block-level RAID-layer checksums don't trigger for as many other
reasons as the ZFS filesystem-level checksums, so chasing problems is
easier.

The good thing is that they are probably helping survive the corner
cases and bugs, too.




Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Miles Nordin
> "db" == Dave Brown  writes:

db> CRC/Checksum Error Detection In SANmelody and SANsymphony,
db> enhanced error detection can be provided by enabling Cyclic
db> Redundancy Check (CRC) [...] The CRC bits may
db> be added to either Data Digest, Header Digest, or both.

Thanks for the plug, but that sounds like an iSCSI feature, between
storage controller and client, not between storage controller and
disk.  It sounds suspiciously like they're advertising something many
vendors do without bragging, but I'm not sure.  Anyway we're talking
about something different: writing to the disk in checksummed packets,
so the storage controller can tell when the disk has silently returned
bad data or another system has written to part of the disk, stuff like
that---checksums to protect data as time passes, not as it travels
through space.




Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread Tim
On Wed, Dec 31, 2008 at 12:58 PM, Miles Nordin  wrote:

> > "db" == Dave Brown  writes:
>
>db> CRC/Checksum Error Detection In SANmelody and SANsymphony,
>db> enhanced error detection can be provided by enabling Cyclic
>db> Redundancy Check (CRC) [...] The CRC bits may
>db> be added to either Data Digest, Header Digest, or both.
>
> Thanks for the plug, but that sounds like an iSCSI feature, between
> storage controller and client, not between storage controller and
> disk.  It sounds suspiciously like they're advertising something many
> vendors do without bragging, but I'm not sure.  Anyway we're talking
> about something different: writing to the disk in checksummed packets,
> so the storage controller can tell when the disk has silently returned
> bad data or another system has written to part of the disk, stuff like
> that---checksums to protect data as time passes, not as it travels
> through space.
>

The CRC checking is at least standard on QLogic hardware HBAs.  I would
imagine most vendors have it in their software stacks as well, since it's
part of the iSCSI standard.  It was more of a corner case for iSCSI to try
to say "look, I'm as good as Fibre Channel" than anything else (IMO).
Although that opinion may very well be inaccurate :)


--Tim


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread JZ
"The problem is that there is no such thing as "hardware RAID" there is
only "software RAID."  The "HW RAID" controllers are processors
running software and the features of the product are therefore limited by
the software developer and processor capabilities.  I goes without saying
that the processors are very limited, compared to the main system CPU
found on modern machines.  It also goes without saying that the software
(or firmware, if you prefer) is closed.  Good luck cracking that nut." --  
Richard

Yes, thx!
And beyond that, there are HW RAID adapters and HW RAID chips embedded into 
disk enclosures; they are all HW RAID ASICs running closed software, not very 
Open Storage.   ;-)

best,
z





Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2008-12-31 Thread JZ
Happy new year!
It's snowing here and my New Year's party was cancelled. OK, let me do more 
boring IT stuff then.

Orvar, sorry I misunderstood you.
Please feel free to explore the limitations of hardware RAID, and hopefully one 
day you will come to the conclusion that it was invented to offload disk 
management and save CPU cycles for the applications, and that fundamental 
driver is weakening day by day.  

NetApp argued that with today's CPU power and server technologies, software 
RAID can be as efficient, or even better, if it is done right. And DataCore went 
beyond NetApp by delivering software to customers instead of an 
integrated platform...

Anyway, if you are still into checking out HW RAID capabilities, I would 
suggest doing that in a categorized fashion. As you can see, there are many 
RAID cards at very different price points, so it is not fair to make a 
statement that covers all of them (and I could go to China tomorrow and burn any 
firmware into a RAID ASIC to challenge such a statement...). Hence your request 
was a bit too difficult; if you tell the list which HW RAID adapter you are 
focusing on, I am sure the list will knock that one off in no time.   ;-)
http://www.ciao.com/Sun_StorageTek_SAS_RAID_Host_Bus_Adapter__15537063

Best,
z, bored

  - Original Message - 
  From: Tim 
  To: Miles Nordin 
  Cc: zfs-discuss@opensolaris.org 
  Sent: Wednesday, December 31, 2008 3:20 PM
  Subject: Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?