3ware cards [Was: Re: A little story of failed raid5 (3ware 8000 series)]

2007-08-28 Thread Darren Pilgrim
Gary Palmer wrote: Darren Pilgrim wrote: Tom Judge wrote: If you use the 3dm2 management interface you can schedule verify and rebuild tasks to run on a regular basis. I think that 7500 series controllers can do this, 9500 and 9550's definitely can. >> Actually it's all 7/8/9xxx series cards.

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-28 Thread Gary Palmer
On Sun, Aug 26, 2007 at 11:38:19AM -0700, Darren Pilgrim wrote: > Tom Judge wrote: > >Tom Samplonius wrote: > >>The real solution is RAID scrubbing: a low level background process > >>that reads every sector of every disk. All of the real RAID systems > >>do this (usually scheduled weekly, or eve

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-26 Thread Darren Pilgrim
Tom Judge wrote: Tom Samplonius wrote: The real solution is RAID scrubbing: a low level background process that reads every sector of every disk. All of the real RAID systems do this (usually scheduled weekly, or every other week). Most 3ware RAID cards don't have this feature. So rather than

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Tom Samplonius
- "David Schwartz" <[EMAIL PROTECTED]> wrote: > > It is supposed to be > > for detecting data corruption, so if the card isn't using the > > checksum, it's kind of useless. > > You are confused. Checking for data corruption is done by checking if > the *DATA* is corrupt. This does not req

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Manjunath R Gowda
On 8/25/07, Tom Samplonius <[EMAIL PROTECTED]> wrote: > > > - "Artem Kuchin" <[EMAIL PROTECTED]> wrote: > ... > > But I don't understand how and why it happened. Only 6 hours ago (the > > night before) > > all those files were backed up fine w/o any read error. And now, right > > after replacing

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Andrei Kolu
On Friday 24 August 2007 23:04:37, Matthew Dillon wrote: >A friend of mine once told me that the only worthwhile RAID systems are >the ones that email you a detailed message when something goes south. > > -Matt

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Tom Judge
Tom Samplonius wrote: - "Artem Kuchin" <[EMAIL PROTECTED]> wrote: ... But I don't understand how and why it happened. Only 6 hours ago (the night before) all those files were backed up fine w/o any read error. And now, right after replacing the drive and starting the rebuild it said that there a

RE: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread David Schwartz
> This isn't really accurate. First of all, if the RAID > controller isn't confirming checksums before giving the data to > the OS, what is the checksum for exactly? The checksum is used to recover the data in the event one piece of the data is lost. With all of the data but one piece, and
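The recovery property described in this message can be illustrated with a minimal sketch. It assumes a simplified RAID-5 stripe in which the parity block is the byte-wise XOR of the data blocks; the function names are hypothetical and are not taken from any 3ware firmware or driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity for one stripe: XOR of all data blocks (simplified RAID-5 model)."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical three-disk data stripe; the second block is "lost".
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)
recovered = recover_missing([stripe[0], stripe[2]], parity)
print(recovered)
```

Note that the parity only lets you rebuild a block that is known to be missing. On its own it does not say which block is wrong when all blocks can still be read.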

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Tom Samplonius
- "Martin Nilsson" <[EMAIL PROTECTED]> wrote: > That is what patrol read is intended to detect before it is a problem. > > In a RAID5 array the checksums are only used when reconstructing data, > > if you have a bad block in a checksum sector it will not be detected > until a drive has fa

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Tom Samplonius
- "Artem Kuchin" <[EMAIL PROTECTED]> wrote: ... > But I don't understand how and why it happened. Only 6 hours ago (the > night before) > all those files were backed up fine w/o any read error. And now, right > after replacing > the drive and starting the rebuild it said that there are bad sectors

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Matthew Dillon
A friend of mine once told me that the only worthwhile RAID systems are the ones that email you a detailed message when something goes south. -Matt

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Clayton Milos
On August 24, 2007 02:31 am Clayton Milos wrote: > On Tue, 21 Aug 2007 08:57:22 +0400 > > "Artem Kuchin" <[EMAIL PROTECTED]> wrote: >> Um.. it is because I did not have a map of hot swap baskets to >> controller ports and I needed to check every drive basket to >> understand which port it sits on

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Elias Hartvigson
I've only just moved to Göteborg, so I don't really know where Linnéstaden is; I live at Guldheden myself, a stone's throw from Wavrinskys plats. The morning suits me best too, as I'm busy in the afternoon :) On 8/24/07, Scott Long <[EMAIL PROTECTED]> wrote: > > Feargal Reilly wrote: > > On Tue, 2

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Scott Long
Feargal Reilly wrote: On Tue, 21 Aug 2007 08:57:22 +0400 "Artem Kuchin" <[EMAIL PROTECTED]> wrote: Um.. it is because I did not have a map of hot swap baskets to controller ports and I needed to check every drive basket to understand which port it sits on. I have no choice, I think. I'm just

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Freddie Cash
On August 24, 2007 02:31 am Clayton Milos wrote: > > On Tue, 21 Aug 2007 08:57:22 +0400 > > > > "Artem Kuchin" <[EMAIL PROTECTED]> wrote: > >> Um.. it is because I did not have a map of hot swap baskets to > >> controller ports and I needed to check every drive basket to > >> understand which port

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Clayton Milos
On Tue, 21 Aug 2007 08:57:22 +0400 "Artem Kuchin" <[EMAIL PROTECTED]> wrote: Um.. it is because I did not have a map of hot swap baskets to controller ports and I needed to check every drive basket to understand which port it sits on. I have no choice, I think. I'm just going to highlight

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Clayton Milos
On Tue, 21 Aug 2007 08:57:22 +0400 "Artem Kuchin" <[EMAIL PROTECTED]> wrote: Um.. it is because I did not have a map of hot swap baskets to controller ports and I needed to check every drive basket to understand which port it sits on. I have no choice, I think. I'm just going to highlight

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-24 Thread Feargal Reilly
On Tue, 21 Aug 2007 08:57:22 +0400 "Artem Kuchin" <[EMAIL PROTECTED]> wrote: > Um.. it is because I did not have a map of hot swap baskets to > controller ports and I needed to check every drive basket to > understand which port it sits on. I have no choice, I think. > I'm just going to highligh

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-21 Thread Darren Pilgrim
Artem Kuchin wrote: Darren Pilgrim wrote: Artem Kuchin wrote: That is exactly what I was talking about. I don't have access to the individual disks behind the RAID unit, so I cannot do it. I don't know if the controller VERIFY command does it right. If it does, then I should put it into a cron job and do it on we

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-21 Thread Daniel O'Connor
On Tue, 21 Aug 2007, Artem Kuchin wrote: > Now, what I don't understand is why Hardware_ECC_Recovered and > Seek_Error_Rate are so high. The first one is maybe related > to a cabling problem. The drives are all in hot swap baskets of a > Supermicro 2U case. Maybe the backplane is not so good? > >

RE: A little story of failed raid5 (3ware 8000 series)

2007-08-21 Thread David Schwartz
> While we are on the subject: > > What is the practical difference between VERIFY and REBUILD with regards > to a RAID-5 array? Verify should at a minimum read all the data. Ideally, it would read the checksum blocks too to make sure they are still valid, but it might not. Rebuild should read a
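To make the verify side of this distinction concrete, here is a minimal sketch of what a parity-checking verify pass might do. It assumes the same simplified model as plain XOR parity per stripe; the function name and the in-memory stripe layout are hypothetical, and a real controller also relies on read errors reported by the drives themselves.

```python
def verify_stripes(stripes):
    """Return the indexes of stripes whose stored parity does not match the data.

    Each stripe is a (data_blocks, parity) pair; parity is assumed to be the
    byte-wise XOR of the data blocks (simplified RAID-5 model).
    """
    inconsistent = []
    for index, (data_blocks, parity) in enumerate(stripes):
        expected = bytes(len(parity))  # start from an all-zero block
        for block in data_blocks:
            expected = bytes(x ^ y for x, y in zip(expected, block))
        if expected != parity:
            inconsistent.append(index)
    return inconsistent


# Hypothetical array: stripe 0 is consistent, stripe 1 has stale parity.
array = [
    ([b"\x01\x02", b"\x03\x04"], b"\x02\x06"),
    ([b"\x0f\x0f", b"\xf0\xf0"], b"\x00\x00"),
]
print(verify_stripes(array))
```

A mismatch found this way only says the stripe is inconsistent, not which block is at fault. A rebuild, by contrast, rewrites one member from the others and so has to read every remaining block successfully.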

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-21 Thread Artem Kuchin
You can run smartmontools on disks behind 3ware controllers, eg /dev/twe0 -d 3ware,0 -a -o on -S on -m [EMAIL PROTECTED] /dev/twe0 -d 3ware,1 -a -o on -S on -m [EMAIL PROTECTED] I did this: smartctl /dev/twe0 -d 3ware,1 -a for each drive on another server. Two drives are pretty old, the drive
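The per-port invocation quoted above can be generated for every port instead of typed by hand. This is a minimal sketch: the device node /dev/twe0 and the `-d 3ware,N` form come from the messages in this thread, while the port count of 4 and the function name are assumptions for illustration. The sketch only builds and prints the command lines; it does not run smartctl.

```python
def smartctl_commands(device="/dev/twe0", ports=range(4)):
    """Build one 'smartctl -a' argument list per 3ware port.

    The default of four ports is an assumption; use the number of ports
    actually present on the controller.
    """
    return [["smartctl", "-a", "-d", f"3ware,{port}", device] for port in ports]


for argv in smartctl_commands():
    # Print each command so it can be reviewed or pasted into a shell.
    print(" ".join(argv))
```

To execute the commands rather than print them, each list could be passed to `subprocess.run` on a host where smartmontools is installed and the caller has permission to open the controller device.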

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-21 Thread Artem Kuchin
Darren Pilgrim wrote: Artem Kuchin wrote: That is exactly what I was talking about. I don't have access to the individual disks behind the RAID unit, so I cannot do it. I don't know if the controller VERIFY command does it right. If it does, then I should put it into a cron job and do it on a weekly basis. Also, it

RE: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Daniel Eriksson
While we are on the subject: What is the practical difference between VERIFY and REBUILD with regards to a RAID-5 array? My Highpoint RocketRAID 2320 and 2340 cards can be scheduled to perform either verify or rebuild. I currently have them set to verify the arrays weekly. Is that reasonably ofte

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Rob MacGregor
Artem Kuchin unleashed the infinite monkeys on 20/08/2007 23:38 producing: <---SNIP---> > But I don't understand how and why it happened. Only 6 hours ago (the > night before) > all those files were backed up fine w/o any read error. And now, right > after replacing > the drive and starting the rebuild

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Darren Pilgrim
Artem Kuchin wrote: That is exactly what I was talking about. I don't have access to the individual disks behind the RAID unit, so I cannot do it. I don't know if the controller VERIFY command does it right. If it does, then I should put it into a cron job and do it on a weekly basis. Also, it would be helpful if I coul

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Daniel O'Connor
On Tue, 21 Aug 2007, Artem Kuchin wrote: > could get access to the number of reserved sectors left for remapping. Any > idea about these two for 3ware controllers? Also, someone should > mention that when using RAID you MUST run verifies often. You can run smartmontools on disks behind 3ware controllers,

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Artem Kuchin
Martin Nilsson wrote: Artem Kuchin wrote: But I don't understand how and why it happened. Only 6 hours ago (the night before) all those files were backed up fine w/o any read error. And now, right after replacing the drive and starting the rebuild it said that there are bad sectors all over those fil

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Martin Nilsson
Artem Kuchin wrote: But I don't understand how and why it happened. Only 6 hours ago (the night before) all those files were backed up fine w/o any read error. And now, right after replacing the drive and starting the rebuild it said that there are bad sectors all over those files. How come? That

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Artem Kuchin
David Schwartz wrote: A day ago at 11 am I turned off the server, pulled out the old drive, installed a new one, turned on the server and started the rebuild an hour later from a remote location via the web interface. After about 5 minutes the machine became unresponsive. Tried rebooting - nothing. I went t

RE: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread David Schwartz
> A day ago at 11 am I turned off the server, > pulled out the old drive, installed a new one, turned on the server > and started the rebuild an hour later from a remote location via the web interface. > After about 5 minutes the machine became unresponsive. Tried rebooting > - nothing. I went to the machine

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Jim Pingle
Artem Kuchin wrote: > A day ago at 11 am I turned off the server, > pulled out the old drive, installed a new one, turned on the server > and started the rebuild an hour later from a remote location via the web interface. > After about 5 minutes the machine became unresponsive. Tried rebooting > - nothing. I

Re: A little story of failed raid5 (3ware 8000 series)

2007-08-20 Thread Scott Long
Artem Kuchin wrote: So, no raid5 or even raid 6 for me any more. Never! A better policy is to invest in a higher quality RAID controller. Also, always use a battery backup on the controller, and always have an extra disk configured as a hot spare. Data integrity is expensive, unfortunately.