On 3/18/11 4:41 PM, Marcello Romani wrote:
> On 18/03/2011 19:01, Mehma Sarja wrote:
>> On 3/17/11 4:57 PM, Phil Stracchino wrote:
>>> On 03/17/11 18:46, Marcello Romani wrote:
>>>> On 16/03/2011 18:38, Phil Stracchino wrote:
>>>>> On 03/16/11 13:08, Mike Hobbs wrote:
>>>>>> Hello, I'm currently testing Bacula v5.0.3 and so far so good. One of my issues, though: I have a 16-bay Promise Technologies VessJBOD. How do I get Bacula to use all the disks for writing volumes?

>>>>>> I guess the way I envision it working would be: 50 GB volumes are used, and when disk1 fills up, Bacula switches over to disk2 and writes volumes until that disk is filled, then moves on to disk3, etc., eventually coming back around and recycling the volumes on disk1.

>>>>>> I'm not sure the above scenario is the best way to go about this; I've read that some people create a "pool" for each drive. What is the most common practice when setting up a JBOD unit with Bacula? Any suggestions or advice would be appreciated.

>>>>> That scheme sounds like a bad and overly complex idea, honestly. Depending on your data load, I'd use software RAID to make them into a single RAID5 or RAID10 volume. RAID10 would be faster and, if set up correctly[1], more redundant; RAID5 is more space-efficient, but slower.

>>>>> [1] There's a right and a wrong way to set up RAID10. The wrong way is to set up two five-disk stripes, then mirror them; lose one disk from each stripe, and you're dead in the water. The right way is to set up five mirrored pairs, then stripe the pairs; this will survive multiple disk failures as long as you don't lose both disks of any single pair.

>>>> Hi Phil,
>>>> that last sentence sounds a little scary to me: "this will survive multiple disk failures *as long as you don't lose both disks of any single pair*". Isn't RAID6 a safer bet?

>>> That depends.

>>> With RAID6, you can survive any one or two disk failures, in degraded mode. You'll have more usable capacity than RAID10, but performance will be slower because of the overhead of parity calculations. A third failure will bring the array down and you will lose the data.

>>> With RAID10 on sixteen drives, you can survive any one drive failure with minimal performance degradation. There is a 1 in 15 chance that a second failure will be the other drive of that pair and bring the array down. If not, then there is a 1 in 7 chance that a third drive failure will land on the same pair as one of the two drives already failed. If not, the array will still continue to operate, with some read performance degradation, and there is now a just-under 1 in 4 chance (3/13) that a fourth drive failure will land on the same pair as one of the three already failed. ... And so on. There is a cumulative 39% chance that four random failures will take down the entire array, which rises to 59% with five failures and 78% with six. (91% at seven, 98% at eight, and no matter how many leprechauns live in your back yard, at nine failures you're screwed, of course. It's like the joke about the two men in the airliner.)

>>> But if the array were RAID6, it would already have gone down for the count when the third drive failed.
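A quick way to sanity-check those odds: the short Python sketch below assumes sixteen drives arranged as eight mirrored pairs, with each failure striking one of the surviving drives at random, so the failed drives form a uniformly random subset and the array survives only while no pair has lost both of its members.

    from math import comb

    PAIRS = 8              # sixteen drives arranged as eight mirrored pairs
    DRIVES = 2 * PAIRS

    def p_array_dead(failures):
        """Chance that, after `failures` random drive deaths, some mirrored
        pair has lost both of its drives (i.e. the RAID10 array is gone)."""
        if failures > PAIRS:
            return 1.0  # pigeonhole: more failures than pairs is always fatal
        # The failed drives form a uniformly random subset of the 16 drives;
        # the array survives only if every failed drive is from a distinct pair.
        p_alive = comb(PAIRS, failures) * 2 ** failures / comb(DRIVES, failures)
        return 1.0 - p_alive

    for k in range(2, 10):
        print(f"{k} failures: {p_array_dead(k):5.1%} chance the array is dead")

For two through nine failures this prints roughly 6.7%, 20%, 38.5%, 59%, 77.6%, 91%, 98% and 100%, which lines up with the figures quoted above to within rounding.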
>>> Now, granted, multiple failures like that are rare. But ... I had a cascade failure of three drives out of a twelve-drive RAIDZ2 array between 4am and 8am one morning. Each drive that failed pushed the load on the remaining drives higher, and after a couple of hours of that, the next weakest drive failed, which pushed the load still higher. And when the third drive failed, the entire array went down. It can happen.

>>> But ... I'm running RAIDZ3 right now, and as soon as I can replace the rest of the drives with new ones, I'll be going back to RAIDZ2, because RAIDZ3 is a bit too much of a performance hit on my server and - with drives that aren't dying of old age - RAIDZ2 is redundant *enough* for me. There is no data on the array that is crucial *AND* irreplaceable *AND* not also stored somewhere else.

>>> What it comes down to is, you have to decide for yourself what your priorities are - redundancy, performance, space efficiency - and how much of each you're willing to give up to get as much as you want of the others.

>> There is one more thing to think about, and that is cumulative aging. Starting with all new disks gives a false sense of security, because if the drives are in any sort of RAID/performance configuration they will age and wear evenly, which means they will all start to fail together. It is OK to design a system assuming one or two simultaneous drive failures when the drives are relatively young. After three years of sustained use, such as email storage, you are at higher risk no matter which RAID scheme you have used.

>> Mehma

> This is an interesting point. But what parameter should one take into account to decide when it's time to replace an aged (but still good) disk with a fresh one?

I can only think of staggering drive age and maintenance - a rough sketch of what I mean is below. Here's hoping that someone on the list can come up with more creative solutions/practices.
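To make "staggering drive age" a bit more concrete, here is a simplistic illustration (the pair names, install dates, and three-year threshold are invented): bring the mirrored pairs into service a few months apart, then proactively swap out only the oldest pair once it crosses the age limit, so replacements never all come due at the same time.

    from datetime import date

    # Hypothetical install dates for eight mirrored pairs (illustration only):
    # pairs were put into service a few months apart rather than all at once.
    pair_installed = {
        f"pair{n}": date(2008 + n // 3, 1 + 4 * (n % 3), 1) for n in range(8)
    }

    def next_proactive_swap(installed, max_age_years=3.0, today=None):
        """Return the oldest pair and its age once it passes the threshold,
        or None if nothing is due yet. Replacing one pair at a time keeps
        drive ages staggered instead of letting them all wear out together."""
        today = today or date.today()
        oldest = min(installed, key=installed.get)
        age_years = (today - installed[oldest]).days / 365.25
        return (oldest, round(age_years, 1)) if age_years >= max_age_years else None

    print(next_proactive_swap(pair_installed, today=date(2011, 3, 18)))
    # -> ('pair0', 3.2)   pair0 went in on 2008-01-01, so it is first in line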
Mehma