On Thu, 18 Feb 2010, Adam Leventhal wrote:
>> It is unreasonable to spend more than 24 hours to resilver a single drive.
> Why?
Human factors. People usually go to work once per day, so it makes
sense that they should be able to perform at least one maintenance
action per day. Ideally the re
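(For scale, with assumed round numbers rather than figures from the
thread: a 2 TB drive resilvered at a sequential 100 MB/s takes
2,000,000 MB / 100 MB/s = 20,000 s, roughly 5.5 hours, and that is the
best case. A metadata-ordered resilver on a fragmented pool can drop
to an effective 10-20 MB/s, i.e. 28-56 hours, which is how a single
drive blows through a 24-hour budget.)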
Hey Bob,
> My own conclusions (supported by Adam Leventhal's excellent paper) are that
>
> - maximum device size should be constrained based on its time to
> resilver.
>
> - devices are growing too large and it is about time to transition to
> the next smaller physical size.
I don't disagree
> "dc" == Daniel Carosone writes:
dc> single-disk laptops are a pretty common use-case.
It does not help this case.
It helps the case where a single laptop disk fails and you recover it
with dd conv=noerror,sync. This case is uncommon because few people
know how to do it, or bother.
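For reference, a minimal sketch of that recovery (the device names are
hypothetical; on Solaris the raw disk nodes live under /dev/rdsk):

    # clone a dying disk onto a healthy one; 'noerror' continues past
    # read errors and 'sync' pads each failed block with zeros, so
    # offsets on the copy stay aligned with the source
    dd if=/dev/rdsk/c0t0d0s2 of=/dev/rdsk/c0t1d0s2 bs=512 conv=noerror,sync

A scrub of the copied disk then lets ZFS repair the zero-filled blocks
from the second copy, which is exactly where copies=2 pays off.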
Well said.
-original message-
Subject: Re: [zfs-discuss] Proposed idea for enhancement - damage control
From: Bob Friesenhahn
Date: 02/17/2010 11:10
On Wed, 17 Feb 2010, Marty Scholes wrote:
>
> Bob, the vast majority of your post I agree with. At the same time, I might
> disagree with a couple of things.
Dan,
Exactly what I meant: an allocation policy that helps distribute the
data so that when one disk (an entire mirror) is lost, some data remains
fully accessible, as opposed to losing pieces of everything all over the
storage pool.
On Wed, Feb 17, 2010 at 02:38:04PM -0500, Miles Nordin wrote:
> copies=2 has proven to be mostly useless in practice.
I disagree. Perhaps my cases fit under the weasel-word "mostly", but
single-disk laptops are a pretty common use-case.
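For anyone following along, the property is set per dataset (the
pool/dataset names below are placeholders), and it only applies to
blocks written after it is set; existing data is not rewritten:

    # store two copies of every block, even on a single-disk pool
    zfs set copies=2 rpool/export/home
    zfs get copies rpool/export/home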
On Feb 17, 2010, at 12:34, Richard Elling wrote:
> I'm not sure how to connect those into the system (USB 3?), but when
> you build it, let us know how it works out.
FireWire 3200 preferably. Anyone know if USB 3 sucks as much CPU as
previous versions?
If I'm burning CPU on I/O I'd rather ha
On 02/17/10 02:38 PM, Miles Nordin wrote:
> copies=2 has proven to be mostly useless in practice.
Not true. Take an ancient PC with a mirrored root pool, no
bus error checking and non-ECC memory, that flawlessly
passes every known diagnostic (SMC included).
Reboot with copies=1 and the same fil
> "ck" == Christo Kutrovsky writes:
ck> I could always put "copies=2" (or more) to my important
ck> datasets and take some risk and tolerate such a failure.
copies=2 has proven to be mostly useless in practice.
If there were a real-world device that tended to randomly flip bits,
or randomly replace swaths of LBA's with zeroes, but otherwise behave
normally (not return any errors, not slow down retrying reads, not
fail to attach), then copies=2 would be really valuable, but so far it
seems no such device exists.
On Wed, 17 Feb 2010, Marty Scholes wrote:
> Bob, the vast majority of your post I agree with. At the same time, I might
> disagree with a couple of things.
> I don't really care how long a resilver takes (hours, days, months) given a
> couple things:
> * Sufficient protection exists on the degraded ar
I can't stop myself; I have to respond. :-)
Richard wrote:
> The ideal pool has one inexpensive, fast, and reliable device :-)
My ideal pool has become one inexpensive, fast and reliable "device" built on
whatever I choose.
> I'm not sure how to connect those into the system (USB 3?)
Me neith
On Feb 17, 2010, at 9:09 AM, Marty Scholes wrote:
> Bob Friesenhahn wrote:
>> It is unreasonable to spend more than 24 hours to resilver a single
>> drive. It is unreasonable to spend more than 6 days resilvering all
>> of the devices in a RAID group (the 7th day is reserved for the system
>> adm
Bob Friesenhahn wrote:
> It is unreasonable to spend more than 24 hours to resilver a single
> drive. It is unreasonable to spend more than 6 days resilvering all
> of the devices in a RAID group (the 7th day is reserved for the system
> administrator). It is unreasonable to spend very much time
On Wed, 17 Feb 2010, Daniel Carosone wrote:
> These small numbers just tell you to be more worried about defending
> against the other stuff.
Let's not forget that the most common cause of data loss is human
error!
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/
Dan,
"loose" was a typo. I meant "lose". Interesting how a typo (write error) can
cause a lot of confusion on what exactly I mean :) Resulting in corrupted
interpretation.
Note that my idea/proposal is targeted for a growing number of home users. To
those, value for money usually is a much mo
Bob,
Using a separate pool would impose other limitations, such as not being
able to use more space than what's allocated to that pool. You could
"add" space as needed, but you can't remove (or move) devices freely.
By using a shared pool with a hint of desired vdev/space allocation policy, you
co
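To make that limitation concrete (hypothetical device names; ZFS
behavior as of this thread's time):

    # a second pool hard-partitions the space
    zpool create safepool mirror c1t0d0 c1t1d0
    # growing it later is easy...
    zpool add safepool mirror c1t2d0 c1t3d0
    # ...but a top-level vdev cannot be removed, so the space can
    # never be handed back to the main pool
    zpool remove safepool mirror-1   # fails for anything but spares,
                                     # cache, and log devices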
On Tue, Feb 16, 2010 at 04:47:11PM -0800, Christo Kutrovsky wrote:
> One of the ideas that sparkled is have a "max devices" property for
> each data set, and limit how many mirrored devices a given data set
> can be spread on. I mean if you don't need the performance, you can
> limit (minimize) the
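If the proposal were adopted, usage might look something like this
(the property name is entirely hypothetical; no such knob exists in
ZFS today):

    # hypothetical: keep this dataset's blocks on at most two mirrors,
    # so losing a different mirror leaves it fully readable
    zfs set max_devices=2 tank/photos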
On Tue, Feb 16, 2010 at 06:28:05PM -0800, Richard Elling wrote:
> The problem is that MTBF measurements are only one part of the picture.
> Murphy's Law says something will go wrong, so also plan on backups.
+n
> > Imagine this scenario:
> > You lost 2 disks, and unfortunately you lost the 2 side
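On the "plan on backups" point, a minimal ZFS-native sketch (the host
and pool names are placeholders):

    # snapshot everything, then replicate the stream elsewhere
    zfs snapshot -r tank@backup-20100217
    zfs send -R tank@backup-20100217 | ssh backuphost zfs receive -d bpool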
On Feb 16, 2010, at 4:47 PM, Christo Kutrovsky wrote:
> Just finished reading the following excellent post:
>
> http://queue.acm.org/detail.cfm?id=1670144
>
> And started thinking what would be the best long term setup for a home
> server, given limited number of disk slots (say 10).
>
> I cons
On Tue, 16 Feb 2010, Christo Kutrovsky wrote:
> The goal was to do "damage control" in a disk failure scenario
> involving data loss. Back to the original question/idea.
> Which would you prefer, loose a couple of datasets, or loose a
> little bit of every file in every dataset.
This ignores the f
On Tue, 16 Feb 2010, Christo Kutrovsky wrote:
> Just finished reading the following excellent post:
> http://queue.acm.org/detail.cfm?id=1670144
A nice article, even if I don't agree with all of its surmises and
conclusions. :-)
In fact, I would reach a different conclusion.
I considered some
Thanks for your feedback, James, but that's not the direction I wanted
this discussion to go.
The goal was not how to create a better solution for an enterprise.
The goal was to do "damage control" in a disk failure scenario involving data
loss. Back to the original question/idea.
Which
On Tue, Feb 16, 2010 at 6:47 PM, Christo Kutrovsky wrote:
> Just finished reading the following excellent post:
>
> http://queue.acm.org/detail.cfm?id=1670144
>
> And started thinking what would be the best long term setup for a home
> server, given limited number of disk slots (say 10).
>
> I con