Dear Nick,
On 2016-02-15 05:29, Nick Holland wrote:
On 02/13/16 11:49, Tinker wrote:
Hi,
1)
http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf, page 3,
"2.2 RAID 1", says that it reads "on a round-robin basis from all active
chunks", i.e. read operations are spread evenly across disks.
Since then, did anyone implement selective reading based on experienced
read operation time, or a user-specified device read priority order?
That would allow a Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror +
1 HDD mirror, which would give the best combination of IO performance and
data security that OpenBSD could offer today.
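To make the idea concrete, the selection policy I have in mind would look
roughly like the following. This is a purely hypothetical sketch, not taken
from the real softraid code; the structures and names are invented only to
illustrate preferring the fastest (or user-prioritized) active chunk instead
of plain round-robin:

#include <stddef.h>
#include <stdint.h>

struct fake_chunk {
        int       active;        /* chunk is online */
        int       user_prio;     /* lower = preferred, e.g. SSD=0, HDD=1 */
        uint64_t  avg_read_ns;   /* smoothed observed read latency */
};

/* Pick the active chunk with the best (user_prio, avg_read_ns) pair. */
int
pick_read_chunk(const struct fake_chunk *c, size_t nchunks)
{
        int best = -1;
        size_t i;

        for (i = 0; i < nchunks; i++) {
                if (!c[i].active)
                        continue;
                if (best == -1 ||
                    c[i].user_prio < c[best].user_prio ||
                    (c[i].user_prio == c[best].user_prio &&
                     c[i].avg_read_ns < c[best].avg_read_ns))
                        best = (int)i;
        }
        return best;            /* -1 means no readable chunk at all */
}

/* After each completed read, the caller would update the moving average,
 * e.g.: c->avg_read_ns = (7 * c->avg_read_ns + measured_ns) / 8; */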
I keep flip-flopping on the merits of this.
At one point, I was with you, thinking, "great idea! Back an expensive,
fast disk with a cheap disk".
Currently, I'm thinking, "REALLY BAD IDEA". Here's my logic:
There's no such thing as an "expensive disk" anymore. A quick look
[...]
of "fast" storage to make their very few business apps run better. No
question in their mind, it was worth it. Now we do much more with our
computers and it costs much less. The business value of our investment
should be much greater than it was in 1982.
And ignoring hardware, it is. Companies drop thousands of dollars on
consulting and assistance and think nothing of it. And in a major
computer project, a couple $1000 disks barely show as a blip on the
budget. Hey, I'm all about being a cheap bastard whenever possible, but
this just isn't a reasonable place to be cheap, so not somewhere I'd
suggest spending developer resources.
Also ... it's probably a bad idea for functional reasons. You can't
just assume that "slower" is better than "nothing" -- very often, it's
indistinguishable from "nothing". In many cases, computer systems that
perform below a certain speed are basically non-functional, as tasks can
pile up on them faster than they can produce results. Anyone who has
dealt with an overloaded database server, mail server or firewall will
know what I'm saying here -- at a certain load, they go from "running
ok" to "death spiral", and they do it very quickly.
If you /need/ the speed of an SSD, you can justify the cost of a pair of
'em. If you can't justify the cost, you are really working with a
really unimportant environment, and you can either wait for two cheap
slow disks or skip the RAID entirely.
How fast do you need to get to your porn, anyway?
I technically agree with you -
What led me to think about SSD+HDD was the idea of having, on the same
mountpoint, a hybrid SSD-HDD storage where the "important stuff" would
automatically be on the SSD and the "less important" on the HDD.
This arrangement would mean that those two data sets could be stored within
one and the same directory structure, which would be really handy, and
archiving of unused files would happen implicitly.
I understand that ZFS is quite good at delivering this. LSI MegaRAID
cards are good at that as long as the "important stuff" stays under 512GB
forever, which is not the case, duh.
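To illustrate what I mean by "automatically": the placement rule I picture
is as trivial as the sketch below. The thresholds and names are invented,
and this has nothing to do with how ZFS or MegaRAID actually decide:

#include <sys/types.h>
#include <time.h>

enum tier { TIER_SSD, TIER_HDD };

/* Keep a file on the SSD tier while it was touched within the last 30
 * days and still fits the remaining SSD budget; otherwise it is
 * implicitly "archived" to the HDD tier. */
enum tier
place_file(time_t now, time_t last_access, off_t size, off_t ssd_free)
{
        const time_t hot_window = 30 * 24 * 60 * 60;

        if (now - last_access <= hot_window && size <= ssd_free)
                return TIER_SSD;
        return TIER_HDD;
}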
This whole idea has a really exotic, unpredictable, "stinking" edge to it
though. Your analogy that "slower" is generally as bad as "nothing",
combined with the market price situation, makes complete sense -
So, even if kind of unwillingly, I must agree with your reasoning.
(now ... that being said, part of me would love a tmpfs / disk RAID1,
one that would come up degraded, and the disk would populate the RAM
disk; writes would go to both subsystems, reads would come from the RAM
disk once populated. I could see this for some applications like CVS
repositories or source directories where things are "read mostly", and
typically smaller than a practical RAM size these days. As there are
still a few orders of magnitude greater performance in a RAM disk than
in an SSD, and this will likely remain true for a while, there are SOME
applications where this could be nice)
Wait... you mean you would like OpenBSD to implement a read cache that is
"100% caching aggressive", rather than the current buffer cache which has
"dynamic caching aggressiveness"? I don't understand how this could make
sense, can you please clarify?
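For reference, the mechanics I picture from your description are roughly
the following. This is a purely hypothetical sketch; no such volume type
exists in softraid today, and the callbacks are placeholders rather than
real kernel interfaces:

#include <stddef.h>
#include <sys/types.h>

struct ram_disk_mirror {
        int     ram_populated;  /* set once the disk was copied into RAM */
        void    (*ram_io)(int write, void *buf, size_t len, off_t off);
        void    (*disk_io)(int write, void *buf, size_t len, off_t off);
};

/* Writes always go to both sides, so the disk stays authoritative. */
void
mirror_write(struct ram_disk_mirror *m, void *buf, size_t len, off_t off)
{
        m->ram_io(1, buf, len, off);
        m->disk_io(1, buf, len, off);
}

/* Reads are served from RAM once population is done, otherwise from disk. */
void
mirror_read(struct ram_disk_mirror *m, void *buf, size_t len, off_t off)
{
        if (m->ram_populated)
                m->ram_io(0, buf, len, off);
        else
                m->disk_io(0, buf, len, off);
}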
2)
Also if there's a read/write failure (or excessive time consumption for
a single operation, say 15 seconds), will Softraid RAID1 learn to take
the broken disk out of use?
As far as I am aware, Softraid (like most RAID systems, hw or sw) will
deactivate a drive which reports a failure. Drives which go super slow
(i.e., always manage to get the data BEFORE the X'th retry at which they
would toss an error) never report an error back, so the drive never gets
deactivated.
Sound implausible? Nope. It Happens. Frustrating as heck when you
have this happen to you until you figure it out. In fact, one key
feature of "enterprise" and "RAID" grade disks is that they hop
off-line and throw an error fast and early, to prevent this problem
(some "NAS" grade disks may do this. Or they may just see your credit
limit hasn't been reached).
However, having done this for a looong time, and seen the problems from
both rapid-failure and "try and try" disks, I'll take the "try and try"
problem any day. Happens a lot less often, and tends to be less
catastrophic when it happens (hint: you WILL be quickly fixing a disk
system which gets to be 100x slower than normal. You may not notice the
first disk that fails and causes an array to be non-redundant until the
disk failure that takes the array down completely).
What I take from what you say here, unless anyone proves otherwise, is:

1) The softraid subsystem will wait indefinitely for the underlying
drives' IO operations. Therefore, misbehavior in the form of
ultra-long-running IO operations and the like will many times show up
as the same misbehavior in the softraid volume globally.
On the other hand, if a disk signals a complete failure (such as going
offline, or reporting it via SMART), it will be handled gracefully,
without any QoS impact, by softraid simply disconnecting the drive.

2) Enterprise drives happen to be the ones that follow exactly this
failure pattern, i.e. either they work perfectly, or in the case of any
real issue they report themselves gracefully as broken, by either just
disconnecting or sending the proper SMART report.
Therefore, I should always buy only enterprise-certified drives, e.g.
the Samsung PM863 or SM863 as SSD, or the "Seagate Enterprise Capacity
3.5 HDD 8TB 3.5" SATA-600" as HDD.

Finally, if the absolutely unexpected happened and IO throughput
collapsed for some secondary reason, then there must be some trigger to
detect that and take the server out of use completely for maintenance,
and that's all.
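In practice I guess that trigger could be as dumb as a userland latency
probe run from cron or a monitoring system. A minimal sketch, with a
made-up device path and threshold:

#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
        const char *dev = "/dev/rsd0c";  /* example raw device, adjust */
        const double limit_s = 15.0;     /* made-up "too slow" threshold */
        char buf[65536];
        struct timeval t0, t1;
        double took;
        ssize_t n;
        int fd;

        if ((fd = open(dev, O_RDONLY)) == -1)
                err(2, "open %s", dev);

        gettimeofday(&t0, NULL);
        n = read(fd, buf, sizeof(buf));
        gettimeofday(&t1, NULL);
        close(fd);

        if (n == -1)
                err(2, "read %s", dev);

        took = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;

        /* Non-zero exit so cron/monitoring can alarm and pull the box. */
        return took > limit_s ? 1 : 0;
}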
Cheers.
Tinker