I got a story to tell.  I've been meaning to sit down and write it
up and send it to advocacy@, but since you bring up aac(4), I'm
going to tell it here, as it is very much an aac(4) story.

Ingo Schwarze wrote:
> Hi Michael,
> 
>> aac0 at pci2 dev 1 function 0 "Adaptec ASR-2200S" rev 0x01:
>> Dell CERC-SATA apic 3 int 0 (irq 10)
> 
> Trash your aac(4) hardware and use softraid(4).

...or any other (supported) RAID manufacturer who takes your data
seriously.  Do yourself a favor.

As part of my day job, I manage the e-mail for about 30,000 people
scattered around North America.  The system we use for that e-mail
is a "canned appliance" -- a bundle of hardware and software which
is managed on a day-to-day basis by me, but has a company we can
fall back on for support when things break.  I'm not going to
mention the product's name, because I wish neither to endorse it
nor to scare people away from it.  They have some limitations, but they
also accomplish some seriously incredible stuff with relatively
little day-to-day babysitting.  They are, overall, good people.
They've made some mistakes, but they do their darnedest to make
good on them.

The previous major version of this system was based on FreeBSD,
with their own mail transport system and various spam and virus
filtering systems.  A couple years ago, though, they announced
they were switching from FreeBSD to Linux as the base OS for
their product.  This really wasn't an issue for customers, as
they never get to see a Unix shell prompt...but it was
interesting to me, so I asked a few of their people about the
decision (after warning them first that I worked with the
OpenBSD project, so I wasn't a totally disinterested party :).

They told me, in short, they were "wanting to be in the e-mail
processing and delivery business, not the hardware device driver
writing business", and with FreeBSD, they seemed to have to do
too much driver development to get things to work as they desired,
and the drivers they were after were "just available" for Linux.

I.e., they wanted to pick the hardware and have the drivers
available, and someone else to support the OS (they went with
Red Hat), rather than picking the best OS for the job and
selecting the best hardware that OS supported.

(They also claimed some performance benefits out of Linux that I
do not believe in the slightest, based on later experience.  They
also indicated that they were having trouble with third-party
antivirus vendors providing FreeBSD versions of their software;
they wanted to ship only Linux versions.)

So, for their latest major release of the system, they have a
Linux-based app with a bunch of hardware choices, all of it with
Adaptec RAID hardware.

One day as I'm walking into the office, our customer's rep
called me and said, "we got e-mail problems".  After a bit of
investigation, I found all the edge machines were wedged.
Rebooting them solved the problem.  The mail system
manufacturer looked at them and said, "Oh, looks like a problem
with the RAID card, upgrade to this new version, which is
supposed to fix this".

Ok, shit happens, and unfortunately, that's just accepted in
most non-OpenBSD parts of the computer world, so I shut down
one machine at a time and upgrade to the newest firmware.

WELL...the new firmware doesn't cause hangs, it causes random
reboots....  Isn't that special.

They tell me, "Yes, we've seen that recently.  Try this new,
newest firmware".  Guess what?  That one doesn't fix the
reboots, but NOW when the system spontaneously reboots, the
cache is mishandled and manages to corrupt the file systems
on the disks, so instead of a reboot and a few minutes of
non-productivity, you get a dead lump of a canned appliance
until you get in front of it, boot their magic remote repair
CD and a remote tech does an fsck of your file systems.

So they give me another NEW firmware.  That one seems to
(usually) fix the file system corruption, but still reboots,
and once in a while, trashes the file system.

(I do want to point out that they really had ZERO intent
of you EVER booting a firmware upgrade CD on these things.
They are supposed to be serial managed, no keyboard or VGA
monitor is ever supposed to be attached to them...until you
need to upgrade the firmware...the hardware they have
actually supported console redirection, but since that was
all supposed to be handled by the OS, it is not turned on.
Ooops.)

For the last few weeks, I've been running a mix of different
firmware versions, just so I don't have another "come in and
all my mail servers are dead at once" day.  Today they asked
me to install a special firmware with debugging features so
hopefully Adaptec can figure out what is going wrong and
actually make it work correctly this time.

You think Theo is blowing smoke when he says Adaptec RAID
hardware has piles of horrible bugs?  HE'S NOT.  I think it
is very safe to say that your data is not Adaptec's priority.
They have a garbage product and garbage drivers and they try
to patch around it in any way they can OTHER than build it
right in the first place.  Don't go telling yourself this is
just "OpenBSD doesn't play nice, so they don't get good
drivers from Adaptec".  Linux plays nice with everyone, signs
any NDA and takes drivers under any conditions...and they get
crap, too...but they are content with it!  But remember: you
heard it from OpenBSD first.  I can't say I'd ever trust any
Adaptec RAID card with data on any OS after seeing this little
issue.


Punchline: I had a chat with one of the top techs at this
mail system provider, and told him about the OpenBSD
experience with Adaptec.  He told me they have come to the
same conclusion and that their next generation product would
have a much better (by OpenBSD standards) manufacturer for
the RAID systems...

Nick.
