On 12/01/13 06:20, John Hynes wrote:
> OK, just to clarify:
> 
> The kernel is 5.3 with the official patches applied, no other modifications.
> 
> I read through the changes for 5.4 and certainly, there has been a ton of
> work done, and I will upgrade soon.  Nothing listed in the changes seems
> like it would directly address a problem like this, so I'd guess it's not a
> bug though.  It certainly *seems* like it could be a hardware problem
> that's just not throwing an error (yet).
> 
> So, I guess what I'm asking everyone is: Other than what I've done, what
> are some ways I can investigate this further to determine where the problem
> lies?  For example, let's say it *is* a failing hard drive in the softraid,
> and the system just hasn't failed the drive yet, because the operations
> still complete, just really slowly.  What tools/techniques could I use to
> see that a process is waiting for a disk operation that's taking forever to
> complete?
> 
> Thanks,
> 
> -John

I've got one of these machines (Sun X2100 M2).  I had performance
problems with it in the past, similar to what you described: simple
tasks that seemed to hang on disk I/O where disk I/O shouldn't have
been a problem.  I credited the problem to the nvidia chipset.

I did recently blow the dust off the machine and put 5.4-current on it,
and -- SO FAR -- it's running pretty darned well.

The way the disks are connected to the main board, if you swap the red
and blue SATA cables, you can route them to the PCIe slot rather than
the on-board SATA connectors, and you may be able to put a third-party
SATA controller to work with them.  Or you may not: I made a VERY quick
attempt at this recently, thinking I'd maybe be able to get AHCI
performance out of it.  The thing booted the OpenBSD kernel, but the
controller (which I have used elsewhere without issue) didn't initialize
properly, so I had no disks after boot.  Upon disassembly, I found
the card was working its way loose, and it was late, so I didn't spend a
lot of time trying to figure out exactly what was wrong.  I just
switched back to my on-board SATA, which has been working fine for a
week or so now.

But really...it's an nvidia machine.  If it works at all, you should be
happy...I'd not trust it too far.

That being said...there are some nasty disk failure modes I've seen more
than once, where a disk will start doing retries over and over until a
successful read takes place...and then it will go on to the next read,
with lots of retries ... etc.  The result is a painfully slow machine,
and it is somewhat hard to diagnose since the drive never returns an
error to the OS.  If you have disk activity lights for each disk, it's
actually trivial to see where the machine is hung, but this machine's
manufacturer doesn't feel that disk activity lights are useful (idiots.
Blame Sun this time).
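
Without activity lights, one rough way to look for the retry behaviour
described above is to time sequential reads from each disk in turn and
see whether one of them stalls on particular chunks.  What follows is
only a minimal sketch, not a tested diagnostic: the function name
time_reads, the 1 MiB chunk size and the 0.5 second threshold are all
made up for illustration, and the device path you pass in (for example
a raw whole-disk device such as /dev/rsd0c, which is an assumption
about your setup) needs root to read.

```python
import sys
import time


def time_reads(path, chunk_size=1024 * 1024, max_chunks=256, slow_seconds=0.5):
    """Read `path` sequentially and return (offset, seconds) for slow chunks.

    All defaults are illustrative assumptions, not tuned values.
    """
    slow = []
    offset = 0
    # buffering=0 turns off Python's own buffering; the OS may still cache
    # reads unless `path` is a raw (character) device.
    with open(path, "rb", buffering=0) as dev:
        for _ in range(max_chunks):
            start = time.monotonic()
            data = dev.read(chunk_size)
            elapsed = time.monotonic() - start
            if not data:
                break  # end of file or device
            if elapsed >= slow_seconds:
                slow.append((offset, elapsed))
            offset += len(data)
    return slow


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python3 time_reads.py /dev/rsd0c
    for off, secs in time_reads(sys.argv[1]):
        print(f"offset {off}: read took {secs:.2f}s")
```

Run it against each member of the softraid separately and compare.  A
healthy disk should produce few or no slow chunks; a disk that is
quietly retrying would be expected to show reads that take far longer
at some offsets, even though no error ever reaches the OS.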

Nick.

> 
> 
> 
> On Sat, Nov 30, 2013 at 9:39 PM, Kenneth R Westerback <
> kwesterb...@rogers.com> wrote:
> 
>> On Sat, Nov 30, 2013 at 07:04:44PM -0600, Shawn K. Quinn wrote:
>> > On Sat, Nov 30, 2013, at 03:55 PM, Kenneth R Westerback wrote:
>> > > On Sat, Nov 30, 2013 at 04:02:58PM -0500, John Hynes wrote:
>> > > > OpenBSD 5.3 (GENERIC.MP) #0: Fri Sep 13 04:11:52 EDT 2013
>> > > >     j...@hytronix-gw1.hytronix.com:/usr/src/sys/arch/amd64/compile/
>> > > > GENERIC.MP
>> > >
>> > > Try 5.4 or -current.
>> > >
>> > > Issues with non-home-compiled kernels are more interesting.
>> >
>> > I thought as long as it was an unmodified GENERIC or GENERIC.MP that the
>> > issue was still valid. Is this no longer the case?
>> >
>> > --
>> >   Shawn K. Quinn
>> >   skqu...@rushpost.com
>> >
>>
>> Sure - but if it's unmodified, why compile a new one? And John did
>> not state in his email that it was unmodified.
>>
>> .... Ken
