On 12/01/13 06:20, John Hynes wrote:
> OK, just to clarify:
>
> The kernel is 5.3 with the official patches applied, no other modifications.
>
> I read through the changes for 5.4 and certainly, there has been a ton of
> work done, and I will upgrade soon. Nothing listed in the changes seems
> like it would directly address a problem like this, so I'd guess it's not a
> bug though. It certainly *seems* like it could be a hardware problem
> that's just not throwing an error (yet).
>
> So, I guess what I'm asking everyone is: Other than what I've done, what
> are some ways I can investigate this further to determine where the problem
> lies? For example, let's say it *is* a failing hard drive in the softraid,
> and the system just hasn't failed the drive yet, because the operations
> still complete, just really slowly. What tools/techniques could I use to
> see that a process is waiting for a disk operation that's taking forever to
> complete?
>
> Thanks,
>
> -John

I've got one of these machines (Sun X2100 M2). I had performance
problems with it in the past, similar to what you described: simple
tasks seemed to hang on disk I/O where disk I/O shouldn't have been a
problem. I credited the problem to the nvidia chipset. I did recently
blow the dust off the machine and put 5.4-current on it, and -- SO FAR --
it's running pretty darned well.

The way the disks are connected to the main board, if you swap the red
and blue SATA cables, you can route them to the PCIe slot rather than
the on-board SATA connectors, and you may be able to put a third-party
SATA controller to work with them (or you may not -- I made a VERY quick
attempt at this recently, thinking I'd maybe be able to get AHCI
performance out of it, and the thing booted the OpenBSD kernel, but the
controller (which I have used elsewhere without issue) didn't initialize
properly, so I had no disks after boot. Upon disassembly, I found the
card was working its way loose, and it was late, so I didn't spend a lot
of time trying to figure out exactly what was wrong, and I just switched
back to my on-board SATA...which has been working fine for a week or so
now).

But really...it's an nvidia machine. If it works at all, you should be
happy... I'd not trust it too far.

That being said...there are some nasty disk failure modes I've seen more
than once, where a disk will start doing retries over and over until a
successful read takes place...and then it will go on to the next read,
with lots of retries...etc. The result is a painfully slow machine, and
it is somewhat hard to diagnose, since the drive never returns an error
to the OS. If you have disk activity lights for each disk, it's actually
trivial to see where the machine is hung, but this machine's
manufacturer doesn't feel that disk activity lights are useful (idiots.
Blame Sun this time).

Nick.
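
Without activity lights, one crude way to look for the retry-loop
failure mode described above is to time reads from each disk
separately and compare them. What follows is only a minimal sketch, not
a tested diagnostic: the function name time_reads and its parameters are
made up for illustration, and the device names in the usage note
(/dev/rsd0c, /dev/rsd1c) are assumptions about how the two softraid
members show up on the machine. Reading a raw device normally needs
root, and reading an ordinary file instead may only measure the buffer
cache.

```python
import time


def time_reads(path, chunk_size=1024 * 1024, max_chunks=256, slow_factor=10.0):
    """Read up to max_chunks chunks from path and time each read.

    Returns (latencies, slow): latencies is a list of seconds per chunk,
    slow is a list of (chunk_index, seconds) for chunks that took more
    than slow_factor times the median latency.
    """
    latencies = []
    # buffering=0 avoids Python's own buffering, so every read() is one
    # request to the OS.
    with open(path, "rb", buffering=0) as f:
        for _ in range(max_chunks):
            start = time.monotonic()
            data = f.read(chunk_size)
            elapsed = time.monotonic() - start
            if not data:  # end of file or device
                break
            latencies.append(elapsed)

    if not latencies:
        return [], []

    median = sorted(latencies)[len(latencies) // 2]
    slow = [
        (i, t)
        for i, t in enumerate(latencies)
        if median > 0 and t > slow_factor * median
    ]
    return latencies, slow


def report(paths, **kwargs):
    """Print a one-line summary per path (hypothetical helper)."""
    for path in paths:
        latencies, slow = time_reads(path, **kwargs)
        total = sum(latencies)
        worst = max(latencies) if latencies else 0.0
        print(f"{path}: {len(latencies)} chunks, total {total:.3f}s, "
              f"worst {worst:.3f}s, {len(slow)} slow chunks")
```

Calling report(["/dev/rsd0c", "/dev/rsd1c"]) as root would then print
one line per disk. If one disk shows a much larger total, or a cluster
of slow chunks the other lacks, that disk is the one to suspect. A clean
result proves little, since the bad sectors may lie outside the region
read.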

>
>
>
> On Sat, Nov 30, 2013 at 9:39 PM, Kenneth R Westerback <
> kwesterb...@rogers.com> wrote:
>
>> On Sat, Nov 30, 2013 at 07:04:44PM -0600, Shawn K. Quinn wrote:
>> > On Sat, Nov 30, 2013, at 03:55 PM, Kenneth R Westerback wrote:
>> > > On Sat, Nov 30, 2013 at 04:02:58PM -0500, John Hynes wrote:
>> > > > OpenBSD 5.3 (GENERIC.MP) #0: Fri Sep 13 04:11:52 EDT 2013
>> > > > j...@hytronix-gw1.hytronix.com:/usr/src/sys/arch/amd64/compile/
>> > > > GENERIC.MP
>> > >
>> > > Try 5.4 or -current.
>> > >
>> > > Issues with non-home-compiled kernels are more interesting.
>> >
>> > I thought as long as it was an unmodified GENERIC or GENERIC.MP that the
>> > issue was still valid. Is this no longer the case?
>> >
>> > --
>> > Shawn K. Quinn
>> > skqu...@rushpost.com
>> >
>>
>> Sure - but if it's unmodified, why compile a new one? And John did
>> not state in his email that it was unmodified.
>>
>> .... Ken