Update. I had a friend with a power supply checker check the power
supply. The +12 volt line is running at 11.5v, barely within spec.
Fishy, but livable for the nonce.
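
For anyone who wants to check the arithmetic behind "barely within
spec", here is a rough sketch. It assumes the usual ATX tolerance of
+/-5% on the +12 V rail; the function names are mine, purely for
illustration.

```python
# Assumption: the ATX tolerance on the +12 V rail is +/-5%.
NOMINAL_V = 12.0
TOLERANCE = 0.05


def rail_limits(nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Return the (low, high) acceptable voltages for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def within_spec(measured, nominal=NOMINAL_V, tolerance=TOLERANCE):
    """True if the measured voltage falls inside the tolerance window."""
    low, high = rail_limits(nominal, tolerance)
    return low <= measured <= high


low, high = rail_limits()
print(f"+12 V window: {low:.2f} V to {high:.2f} V")
print("11.5 V within spec:", within_spec(11.5))
```

Under that assumption the lower limit is about 11.4 V, so 11.5 V leaves
roughly 0.1 V of margin.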

I had been running Finnix as Debian wouldn't boot. I pulled all the
external USB lines except for a 3.5" external floppy disk drive. The
USB problem that started all this went away, i.e. I stopped getting
error messages in dmesg. I rebooted Finnix, and did not see the error
message. Also, the boot speed was back to normal.

An hour and a half ago I booted to the installed Debian system, and
that now runs fine. I see no error messages for USB or for any of the
drives. RAID looks nominal. I will monitor.

The next step will be to add back external USB lines, one at a time,
and see if the problem re-appears. It may have been something as simple
as mild corrosion on a connector.
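
One way to make the one-at-a-time check less tedious might be to
snapshot dmesg before and after plugging in each device and look only
at the new USB-related lines. This is just a sketch: the function names
are hypothetical, matching on the substring "usb" is a crude filter of
my own choosing, and reading dmesg may require root.

```python
import subprocess


def dmesg_snapshot():
    """Return the current kernel log as text (may need root)."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


def new_usb_lines(before, after):
    """Return USB-related lines present in `after` but not in `before`."""
    seen = set(before.splitlines())
    return [
        line
        for line in after.splitlines()
        if line not in seen and "usb" in line.lower()
    ]


# Intended use, one device at a time:
#   before = dmesg_snapshot()
#   ... plug in one USB device and wait a few seconds ...
#   for line in new_usb_lines(before, dmesg_snapshot()):
#       print(line)
```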

Meanwhile, I did long tests on the hard drives and SSD; all passed.
I've seen no hard drive error messages all day.

On Mon, 25 Apr 2022 14:56:50 -0700
David Christensen <dpchr...@holgerdanske.com> wrote:

> On 4/25/22 07:18, Charles Curley wrote:
> > On Sun, 24 Apr 2022 22:52:15 -0700
> > David Christensen <dpchr...@holgerdanske.com> wrote:
> >   
> >> So, RAID 5 HDD's are sda, sdc, and sdd, and optical is sdb?  
> > 
> > Optical is sr0.   
> 
> 
> Interesting.  (Must be the SATA controller expansion card?)

Yup.

> 
>  > Debian and Finnix see things differently. On both, sdc and sdd are
>  > part of the RAID array. On Debian, sda is the system drive: /,
>  > /home, /etc, swap, /boot, etc. sdb is part of the RAID array.
>  > Finnix swaps those two.  
> 
> 
> Okay.
> 
> 
> Rather than a Live Linux distribution for troubleshooting, I install 
> Debian onto a USB flash drive (SanDisk Ultra Fit USB 3.0 16 GB).   I 
> keep it updated/ upgraded, and install whatever tools I want.  You
> might want to make one that matches your Debian instance -- that
> should eliminate the device enumeration differences.

Would that have solved this transposition? I suspect that Linux sets up
the boot drive as sda regardless of how the firmware sees things.

But otherwise a good idea, if a bit of work.


> 
> 
>  >> SATA cables -- Color?  Locking or non-locking connectors?    Came
>  >> with motherboard or aftermarket?  If the latter, make and model?  
>  >
>  > All black, two red. The black ones came with the computer. The red
>  > ones are aftermarket, Alchemy SATA3 30 cm. BFA-MSC-SATA330RK-RP.
>  > All lock.  
> 
> 
> https://www.frozencpu.com/products/14060/cab-572/Bitfenix_Alchemy_Multisleeve_SATA_30_Cable_-_30cm_-_Red_BFA-MSC-SATA330RK-RP.html
> 
> 
> Those Alchemy SATA cables look good.  Assuming you like them, I would 
> replace the factory cables with new Alchemy cables.

I'll keep that in mind. I have one spare, so I can swap it in quickly
if need be.


> 
> 
>  >> Please run long tests now on all three drives.  Save all of the
>  >> reports. Post the report(s) for any drive(s) with SMART failures
>  >> and/or dmesg(1) errors.  
>  >
>  > Long tests run about 10 hours. I did one overnight on sdd, and it
>  > reported no errors. No dmesg errors since the ones I reported in the
>  > original email.  
> 
> 
> I launch SMART tests on all of the drives at the same time.  The 
> microcontroller in each drive runs the test for that drive 
> independently, so it is okay to run all the tests concurrently.

I did figure that out, and tested the three remaining drives. All
tested with no errors.
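
For the archives, starting the long tests on several drives at once
could be scripted roughly as below. It is a sketch only: it assumes
smartmontools is installed and that it is run as root, the function
names are made up, and the device names in the usage comment are
simply the RAID members as Debian sees them here.

```python
import subprocess


def long_test_commands(devices):
    """Build one `smartctl -t long` command per device."""
    return [["smartctl", "-t", "long", dev] for dev in devices]


def start_long_tests(devices):
    """Start a long self-test on each device; return {device: exit code}."""
    results = {}
    for cmd in long_test_commands(devices):
        # smartctl returns once the drive has accepted the test. The
        # drive then runs it on its own, so this loop does not wait
        # for the hours-long test to finish.
        results[cmd[-1]] = subprocess.run(cmd).returncode
    return results


# Intended use (as root):
#   start_long_tests(["/dev/sdb", "/dev/sdc", "/dev/sdd"])
# Later, read each drive's results with: smartctl -l selftest /dev/sdX
```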



> 
> 
> I still don't see the source of the original dmesg(1) errors.  I
> would:

I've done some of these. But as I am not seeing any more hard drive
issues, I'm not going to spend any more time on the rest.


> 
> 7.  Boot memtest86+ and run overnight or longer.

Interestingly enough, memtest86 locks up right at the 4096 MB mark. It
also completely refuses to run on another system I have here. I'm going
to do some further testing, then report what I see in a new thread.


> 
> 
> If and when you have convinced yourself that all of the hardware is 
> good, then software is what remains.

Yup.

> 
> 
> Did you figure out what was preventing Debian from booting?  Is the 
> Debian instance fixed?

I think so; see above.

-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/
